16.2.22 Transferring sequencing data to WEXAC for analysis via UTAP

Analyzing sequencing data via UTAP requires the data to be placed in the Collaboration directory of a laboratory that has an account on WEXAC (the Weizmann Institute’s EXAscale Cluster).

In order to run UTAP, your laboratory must have -

1. An account on WEXAC

2. Sufficient free storage space (more than 400 GB)

Requirements 1 and 2 can be set up by your department secretary or administrator.

3. A "Collaboration" folder within your lab’s directory, with read and write permissions for the LSCF (Life Science Core Facility) Bioinformatics Unit. This folder must be set up by the computing center (contact hpc@weizmann.ac.il).
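Once the three requirements are in place, they can be checked from a WEXAC shell. The following is a minimal sketch, not an official tool: the function name check_collab_dir is hypothetical, and it assumes that the free space reported by df reflects the storage available to your lab (a lab quota, if one is enforced separately, is not visible this way).

```shell
#!/bin/sh
# Sketch: verify that a Collaboration folder exists, is writable, and has
# at least a given amount of free space. The 400 GB default mirrors the
# requirement listed above; the function name is hypothetical.
check_collab_dir() {
    dir=$1
    min_gb=${2:-400}

    [ -d "$dir" ] || { echo "missing directory: $dir" >&2; return 1; }
    [ -w "$dir" ] || { echo "no write permission: $dir" >&2; return 1; }

    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    avail_gb=$((avail_kb / 1024 / 1024))

    if [ "$avail_gb" -lt "$min_gb" ]; then
        echo "only ${avail_gb} GB free, need ${min_gb} GB" >&2
        return 1
    fi
    echo "OK: ${avail_gb} GB free in $dir"
}

# Example (replace the placeholder with your lab's name):
# check_collab_dir "/home/labs/<lab username>/Collaboration" 400
```

The check only covers your own account; whether the Bioinformatics Unit has read and write access still has to be confirmed with the computing center.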

Sequencing data may be transferred to WEXAC -

  • From your local computer, by mapping (mounting) a network drive and then copying and pasting or dragging and dropping the files into the mapped Collaboration folder.

  • From a Unix/Linux server, via the command line interface and wget.

Network drive transfer

Assuming you have access to your lab's Collaboration folder, mount it on your computer as explained here.

Navigate to the Collaboration folder.

Best practice is to create a new folder for your sequencing data.

Copy and paste or drag and drop the fastq files (possibly compressed into one file) from your local computer into your lab's Collaboration folder.

The data will most likely be compressed in gz format, so you will need to extract it before analyzing with UTAP (you may use your computer's built-in extraction application or a tool such as 7-Zip).

Command line-based transfer/download

Navigate to your lab's Collaboration folder on WEXAC using cd (change directory), for example -

cd /home/labs/<lab username>/Collaboration

Best practice is to create a new folder for your data using mkdir (make directory), for example -

mkdir <meaningfulName>

Navigate into the newly created directory -

cd <meaningfulName>
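The two steps above can also be combined. The sketch below assumes a POSIX shell; the function name enter_run_dir and the folder name in the example are hypothetical.

```shell
# Sketch: create a folder for one sequencing run and enter it in one step.
enter_run_dir() {
    # -p creates missing parent directories and does not fail
    # if the folder already exists; cd runs only if mkdir succeeded.
    mkdir -p "$1" && cd "$1"
}

# Example, run from inside the Collaboration folder
# (rnaseq_run_01 is a made-up folder name):
# enter_run_dir rnaseq_run_01
```

Because mkdir -p tolerates an existing folder, the same call can be repeated safely when resuming an interrupted transfer.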

Because UTAP analyzes fastq files, only these need to be downloaded. Paste the command detailed in your email, which has the following form -

wget -nH --cut-dirs=1 -r --reject='index.html*' --exclude-directories=/fastq/<runID>/RawData --no-parent --no-check-certificate https://stefan.weizmann.ac.il/users/<shortcut-userID>/<userID>/<runID>/

(where <shortcut-userID> consists of the first two letters of <userID>)
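To make the structure of the command easier to follow, here is a hedged sketch that composes the URL from a user ID and a run ID and then calls wget with the same options as above. The function names build_download_url and download_run, and the IDs in the example, are hypothetical. The command in your email remains the authoritative one.

```shell
# Sketch: compose the download URL from a user ID and a run ID.
build_download_url() {
    user_id=$1
    run_id=$2
    # The shortcut is the first two letters of the user ID.
    shortcut=$(printf '%.2s' "$user_id")
    printf 'https://stefan.weizmann.ac.il/users/%s/%s/%s/\n' \
        "$shortcut" "$user_id" "$run_id"
}

# Sketch: download one run into the current directory.
#   -r                     retrieve recursively
#   --no-parent            never ascend above the run directory
#   -nH                    do not create a directory named after the host
#   --cut-dirs=1           drop the first remote path component when saving
#   --reject               skip the server's generated index pages
#   --exclude-directories  skip the RawData directory
#   --no-check-certificate do not validate the server's TLS certificate
download_run() {
    user_id=$1
    run_id=$2
    url=$(build_download_url "$user_id" "$run_id")
    wget -r --no-parent -nH --cut-dirs=1 \
        --reject='index.html*' \
        --exclude-directories="/fastq/$run_id/RawData" \
        --no-check-certificate \
        "$url"
}

# Example with made-up IDs:
# build_download_url jdoe 220216_run
#   prints https://stefan.weizmann.ac.il/users/jd/jdoe/220216_run/
# download_run jdoe 220216_run
```

Printing the URL first is a cheap way to confirm the IDs before starting a long recursive download.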

The data will most likely be compressed in gz format, so you will need to extract it before analyzing with UTAP -

gunzip <filename.gz>
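When a run contains many files, extracting them one by one is tedious. The following sketch, under the assumption that every file is an individual .gz file, tests each one for integrity before decompressing it. The function name extract_all_gz is hypothetical.

```shell
# Sketch: test and decompress every .gz file in a directory
# (defaults to the current directory).
extract_all_gz() {
    dir=${1:-.}
    for f in "$dir"/*.gz; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        # gzip -t checks integrity without writing any output.
        if gzip -t "$f"; then
            # gunzip replaces file.gz with the decompressed file.
            gunzip "$f"
        else
            echo "corrupt or incomplete: $f" >&2
        fi
    done
}

# Example:
# extract_all_gz .
```

A file that fails the integrity test is left untouched, so it can be downloaded again without first cleaning up a partial extraction.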

Proceed to analyze the data using the UTAP pipeline.