This pipeline performs a DESeq2 analysis on a matrix of raw counts per gene per sample. The matrix must cover at least two conditions, with at least two replicates per condition.

Pipeline website: http://utap.wexac.weizmann.ac.il

Setting up a new analysis

The transcriptome pipelines run on the Wexac cluster. To run a new transcriptome analysis, your fastq files must be in your Collaboration folder on Wexac, in the correct structure (see UTAP requirements & description).

Before running the pipeline, prepare your counts matrix file and transfer it to the Collaboration folder.

...

  1. The counts matrix file should be in xlsx, csv, or txt (tab-delimited) format.
  2. The matrix should contain the genes as rows (each row is a gene) and the samples as columns (each column, other than the first, is a sample). The first column should contain the gene symbols.
  3. When creating the counts matrix in Excel, first format the cells in the genes column as Text (right-click on the column and select Format cells) so that all gene symbols, including those that look like dates, keep the names you enter. With the default General format, Excel treats gene symbols that look like dates (e.g. SEPT6, MARCH5) as custom dates (format d-mmm, giving 6-Sep and 5-Mar for these examples), and this sometimes produces duplicated names. Note that changing the format to Text afterwards does NOT bring back the names that were originally entered; instead, they are converted to numbers. HGNC has committed to changing such names, but it might take some time. For explanations on how to manipulate Excel files, see EXCEL tips.
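The format rules above can be checked before the file is transferred. The following is a minimal sketch, not part of UTAP: the function name check_counts_matrix and the example data are hypothetical, the input is assumed to be a tab-delimited txt file with a header row, and raw counts are assumed to be non-negative integers. It flags gene symbols that look like Excel's d-mmm dates (such as 6-Sep), duplicated gene symbols, and rows whose counts are missing or non-integer.

```python
import csv
import io
import re

# Symbols that Excel has turned into d-mmm dates, e.g. "6-Sep" or "5-Mar".
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
DATE_LIKE = re.compile(rf"^\d{{1,2}}-({MONTHS})$", re.IGNORECASE)


def check_counts_matrix(handle, delimiter="\t"):
    """Return a list of problems found in a counts matrix (hypothetical helper)."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, None)
    if header is None or len(header) < 2:
        return ["matrix needs a gene column plus at least one sample column"]

    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip empty lines
        gene, counts = row[0].strip(), row[1:]
        if DATE_LIKE.match(gene):
            problems.append(f"line {line_no}: '{gene}' looks like a date-converted gene symbol")
        if gene in seen:
            problems.append(f"line {line_no}: duplicated gene symbol '{gene}'")
        seen.add(gene)
        if len(counts) != len(header) - 1:
            problems.append(f"line {line_no}: expected {len(header) - 1} counts, found {len(counts)}")
        elif not all(c.strip().isdigit() for c in counts):
            # Assumption: raw counts are non-negative integers.
            problems.append(f"line {line_no}: non-integer count for '{gene}'")
    return problems


# Made-up example in which SEPT6 was converted to "6-Sep" twice.
example = (
    "gene\tctrl_1\tctrl_2\ttreat_1\ttreat_2\n"
    "TP53\t10\t12\t30\t28\n"
    "6-Sep\t5\t4\t7\t9\n"
    "6-Sep\t1\t0\t2\t3\n"
)
problems = check_counts_matrix(io.StringIO(example))
for problem in problems:
    print(problem)
```

To check a real file, the same function could be called on an open file handle, e.g. with open(path, newline="") as handle, where path points to your own tab-delimited matrix.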

...

Select your counts matrix file using 'Input folder' and click on run DESeq2 to identify differentially expressed genes with the DESeq2 package, as described in the DESeq2 manual.

All samples in the counts matrix file are parsed and listed in the popup "choice box".

Fill in your desired report folder name in the relevant field.

When you run DESeq2 (with 'DESeq2 run'), you must create at least two categories by filling in the category names and dragging the relevant samples into each category. Additional explanations can be found in 16.2.22 UTAP: Transcriptome from RNA-Seq, MARS-Seq or SCRB-Seq.
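The category requirements can be planned on paper before filling in the form. The sketch below is illustrative only and does not call UTAP: validate_categories, the category names, and the sample names are all hypothetical. It checks the two conditions stated on this page (at least two categories, at least two replicates per category) and, as an extra assumption, that every assigned sample is a column in the counts matrix.

```python
def validate_categories(categories, samples):
    """Check a planned sample-to-category assignment (hypothetical helper).

    categories: dict mapping category name -> list of sample names
    samples:    sample column names taken from the counts matrix header
    """
    problems = []
    if len(categories) < 2:
        problems.append("at least two categories are required")

    known = set(samples)
    for name, members in categories.items():
        if len(members) < 2:
            problems.append(f"category '{name}' has fewer than two replicates")
        for sample in members:
            # Assumption: only samples present in the matrix can be assigned.
            if sample not in known:
                problems.append(f"sample '{sample}' is not a column in the counts matrix")
    return problems


# Made-up sample names for illustration.
matrix_samples = ["ctrl_1", "ctrl_2", "treat_1", "treat_2"]
plan = {
    "control": ["ctrl_1", "ctrl_2"],
    "treated": ["treat_1", "treat_2"],
}
plan_problems = validate_categories(plan, matrix_samples)
print(plan_problems or "assignment looks valid")
```

An empty list means the planned assignment meets the stated minimums; any message in the list points to a category that needs more samples or a sample name that does not match the matrix header.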

LINK:

Transcriptome pipeline for Weizmann Institute users:  http://utap.wexac.weizmann.ac.il

...