16.2.22 UTAP: FAQ
Frequently Asked Questions
Setup
What do I need to prepare before setting up a UTAP run?
To run a UTAP analysis at the Institute, you must first set up an environment on the wexac cluster. As noted there, you can provide sequenced data from either an in-house or an external source:
Sequence your samples using the NextSeq sequencers at the Life Sciences Core Facilities Sandbox, which produces structured fastq files, and then upload and analyze those files using UTAP - OR
Upload the structured fastq files from a different source directly to UTAP
Analyses
1. Please describe UTAP’s transcriptome analysis pipeline.
See the Bioinformatics pipeline methods section of the RNA-seq and MARS-seq report.html output files on our demo site (these will be updated to match UTAP Version 1.0.9). This information is included in the report.html output file of every UTAP run and is discussed in detail in our paper (Kohen et al., as cited below).
2. Can I change the thresholds for detecting differentially expressed genes?
Yes, the software provides this option, as described here, which many experienced users find helpful. Another approach is to post-process the DESeq_all_results.xls output file and apply your own thresholds.
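The post-processing approach can be sketched as follows. This is a minimal illustration, assuming a tab-delimited text version of the results table and the standard DESeq2 column names log2FoldChange and padj. The gene identifier column name (Gene), the cutoff values, and the function name filter_degs are assumptions; check the header of your own file and adjust them.

```python
import csv
import io
import math


def filter_degs(handle, padj_max=0.05, abs_lfc_min=1.0,
                padj_col="padj", lfc_col="log2FoldChange"):
    """Return the rows of a tab-delimited results table that pass both cutoffs."""
    selected = []
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            padj = float(row[padj_col])
            lfc = float(row[lfc_col])
        except (TypeError, ValueError):
            continue  # empty or non-numeric cell such as "NA"
        if math.isnan(padj) or math.isnan(lfc):
            continue
        if padj <= padj_max and abs(lfc) >= abs_lfc_min:
            selected.append(row)
    return selected


# Tiny in-memory table standing in for a real results file (values are made up).
example = io.StringIO(
    "Gene\tlog2FoldChange\tpadj\n"
    "geneA\t2.5\t0.001\n"
    "geneB\t0.3\t0.0005\n"
    "geneC\t-1.8\t0.2\n"
    "geneD\t-3.1\tNA\n"
)
degs = filter_degs(example)
print([row["Gene"] for row in degs])  # ['geneA']
```

For a real file, open it with open(path, newline="") and pass the handle to filter_degs. A missing column raises a KeyError rather than silently returning an empty list, so a mismatch in column names is noticed early.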
3. How are time-course experiments analyzed?
The DESeq2 model currently used by UTAP does not support time-course analyses. However, additional downstream analyses or packages such as maSigPro can be applied to UTAP’s output.
Results/Reports
1. Where do I find resulting gene counts and DESeq2 analysis tables?
DESeq2 analysis results are found in the report.html and DESeq_all_results.txt files (or DESeq_all_results.xls, if requested). These files are located in the reports subfolders of the output folder, which is named after the run, under /home/labs/<lab_name>/Collaboration/<output_folder>/
MARS-Seq analysis results are available in the 10_reports sub-folder
RNA-Seq results are found in the 4_reports sub-folder
A link to the relevant report file is also provided on the UTAP site's results page for each completed analysis.
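If you prefer to locate the report files from a script, the folder layout described above can be searched as in the sketch below. The helper name find_reports and the demo folder names (my_run, example_subfolder) are hypothetical; only the 10_reports and 4_reports subfolder names come from the answer above.

```python
import tempfile
from pathlib import Path

# Reports subfolder per pipeline, as listed in the answer above.
REPORTS_SUBFOLDER = {"mars-seq": "10_reports", "rna-seq": "4_reports"}


def find_reports(output_folder, pipeline, filename="report.html"):
    """Return sorted paths of `filename` under the pipeline's reports subfolder."""
    reports_dir = Path(output_folder) / REPORTS_SUBFOLDER[pipeline.lower()]
    if not reports_dir.is_dir():
        return []
    return sorted(reports_dir.rglob(filename))


# Demo on a throwaway directory tree (hypothetical layout, not a real run).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "my_run" / "4_reports" / "example_subfolder"
    demo.mkdir(parents=True)
    (demo / "report.html").write_text("<html></html>")
    found = find_reports(Path(tmp) / "my_run", "RNA-seq")
    relative = [p.relative_to(tmp).as_posix() for p in found]
print(relative)
```

On wexac, the first argument would be your own /home/labs/<lab_name>/Collaboration/<output_folder>/ path, and the same call with filename="DESeq_all_results.txt" would locate the results tables.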
2. Where do I see the list of differentially expressed genes?
The list of DEGs can be filtered and extracted from the complete gene lists provided in the DESeq_all_results.txt, DESeq_all_results.xls, and report.html files described above.
3. Where can I see which parameters were used for my run (e.g., genome version, DESeq2 groups, thresholds)?
The versions of all tools used in the analysis are detailed in the report.html file, in its Bioinformatics pipeline methods section.
The versions of all packages (including DESeq2) used to analyse the differentially expressed genes are detailed in the sessionInfo.txt file found in the output folder’s relevant reports subfolder.
Thresholds used in the DEGs analysis are detailed in the report.html file, in its Differential Expression Analysis section.
MARS-Seq parameters are provided in the sessionInfo.txt file found in the output folder’s relevant 10_reports subfolder
RNA-seq parameters are provided in the report.html file found in the output folder’s relevant 4_reports subfolder
The different DESeq2 groups are provided in the phenodata file in the output folder, or in the relevant sample description .csv file in the reports folder.
DESeq2 comparisons are provided in the relevant comparisons .csv file in the reports folder.
4. How do I download my fastq files from a sequencing run?
Upon completion of each sequencing run, you will receive a customized email describing exactly where your files reside on stefan and listing your retrieval options, including using wget from wexac or another Linux machine.
5. Is there a way to get the .bam files from the UTAP run? I see that it was deleted from the results folder, and I'd like to view it in IGV.
You can retrieve them from the 4_mapping sub-folder.
6. How do I save resulting UTAP reports from its website to my desktop?
Note that UTAP reports that are shown on the web are also available in your Collaboration folder on wexac. There are a number of options to save a copy on your PC.
While viewing the relevant report.html file, right-click and choose the Save as… option (or press Ctrl+S), and then browse to the directory where you wish to save the file. You will then be able to view it locally in your browser by clicking on it OR
Use an sftp client (like bitvise or WinSCP), or whichever client your IT department recommends, to access your directory on wexac and drag/drop the relevant files to your desktop
General
How should I cite UTAP in my paper? Please cite our manuscript: (Kohen et al) UTAP: User-friendly Transcriptome Analysis Pipeline. BMC Bioinformatics 2019, 20(1):154 (PMID: 30909881)
Are UTAP runs free? Yes
Can UTAP run on bacterial genomes? Currently, UTAP does not run on bacterial genomes.
How can one add a genome which is not listed? Please contact our team at utap@weizmann.ac.il
Advanced
1. What are the ramifications of re-running DESeq2? Rerunning DESeq2 starts from the beginning and does not use any of the results from the previous run(s).
2. How does UTAP handle normalizations and UMI correction?
DESeq2 generates two counts matrices from htseq, one without UMI correction called countsMatrix.txt, and another with UMI correction called countsCorrectedMatrix.txt.
DESeq2 analysis (including normalization) is done on both of the matrices, and two results folders are generated respectively.
The output folder (e.g. report_output_20200427_154328) contains the DESeq2 results without UMI correction. The umi output folder (e.g. report_umi_counts_output_20200427_154328) contains the DESeq2 results with UMI correction.
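When a reports folder holds several runs, the two result folders of each run can be paired by their timestamp, as in the sketch below. The naming pattern is inferred only from the two example folder names above and is therefore an assumption, as are the helper name pair_result_folders and the labels it uses.

```python
import re

# Pattern inferred from the two example folder names above (an assumption).
FOLDER_PATTERN = re.compile(r"^report_(umi_counts_)?output_(\d{8}_\d{6})$")


def pair_result_folders(names):
    """Map each timestamp to its uncorrected and UMI-corrected folder names."""
    pairs = {}
    for name in names:
        match = FOLDER_PATTERN.match(name)
        if not match:
            continue  # not a DESeq2 results folder
        kind = "umi_corrected" if match.group(1) else "uncorrected"
        pairs.setdefault(match.group(2), {})[kind] = name
    return pairs


pairs = pair_result_folders([
    "report_output_20200427_154328",
    "report_umi_counts_output_20200427_154328",
    "sessionInfo.txt",
])
for timestamp, folders in pairs.items():
    print(timestamp, folders)
```

In practice the list of names would come from a directory listing of the relevant reports subfolder, for example via os.listdir.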
3. Are there any advantages to using merged genomes in UTAP?
Yes. Note that counting of UMIs (critical for the MARS-seq library) requires that reads align uniquely to a genome; if a read aligns equally well to two different genomes, it is not counted toward any gene. So for example, if you’ve aligned your fastq files to both human and mouse genomes, then using UTAP with the merged human and mouse genome will provide the unique read or UMI counts for both human and mouse genes.
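The effect of the uniqueness rule can be illustrated with a toy example. This is not UTAP's implementation, which relies on its aligner and counting tools; the read names, genes, alignment scores, and the function count_unique are all invented for illustration. A read is counted only when exactly one hit has the best score.

```python
from collections import Counter


def count_unique(alignments):
    """Count reads per (genome, gene), skipping reads whose best score is tied."""
    counts = Counter()
    for hits in alignments.values():
        if not hits:
            continue
        best = max(score for _, _, score in hits)
        top = [(genome, gene) for genome, gene, score in hits if score == best]
        if len(top) == 1:  # uniquely best alignment
            counts[top[0]] += 1
    return counts


# Hypothetical hits against a merged human + mouse genome: (genome, gene, score).
reads = {
    "read1": [("human", "GAPDH", 60), ("mouse", "Gapdh", 42)],  # clearly human
    "read2": [("human", "ACTB", 60), ("mouse", "Actb", 60)],    # tie, not counted
    "read3": [("mouse", "Actb", 58)],                           # mouse only
}
counts = count_unique(reads)
print(dict(counts))
```

Here read2 aligns equally well to both species and is dropped, while read1 and read3 are each assigned to a single gene in a single genome.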