...

This pipeline is available at https://utap.wexac.weizmann.ac.il/

Before you start:

This pipeline runs on the Wexac cluster. 
Please prepare the following in advance:

  1. An account (userID) on Wexac, via your department administrator.
  2. A "Collaboration" folder within your lab folder on Wexac, with read and write permission for Bioinformatics Unit staff. This must be set up by the computing center (hpc@weizmann.ac.il).
  3. Sufficient free storage space on Wexac (> 400 GB), via your department administrator.

To run a new ATAC-seq analysis, you must first transfer the demultiplexed sequencing data (fastq files) to your Collaboration folder. Within the Collaboration folder, a directory structure will be created to hold the outputs of the analysis that you set up as described below.

Setting up a new analysis

...

Pipeline steps and associated tools:

  1. Read trimming: Reads are quality-trimmed using cutadapt. In this process, primers corresponding to the TruSeq protocol are removed (output is in folder 1).
  2. Quality control: Read quality is evaluated using FastQC (output is in folder 2), and a single report file, containing the quality reports for all of the samples, is generated using MultiQC (output is in folder 3).
  3. Mapping to genome: The quality-trimmed paired-end reads are mapped to the mouse or human genome using Bowtie2 (output is in folder 4).
  4. Alignment filtering: Following the alignment, mitochondrial reads are removed from the analysis (using the grep command). Duplicated reads are removed using picard-tools. The remaining unique reads are sorted and indexed using samtools sort and samtools index. Statistics on the alignment are generated using samtools flagstat (output is in folder 5).
  5. Select nucleosome-free fragments: Fragments shorter than 120 bp are selected (output is in folder 6).
  6. Visualization in graphs: The analyzed reads are graphically visualized using ngsplot.
  7. Peak calling: Peaks are called using MACS2.
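
The two filtering criteria in steps 4 and 5 (dropping mitochondrial alignments and keeping fragments shorter than 120 bp) can be illustrated with a small sketch. This is not the pipeline's own code: the pipeline uses grep for the mitochondrial filter and does not state which tool performs the length selection. The function name filter_sam_lines, the contig names chrM and MT, and the use of the SAM TLEN column as the fragment length are assumptions made for illustration only.

```python
def filter_sam_lines(lines, mito_names=("chrM", "MT"), max_fragment_len=120):
    """Yield SAM header lines plus alignments that pass both filters.

    Hypothetical helper, for illustration only. Assumes tab-separated SAM
    records, mitochondrial contigs named as in `mito_names`, and that the
    absolute value of TLEN (column 9) is the fragment length.
    """
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        rname = fields[2]              # column 3: reference sequence name
        tlen = abs(int(fields[8]))     # column 9: signed template length
        if rname in mito_names:
            continue                   # drop mitochondrial alignments
        if tlen == 0 or tlen >= max_fragment_len:
            continue                   # drop unknown-length or long fragments
        yield line


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6\tSO:coordinate",
        "r1\t99\tchr1\t100\t42\t50M\t=\t150\t100\tACGT\tFFFF",
        "r2\t99\tchrM\t100\t42\t50M\t=\t150\t100\tACGT\tFFFF",
        "r3\t99\tchr2\t100\t42\t50M\t=\t300\t250\tACGT\tFFFF",
    ]
    for kept in filter_sam_lines(example):
        print(kept)
```

In this example only the header and read r1 are kept: r2 maps to chrM and r3 has a 250 bp fragment. The 120 bp threshold is taken from step 5; the contig names would need to match the reference genome actually used.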

...