16.2.22 UTAP: SCRB-Seq analysis pipeline

SCRB-Seq analysis in UTAP pipelines

SCRB-Seq analysis in UTAP is done in two steps: first, demultiplex the BCL files, and second, run a transcriptome analysis. The available options are:

1A. Demultiplexing from BCL with or without pools

1B. Demultiplexing from BCL with dual-index

2. Transcriptome SCRB-Seq analysis

 

  1. Demultiplexing from BCL

Demultiplexing SCRB-Seq libraries from BCLs to fastq files can be done on either pooled or non-pooled data.

Demultiplexing of pooled data is done in two steps: the first step demultiplexes using the pool indexes, and the second demultiplexes using the sample indexes within each pool.
FastQC analysis is run after each demultiplexing step and is summarized in a MultiQC report.
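
The two-step scheme can be illustrated with a minimal Python sketch. It is not UTAP's implementation (which works on BCL files); the read tuples, the index sequences, and the function name `demultiplex_two_step` are hypothetical, and exact-match index lookup is an assumption.

```python
from collections import defaultdict

def demultiplex_two_step(reads, pool_indexes, sample_indexes):
    """Assign reads to (pool, sample) in two passes (illustrative only).

    reads: iterable of (read_id, pool_index_seq, sample_index_seq)
    pool_indexes: {pool_index_seq: pool_name}
    sample_indexes: {pool_name: {sample_index_seq: sample_name}}
    """
    # Step 1: split reads by pool index.
    by_pool = defaultdict(list)
    for read_id, pool_idx, sample_idx in reads:
        pool = pool_indexes.get(pool_idx)
        if pool is not None:  # reads with an unknown pool index are dropped
            by_pool[pool].append((read_id, sample_idx))

    # Step 2: within each pool, split reads by sample index.
    by_sample = defaultdict(list)
    for pool, pool_reads in by_pool.items():
        for read_id, sample_idx in pool_reads:
            sample = sample_indexes.get(pool, {}).get(sample_idx)
            if sample is not None:
                by_sample[(pool, sample)].append(read_id)
    return dict(by_sample)

# Hypothetical example data.
reads = [
    ("r1", "ACGT", "AAAA"),
    ("r2", "ACGT", "CCCC"),
    ("r3", "TGCA", "AAAA"),
    ("r4", "GGGG", "AAAA"),  # unknown pool index
]
pool_indexes = {"ACGT": "pool1", "TGCA": "pool2"}
sample_indexes = {
    "pool1": {"AAAA": "s1", "CCCC": "s2"},
    "pool2": {"AAAA": "s3"},
}
result = demultiplex_two_step(reads, pool_indexes, sample_indexes)
```

Note that the same sample index ("AAAA" here) can appear in different pools, because the pool index has already separated the reads before the second step.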

After demultiplexing, the separated fastq files are modified (the R1 read is switched with the R2 read) to prepare them for the transcriptome SCRB-Seq analysis, which is based on the MARS-Seq analysis.
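
As a rough illustration of the R1/R2 switch, the sketch below renames paired files so that the two reads trade places. The `_R1`/`_R2` file-name pattern, the `.fastq.gz` extension, and the function names are assumptions; how UTAP actually performs the switch is not described here.

```python
import tempfile
from pathlib import Path

def swap_read_labels(name):
    """Return the file name with the _R1/_R2 labels exchanged."""
    if "_R1" in name:
        return name.replace("_R1", "_R2", 1)
    if "_R2" in name:
        return name.replace("_R2", "_R1", 1)
    return name

def swap_r1_r2(folder):
    """Rename paired fastq files so that R1 and R2 trade places."""
    folder = Path(folder)
    files = sorted(folder.glob("*.fastq.gz"))
    # First pass: move every file to a temporary name so that
    # no file overwrites its mate.
    staged = []
    for path in files:
        tmp = path.with_name(path.name + ".swap_tmp")
        path.rename(tmp)
        staged.append((tmp, folder / swap_read_labels(path.name)))
    # Second pass: move each file to its final, swapped name.
    for tmp, target in staged:
        tmp.rename(target)

# Demo with placeholder text files (not real gzip-compressed fastq data).
with tempfile.TemporaryDirectory() as tmp_dir:
    (Path(tmp_dir) / "sampleA_R1.fastq.gz").write_text("UMI+barcode read")
    (Path(tmp_dir) / "sampleA_R2.fastq.gz").write_text("cDNA read")
    swap_r1_r2(tmp_dir)
    new_r1 = (Path(tmp_dir) / "sampleA_R1.fastq.gz").read_text()
    new_r2 = (Path(tmp_dir) / "sampleA_R2.fastq.gz").read_text()
```

After the switch, the file labelled R1 holds the cDNA read and the file labelled R2 holds the UMI+barcode read.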

Setting up a new analysis

After selecting SCRB-Seq as the sample preparation protocol you will get the following form:

Fill in the project name and select the input folder that contains your BCL files.

The output will be written to /home/labs/USER_LAB/Collaboration/your project name + runid.

If you wish to delete the BCL files after demultiplexing is completed, set the “Delete BCL files” field to yes.

Fill in the lengths of the UMI and of the barcode that you used to prepare your libraries.
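
To show why both lengths are needed, here is a minimal sketch that splits the UMI+barcode read into its two parts. The lengths (10 and 6), the example sequence, the function name, and the order of the UMI and the barcode within the read are all assumptions for illustration; use the values that match your library preparation.

```python
def split_umi_barcode(read_seq, umi_length, barcode_length, umi_first=True):
    """Split a UMI+barcode read into (umi, barcode).

    umi_first controls the assumed order of the two parts in the read.
    """
    needed = umi_length + barcode_length
    if len(read_seq) < needed:
        raise ValueError(
            f"read is {len(read_seq)} bases long, need at least {needed}"
        )
    if umi_first:
        umi = read_seq[:umi_length]
        barcode = read_seq[umi_length:needed]
    else:
        barcode = read_seq[:barcode_length]
        umi = read_seq[barcode_length:needed]
    return umi, barcode

# Hypothetical 16-base read with a 10-base UMI and a 6-base barcode.
read = "ACGTACGTACGGATCC"
umi_a, barcode_a = split_umi_barcode(read, 10, 6, umi_first=True)
umi_b, barcode_b = split_umi_barcode(read, 10, 6, umi_first=False)
```

A wrong length shifts the boundary between the two parts, so every UMI and barcode extracted from the read would be incorrect.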

 

Demultiplex non-pooled data

Fill in the table form (shown below) with your sample names and their indexes. If you have dual-index libraries, press the “Add Dual-Index” button; the table will be extended by an additional column (Index2) in which you can fill in the second index.

The data entered in this table will be used to generate the sample sheet for the demultiplexing analysis.
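
The following sketch shows how table rows could be turned into a sample sheet. The layout (a `[Data]` section with `Sample_ID`, `index`, and `index2` columns) is modeled on Illumina-style sample sheets; whether UTAP writes exactly these columns is an assumption, and the function name and sample values are hypothetical.

```python
import csv
import io

def build_sample_sheet(rows):
    """Build sample sheet text from table rows.

    rows: list of dicts with 'sample', 'index' and, optionally, 'index2'.
    """
    # Add the index2 column only if at least one row has a second index.
    dual = any(row.get("index2") for row in rows)
    header = ["Sample_ID", "index"] + (["index2"] if dual else [])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["[Data]"])
    writer.writerow(header)
    for row in rows:
        line = [row["sample"], row["index"]]
        if dual:
            line.append(row.get("index2", ""))
        writer.writerow(line)
    return out.getvalue()

# Hypothetical dual-index samples.
sheet = build_sample_sheet([
    {"sample": "sample1", "index": "ACGTAC", "index2": "TTGACA"},
    {"sample": "sample2", "index": "GGATCC", "index2": "CATGCA"},
])
single = build_sample_sheet([{"sample": "sample1", "index": "ACGTAC"}])
```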

The output folder will look as follows:

1_pools

2_fastqc

3_multiQC

 

Demultiplex pooled data

If you have pooled data, press the “Add pools” button and the following form will open:

Fill in your pool names and their indexes. If you have dual-index libraries, press the “Add Dual-Index” button; the table will be extended by an additional column (Index2) in which you can fill in the second index.

The data entered in this table will be used to generate the sample sheet for the first demultiplexing step, which is done on the pooled data.

In the second table (shown below), fill in the sample names and their indexes, and for each sample fill in the name of the pool that contains it.

 

 

The data entered in this table will be used to generate the sample sheet for the second demultiplexing step, which is done on the samples.

The output folder will look as follows:

1_pools (fastq files, where R1 contains the UMI+barcode and R2 contains the cDNA insert)

2_fastqc

3_multiQC

3_samples (fastq files, where R1 contains the cDNA insert and R2 contains the UMI+barcode)

4_fastqc

5_multiQC

 

After filling in all the required information, press the “Check sample sheet” button to validate the sample names and indexes.

If the validation is successful, press “run analysis”.
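
The exact checks behind the “Check sample sheet” button are not listed here. The sketch below shows the kind of validation such a step might perform; the allowed characters in sample names, the duplicate checks, and the function name are all assumptions.

```python
import re

def check_sample_sheet(rows):
    """Return a list of problems found in (sample_name, index) rows."""
    problems = []
    seen_names, seen_indexes = set(), set()
    for name, index in rows:
        # Assumed rule: names use letters, digits, '_' and '-' only.
        if not re.fullmatch(r"[A-Za-z0-9_-]+", name):
            problems.append(f"invalid sample name: {name!r}")
        if name in seen_names:
            problems.append(f"duplicate sample name: {name!r}")
        # Assumed rule: indexes contain only the bases A, C, G, T.
        if not re.fullmatch(r"[ACGT]+", index):
            problems.append(f"invalid index for {name!r}: {index!r}")
        if index in seen_indexes:
            problems.append(f"duplicate index: {index!r}")
        seen_names.add(name)
        seen_indexes.add(index)
    return problems

# Hypothetical rows: one clean table and one with several problems.
good_problems = check_sample_sheet([("s1", "ACGT"), ("s2", "TGCA")])
bad_problems = check_sample_sheet(
    [("s1", "ACGT"), ("s1", "ACGT"), ("s 3", "ACGN")]
)
```

For pooled data, a check like this would apply within each pool, since the same sample index may legitimately be reused in different pools.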

 

  2. SCRB-Seq transcriptome analysis with UTAP

After demultiplexing is completed, you can proceed with the analysis using the “transcriptome SCRB-Seq” pipeline (the setup form for running the pipeline is similar to that of transcriptome – MARS-Seq).
The pipeline performs the following steps:

  1. Trim adapter sequences

  2. Run FastQC for quality control of the samples (runs in parallel to the other steps)

  3. Map reads to the selected reference genome

  4. Add UMI and gene information to the reads 

  5. Quantify gene expression by counting reads

  6. Count UMIs and correct for clashes

  7. Detect differentially expressed (DE) genes for a model with a single factor
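
Step 6 above can be sketched as follows. This is an illustration under stated assumptions, not UTAP's code: "clashes" is read here as UMI collisions (two different molecules receiving the same UMI), and the correction uses the commonly cited estimate -N * ln(1 - n/N), where n is the number of distinct UMIs observed and N = 4^L is the number of possible UMIs of length L. The function names and the example records are hypothetical.

```python
import math
from collections import defaultdict

def count_umis(records):
    """Count distinct UMIs per gene.

    records: iterable of (gene, umi) pairs, one per mapped read.
    """
    umis = defaultdict(set)
    for gene, umi in records:
        umis[gene].add(umi)
    return {gene: len(seen) for gene, seen in umis.items()}

def correct_for_collisions(observed, umi_length):
    """Estimate the molecule count from the observed distinct UMI count."""
    total = 4 ** umi_length  # number of possible UMIs of this length
    if observed >= total:
        raise ValueError("all possible UMIs observed; estimate is undefined")
    return -total * math.log(1 - observed / total)

# Hypothetical reads: geneA has two reads sharing one UMI (counted once).
records = [
    ("geneA", "AAAA"),
    ("geneA", "AAAA"),
    ("geneA", "CCCC"),
    ("geneB", "GGGG"),
]
counts = count_umis(records)
corrected = {
    gene: correct_for_collisions(n, umi_length=4)
    for gene, n in counts.items()
}
```

The correction is small when few of the possible UMIs are used and grows as the observed count approaches the number of possible UMIs.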