NOT your grandma’s ANALYSIS PIPELINE

For use with SRSLY NGS Library Preparation Kits

This pipeline post-processes FASTQ files from sequencing libraries prepared with SRSLY (it also works with data from other library preps). Starting from raw FASTQ files, it trims and maps reads, marks duplicates, and produces an insert length distribution plot and a summary statistics file for each analyzed library.
To generate UMI-deconvoluted FASTQs from BCL files, use SRSLYumi.

SRSLYRUN software workflow. Raw FASTQ or BCL files are processed to generate a comprehensive data output. Steps in teal indicate processing of sequencing data without UMI and steps in yellow indicate processing of UMI-aware sequencing data. SRSLYumi must be run prior to processing UMI-aware data.

STEP 1: DOWNLOAD SRSLYRUN

Install SRSLYRUN with: 

pip3 install srslyrun

This will also install the SRSLYumi pipeline required for UMI deconvolution. The package is compatible with Python 3 and can be installed in a virtual environment if necessary. You should also have mamba installed on your system. 
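The prerequisites above can be checked before installing. The following is a minimal sketch, not part of the official instructions: check_tool and install_in_venv are hypothetical helper names, and "srsly-env" is an assumed directory name for the virtual environment.

```shell
# Report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Create a virtual environment and install srslyrun into it.
# "srsly-env" is a hypothetical directory name; pass another as $1 if preferred.
install_in_venv() {
    env_dir="${1:-srsly-env}"
    python3 -m venv "$env_dir" &&
    . "$env_dir/bin/activate" &&
    pip3 install srslyrun
}

# Check the prerequisites named above; a missing tool is reported, not fatal.
for tool in python3 pip3 mamba; do
    check_tool "$tool" || true
done
```

Call install_in_venv (optionally with a directory name) only if you want the package isolated from your system Python; otherwise the plain pip3 command above is sufficient.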

MANUAL

Alternatively, download the SRSLYrun code from GitHub and install it manually.

STEP 2: RUN SRSLYRUN

Run the following command on raw FASTQ data. If you are working with UMI-aware data, first run the SRSLYumi pipeline or perform manual deconvolution starting from the .bcl files.

srsly runsamples --reference {path/to/reference/genome}  --liblist {lib1, lib2, lib3} OR --libfile libs.txt

where lib1, lib2, and lib3 are library IDs. Supply the library IDs either directly with --liblist or in a file with --libfile.
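As an illustration, a run driven by a library file might look like the sketch below. It assumes that the file passed to --libfile lists one library ID per line, which this guide does not state; confirm the expected format with srsly runsamples --help. The library IDs and the reference path are placeholders.

```shell
# Hypothetical library IDs; replace with your own.
# ASSUMPTION: the file passed to --libfile lists one library ID per line.
printf '%s\n' lib1 lib2 lib3 > libs.txt

# Placeholder path; point this at your reference genome.
REFERENCE="path/to/reference/genome"

# Only attempt the run if the srsly command is installed.
if command -v srsly >/dev/null 2>&1; then
    srsly runsamples --reference "$REFERENCE" --libfile libs.txt
else
    echo "srsly not found on PATH; install it with: pip3 install srslyrun" >&2
fi
```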

Optional arguments:

 --fastqdir {dir containing fastqs to analyze} --resultsdir {output dir}  

If not specified, both default to the current working directory.
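A run with both optional arguments set could be sketched as follows. The directory names "fastqs" and "srsly_results" and the reference path are hypothetical, and the sketch assumes a libs.txt file with your library IDs already exists in the current directory.

```shell
# Placeholder values; none of these paths come from this guide.
REFERENCE="path/to/reference/genome"
FASTQ_DIR="fastqs"            # hypothetical directory holding the raw FASTQs
RESULTS_DIR="srsly_results"   # hypothetical output directory

# Precaution: this guide does not say whether srsly creates the output directory.
mkdir -p "$RESULTS_DIR"

# Only attempt the run if the srsly command is installed.
if command -v srsly >/dev/null 2>&1; then
    srsly runsamples --reference "$REFERENCE" --libfile libs.txt \
        --fastqdir "$FASTQ_DIR" --resultsdir "$RESULTS_DIR"
else
    echo "srsly not found on PATH; install it with: pip3 install srslyrun" >&2
fi
```

Keeping results in a dedicated directory avoids mixing the per-library output directories with the input FASTQs.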

This command will provide more information about usage:

 srsly runsamples --help

STEP 3: ANALYZE DATA

Output files are written to the directory specified by --resultsdir (or to the current working directory if it was not set), with a separate directory for each library.

 Output files:

  • Trimmed FASTQs

  • Mapped, duplicate-marked BAMs

  • Samtools flagstat output

  • An insert length distribution plot

  • A summary stats file

 UMI-aware runs will also have:

  • Consensus reads for each UMI, generated with fgbio

  • A umi.bam file with corrected UMIs and UMI-aware duplicate marking
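To get a quick overview of what a run produced, the per-library directories can be listed with a small helper. This is a sketch only: list_library_dirs is a hypothetical function name, and because exact output file names are not given in this guide, it just counts the files in each directory.

```shell
# Print each immediate subdirectory of a results directory with its file count.
# Usage: list_library_dirs [results_dir]   (defaults to the current directory)
list_library_dirs() {
    results_dir="${1:-.}"
    for dir in "$results_dir"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$dir" ] || continue
        n=$(find "$dir" -type f | wc -l | tr -d ' ')
        echo "$(basename "$dir"): $n file(s)"
    done
}

# RESULTS_DIR is an assumed variable name; set it to your --resultsdir value.
list_library_dirs "${RESULTS_DIR:-.}"
```

A library directory reporting zero files is a hint that the run for that library did not complete.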

For additional information about running the pipeline, contact technicalsupport@claretbio.com

Visit the GitHub page for references.