NOT YOUR GRANDMA’S ANALYSIS PIPELINE
FOR USE WITH SRSLY NGS LIBRARY PREPARATION KITS
This pipeline post-processes FASTQ files from sequencing runs of libraries prepared with SRSLY (it also works with data from other library preps). Starting from raw FASTQ files, it trims reads, maps them, marks duplicates, and produces an insert length distribution plot and a summary stats file for each analyzed library.
To generate UMI-deconvoluted FASTQs from BCL files, use SRSLYumi.
STEP 1: DOWNLOAD SRSLYRUN
Install SRSLYrun with:
pip3 install srslyrun
This will also install the SRSLYumi pipeline required for UMI deconvolution. The package is compatible with Python 3 and can be installed in a virtual environment if necessary. You should also have mamba installed on your system.
MANUAL
Download the SRSLYrun code from GitHub
STEP 2: RUN SRSLYRUN
Run the following command on raw FASTQ data. If you are working with UMI-aware data, first run the SRSLYumi pipeline or perform manual deconvolution starting from the .bcl files.
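srsly runsamples --reference {path/to/reference/genome} --liblist {lib1, lib2, lib3}
or, using a library file instead of a list:
srsly runsamples --reference {path/to/reference/genome} --libfile libs.txt
where lib1, lib2, and lib3 are library IDs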
Optional arguments:
--fastqdir {dir containing fastqs to analyze} --resultsdir {output dir}
If not specified, both default to the current directory.
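As a rough sketch of preparing the --libfile input, the shell function below collects library IDs from a directory of FASTQs. It assumes, purely for illustration, that files are named <libID>_R1.fastq.gz and <libID>_R2.fastq.gz and that libs.txt holds one library ID per line. Neither assumption is confirmed by this guide, so check srsly runsamples --help for the expected format. list_library_ids is a hypothetical helper and not part of SRSLYrun; ./fastqs and ./results are placeholder directories.

```shell
# Hypothetical helper: print the unique library IDs found in a FASTQ directory.
# ASSUMPTION: files are named <libID>_R1.fastq.gz / <libID>_R2.fastq.gz.
# Adjust the suffix pattern if your files are named differently.
list_library_ids() {
    fastq_dir=$1
    for f in "$fastq_dir"/*_R[12].fastq.gz; do
        [ -e "$f" ] || continue                   # no match: the glob stays literal, skip it
        base=${f##*/}                             # strip the directory part
        printf '%s\n' "${base%_R[12].fastq.gz}"   # strip the read suffix
    done | sort -u                                # paired files give the same ID once
}

# Possible usage (ASSUMPTION: --libfile accepts one library ID per line):
#   list_library_ids ./fastqs > libs.txt
#   srsly runsamples --reference path/to/reference/genome --libfile libs.txt \
#       --fastqdir ./fastqs --resultsdir ./results
```

Writing the IDs to a file keeps the command line short when a run contains many libraries.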
This command will provide more information about usage:
srsly runsamples --help
STEP 3: ANALYZE DATA
Output files will be in the directory specified by --resultsdir above or in the current working directory, with individual directories for each library.
Output files:
Trimmed FASTQs
Mapped, duplicate-marked BAMs
Samtools flagstat output
An insert length distribution plot
A summary stats file
UMI-aware runs will also have:
Consensus reads for each UMI, generated with fgbio
A umi.bam file with corrected UMIs and UMI-aware duplicate marking
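For a quick look at the samtools flagstat output listed above, a small awk wrapper can pull out the total, mapped, and duplicate counts. This is a minimal sketch, not part of SRSLYrun: flagstat_summary is a hypothetical helper, the file name in the usage comment is made up (the pipeline's actual output names are not stated here), and the percentages are computed from QC-passed counts relative to the "in total" line.

```shell
# Hypothetical helper: summarize a samtools flagstat text file.
# Flagstat lines look like "<QC-passed> + <QC-failed> <category> ...",
# so field 1 is the QC-passed count and field 4 starts the category name.
flagstat_summary() {
    awk '
        $4 == "in" && $5 == "total" { total  = $1 }   # "N + 0 in total (...)"
        $4 == "mapped"              { mapped = $1 }   # not the "primary mapped" line
        $4 == "duplicates"          { dups   = $1 }   # not the "primary duplicates" line
        END {
            if (total + 0 > 0)
                printf "total=%s mapped=%.2f%% duplicates=%.2f%%\n",
                       total, 100 * mapped / total, 100 * dups / total
            else
                print "total=0"
        }
    ' "$1"
}

# Possible usage (file name is an assumption):
#   flagstat_summary results/lib1/lib1.flagstat.txt
```

Matching on the fourth field keeps the "primary mapped" and "primary duplicates" lines printed by newer samtools versions from overwriting the overall counts.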
For additional information about running the pipeline, contact technicalsupport@claretbio.com
Visit the GitHub page for references.