Material and methods

Genome information was downloaded from Ensembl. Samtools [1] and Picard [2] were used to index genome sequences. AGAT [3] was used to correct common issues found in Ensembl genome annotation files. Salmon [5] was used to produce and index a decoy-aware gentrome based on both DNA and cDNA sequences.
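
As an illustration of what a decoy-aware gentrome consists of, the sketch below concatenates a cDNA FASTA with a genome FASTA (transcripts first, genome sequences last) and writes the genome sequence names to a decoy list. The function name and all file paths are hypothetical and are not taken from the pipeline itself; it also assumes uncompressed, well-formed FASTA files.

```python
from pathlib import Path


def build_gentrome(transcriptome_fa, genome_fa, gentrome_fa, decoys_txt):
    """Write a decoy list and a transcriptome+genome FASTA (hypothetical helper)."""
    decoys = []
    with open(genome_fa) as genome:
        for line in genome:
            if line.startswith(">"):
                # Sequence name: text after '>' up to the first whitespace.
                decoys.append(line[1:].split()[0])
    Path(decoys_txt).write_text("".join(name + "\n" for name in decoys))

    # Transcripts come first, decoy (genome) sequences last.
    with open(gentrome_fa, "w") as out:
        for source in (transcriptome_fa, genome_fa):
            with open(source) as handle:
                for line in handle:
                    # Guard against a missing final newline between files.
                    out.write(line if line.endswith("\n") else line + "\n")
    return decoys
```

The two resulting files could then be passed to ``salmon index`` through its ``-t`` (transcripts) and ``-d`` (decoys) options; the exact command used by the pipeline is not shown here.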

Raw FASTQ files were trimmed using fastp [4]. Salmon [5] performed the pseudo-mapping and transcript abundance estimation. Count aggregation was performed with tximport [6].
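
To make the aggregation step concrete, here is a minimal sketch of summing transcript-level estimated counts into gene-level counts with a transcript-to-gene map. It is a simplified stand-in, not tximport's implementation: tximport itself does more than this (for example, it also accounts for transcript lengths). The function name and the identifiers in the example are hypothetical.

```python
from collections import defaultdict


def aggregate_to_genes(transcript_counts, tx2gene):
    """Sum transcript-level counts per gene (simplified illustration)."""
    gene_counts = defaultdict(float)
    for transcript_id, count in transcript_counts.items():
        gene_id = tx2gene.get(transcript_id)
        if gene_id is None:
            # Transcripts without a gene assignment are skipped.
            continue
        gene_counts[gene_id] += count
    return dict(gene_counts)


if __name__ == "__main__":
    # Hypothetical identifiers and counts, for illustration only.
    counts = {"tx1": 10.0, "tx2": 5.5, "tx3": 3.0}
    mapping = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
    print(aggregate_to_genes(counts, mapping))
```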

The whole pipeline was powered by Snakemake [7].

This pipeline is freely available on GitHub. Details about installation, usage, and results can be found on the Snakemake workflow page.

[1] Li, Heng, et al. "The sequence alignment/map format and SAMtools." Bioinformatics 25.16 (2009): 2078-2079.
[2] McKenna, Aaron, et al. "The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data." Genome Research 20.9 (2010): 1297-1303.
[3] Dainat J. AGAT: Another Gff Analysis Toolkit to handle annotations in any GTF/GFF format. (Version v0.7.0). Zenodo. https://www.doi.org/10.5281/zenodo.3552717
[4] Chen, Shifu, et al. "fastp: an ultra-fast all-in-one FASTQ preprocessor." Bioinformatics 34.17 (2018): i884-i890.
[5] Patro, Rob, et al. "Salmon provides fast and bias-aware quantification of transcript expression." Nature Methods 14.4 (2017): 417-419.
[6] Love, Michael I., Charlotte Soneson, and Mark D. Robinson. "Importing transcript abundance datasets with tximport." Dim Txi. Inf. Rep. Sample 1.1 (2017): 5.
[7] Köster, Johannes, and Sven Rahmann. "Snakemake—a scalable bioinformatics workflow engine." Bioinformatics 28.19 (2012): 2520-2522.