Material and methods

Genome information was downloaded from Ensembl. Samtools [1] and Picard [2] were used to index genome sequences. AGAT [3] was used to correct common issues found in Ensembl genome annotation files. Salmon [5] was used to produce and index a decoy-aware gentrome based on both DNA and cDNA sequences.
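The construction of a decoy-aware gentrome can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function name ``build_gentrome`` and the output file names ``decoys.txt`` and ``gentrome.fa`` are assumptions chosen for the example. It lists the genome sequence names as decoys and concatenates the cDNA sequences with the genome, transcripts first.

```python
from pathlib import Path


def build_gentrome(cdna_fasta: Path, genome_fasta: Path, out_dir: Path) -> tuple[Path, Path]:
    """Write a decoy list and a gentrome FASTA (hypothetical helper)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    decoys = out_dir / "decoys.txt"      # assumed file name
    gentrome = out_dir / "gentrome.fa"   # assumed file name

    # Decoy names are the genome sequence identifiers:
    # the header text after ">" up to the first whitespace.
    with genome_fasta.open() as genome, decoys.open("w") as out:
        for line in genome:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    out.write(fields[0] + "\n")

    # Transcripts come first, followed by the genome sequences used as decoys.
    with gentrome.open("w") as out:
        for source in (cdna_fasta, genome_fasta):
            with source.open() as handle:
                for line in handle:
                    # Guard against a missing final newline between the two files.
                    out.write(line if line.endswith("\n") else line + "\n")

    return decoys, gentrome
```

The two resulting files would then typically be passed to ``salmon index`` through its ``-t`` (transcripts) and ``-d`` (decoys) options, together with ``-i`` for the index directory.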

Raw FASTQ files were trimmed using fastp [4]. Salmon [5] performed the pseudo-mapping and estimated transcript abundance. Count aggregation was performed with tximport [6].
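To illustrate what count aggregation means here, the sketch below sums the ``NumReads`` column of a Salmon ``quant.sf`` file over the transcripts of each gene. It is a simplified Python stand-in, not tximport itself (an R package that also offers length-aware scaling). The function name ``gene_level_counts`` and the ``tx2gene`` dictionary are assumptions made for the example. Transcript identifiers in the mapping are assumed to match the ``Name`` column exactly, and unmapped transcripts are skipped.

```python
import csv
from collections import defaultdict
from pathlib import Path


def gene_level_counts(quant_sf: Path, tx2gene: dict[str, str]) -> dict[str, float]:
    """Sum Salmon's NumReads over all transcripts of each gene (hypothetical helper)."""
    totals: dict[str, float] = defaultdict(float)
    with quant_sf.open(newline="") as handle:
        # quant.sf is tab-separated with a header line:
        # Name, Length, EffectiveLength, TPM, NumReads
        for row in csv.DictReader(handle, delimiter="\t"):
            gene = tx2gene.get(row["Name"])
            if gene is not None:  # assumption: transcripts without a gene are ignored
                totals[gene] += float(row["NumReads"])
    return dict(totals)
```

For example, two transcripts of the same gene with 5.0 and 7.5 estimated reads would contribute 12.5 reads to that gene.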

The whole pipeline was powered by Snakemake [7].

This pipeline is freely available on GitHub. Details about installation, usage, and results can be found on the Snakemake workflow page.

[1]	Li, Heng, et al. "The sequence alignment/map format and SAMtools." Bioinformatics 25.16 (2009): 2078-2079.

[2]	McKenna, Aaron, et al. "The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data." Genome research 20.9 (2010): 1297-1303.

[3]	Dainat J. AGAT: Another Gff Analysis Toolkit to handle annotations in any GTF/GFF format. (Version v0.7.0). Zenodo. https://www.doi.org/10.5281/zenodo.3552717

[4]	Chen, Shifu, et al. "fastp: an ultra-fast all-in-one FASTQ preprocessor." Bioinformatics 34.17 (2018): i884-i890.

[5]	Patro, Rob, et al. "Salmon provides fast and bias-aware quantification of transcript expression." Nature methods 14.4 (2017): 417-419.

[6]	Love, Michael I., Charlotte Soneson, and Mark D. Robinson. "Importing transcript abundance datasets with tximport." Dim Txi. Inf. Rep. Sample 1.1 (2017): 5.

[7]	Köster, Johannes, and Sven Rahmann. "Snakemake—a scalable bioinformatics workflow engine." Bioinformatics 28.19 (2012): 2520-2522.
