mmpust/raspir_evaluation

Performance evaluation of raspir

R scripts, Bash scripts and input tables used to evaluate the performance of raspir.

List of contents

Performance_evaluation/

Two R scripts for data analysis

  • raspir_performance_evaluation_I.R (data analysis, mock community simulation)
  • raspir_performance_evaluation_II.R (data analysis, real-world dataset)


Performance_evaluation/simulation_samples

exampleRun_mockCommunity_seed222.csv (input file, heatmap visualisation)

  • Species are given as row names and run parameters as column names.
  • Column names starting with "raspir_" show results obtained when raspir is incorporated into the alignment procedure.
  • Column names starting with "normal_" show the alignment results without raspir.
  • The number at the end of each column name (c030, c050, ...) is the number of short reads selected for the rare species of the mock community.
  • Meaning of the numerical outcome codes:
    0: True negative species
    1: True positive rare species
    2: False positive species
    3: True positive core species
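
The coding scheme above can be tallied per run with a few lines of Python. This is only a minimal sketch, not the analysis performed by the R scripts: the CSV excerpt, the placeholder species names, the exact header spelling and the helper names `parse_column` and `count_outcomes` are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Outcome codes as listed in this README.
OUTCOME_LABELS = {
    0: "true negative",
    1: "true positive (rare species)",
    2: "false positive",
    3: "true positive (core species)",
}

# Hypothetical excerpt with made-up values; the real file has more rows and
# columns, and its header spelling may differ from what is assumed here.
SAMPLE_CSV = """species,raspir_c030,normal_c030,raspir_c050,normal_c050
species_A,3,3,3,3
species_B,1,1,1,1
species_C,0,2,0,2
species_D,0,0,0,0
"""


def parse_column(name):
    """Split a column name such as 'raspir_c030' into ('raspir', 30)."""
    mode = name.split("_", 1)[0]           # 'raspir' or 'normal'
    reads = int(name.rsplit("c", 1)[1])    # digits after the final 'c'
    return mode, reads


def count_outcomes(handle):
    """Return {column name: Counter of outcome codes} for an open CSV file."""
    reader = csv.DictReader(handle)
    run_columns = reader.fieldnames[1:]    # first column holds the species
    counts = {name: Counter() for name in run_columns}
    for row in reader:
        for name in run_columns:
            counts[name][int(row[name])] += 1
    return counts


counts = count_outcomes(io.StringIO(SAMPLE_CSV))
for column, tally in counts.items():
    mode, reads = parse_column(column)
    summary = ", ".join(f"{OUTCOME_LABELS[c]}: {n}" for c, n in sorted(tally.items()))
    print(f"{mode} ({reads} reads per rare species): {summary}")
```

To run this on the real file, replace `io.StringIO(SAMPLE_CSV)` with `open("exampleRun_mockCommunity_seed222.csv", newline="")` and adjust the parsing if the header differs.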

raspir_run_statistics.csv (data analysis, clinimetric properties)

  • Contains the numerical results of all simulations, which were run with 20 different seeds for the random read generator.
  • Two different alignment tools were used (Bowtie 2 and BWA).
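
As a rough illustration of what clinimetric properties are, the sketch below derives sensitivity, specificity and precision from confusion counts. Which metrics the R scripts actually report, and how the columns of raspir_run_statistics.csv are named, is not stated in this README; the function name `clinimetrics` and the example counts are hypothetical.

```python
import math


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def clinimetrics(tp, fp, tn, fn):
    """Compute standard confusion-matrix metrics from raw counts."""
    return {
        "sensitivity": ratio(tp, tp + fn),  # share of present species detected
        "specificity": ratio(tn, tn + fp),  # share of absent species rejected
        "precision": ratio(tp, tp + fp),    # share of reported species truly present
    }


# Made-up counts, for illustration only.
example = clinimetrics(tp=8, fp=2, tn=88, fn=2)
for metric, value in example.items():
    print(f"{metric}: {value:.3f}")
```

Returning NaN for an empty denominator keeps a run with, for example, no positive calls from raising an error when many seeds are processed in a loop.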


Performance_evaluation/biological_samples/

download_fastq.sh (Bash script for downloading the biological samples with sra-explorer)


Performance_evaluation/biological_samples/alignment_output

rawCounts_merged_samples_SRR7049258 (count table with raw read counts per sample and species)
RPMM_merged_samples_SRR7049258 (count table with normalised read counts per sample and species; RPMM: normalised by genome length and sequencing depth)
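
The relation between the two tables can be sketched as follows. This README only says that RPMM accounts for genome length and sequencing depth; the formula below assumes that this means reads per million base pairs of reference genome per million sequenced reads, which should be checked against the raspir documentation. The function name `rpmm` and the numbers are hypothetical.

```python
def rpmm(raw_reads, genome_length_bp, total_reads):
    """Normalise a raw read count by genome length and sequencing depth.

    Assumed definition: reads per million base pairs of reference genome,
    per million reads sequenced in the sample.
    """
    if genome_length_bp <= 0 or total_reads <= 0:
        raise ValueError("genome length and total reads must be positive")
    per_megabase = raw_reads / (genome_length_bp / 1e6)
    return per_megabase / (total_reads / 1e6)


# Made-up example: 500 reads on a 5 Mbp genome in a sample of 2 million reads.
value = rpmm(raw_reads=500, genome_length_bp=5_000_000, total_reads=2_000_000)
print(value)
```

Normalising by both quantities makes counts comparable between species with different genome sizes and between samples sequenced to different depths.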


Performance_evaluation/biological_samples/raspir_output

Contains all data tables obtained with raspir


MockCommunity/

A) Compressed .FASTA files of the core and rare species of the mock community
Core species
eubacterium_sulci_ref.fasta.gz
pseudomonas_aeruginosa_ref.fasta.gz
rothia_mucilaginosa_ref.fasta.gz
streptococcus_salivarius_ref.fasta.gz

Rare species
escherichia_coli_ref.fasta.gz
staphylococcus_aureus_ref.fasta.gz
streptococcus_equinus_ref.fasta.gz
streptococcus_mitis_ref.fasta.gz
streptococcus_thermophilus_ref.fasta.gz
streptococcus_pneumoniae_ref.fasta.gz
