
A phylogenetic analysis from scRNA-seq data using both expression levels and identified SNVs


bioDS/phyloRNAanalysis


Cancer phylogenetics using single-cell RNA-seq data

This repository contains the code needed to fully replicate the analysis in Cancer phylogenetics using single-cell RNA-seq data (Moravec et al. 2021). Alternatively, it can be used to perform a similar analysis on a new dataset.

Note that the analysis assumes relatively uniform cell populations; otherwise, the discretization method based on the Highest Density Interval will not work.
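
To make the note above more concrete, here is a minimal sketch of discretization based on a Highest Density Interval (HDI). It is not the code used by this repository (that lives in the phyloRNA package). The function names, the default `prob` value and the three-letter coding (`L`, `M`, `H`) are assumptions made for illustration only.

```python
import math


def hdi(values, prob=0.9):
    """Return the narrowest interval that contains `prob` of the values.

    Illustrative only: `prob=0.9` is an assumed default, not a value
    taken from the analysis.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("values must not be empty")
    # number of data points that must fall inside the interval
    k = max(1, math.ceil(prob * n))
    # slide a window of k points over the sorted data, keep the narrowest
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


def discretize(values, prob=0.9):
    """Code each value as below (L), inside (M) or above (H) the HDI.

    The letters are hypothetical labels chosen for this sketch.
    """
    low, high = hdi(values, prob)
    return ["L" if v < low else "H" if v > high else "M" for v in values]
```

For example, `discretize([1, 5, 5, 6, 6, 7, 7, 20], prob=0.75)` marks only the two extreme values as `L` and `H`. With a strongly mixed population, the values no longer concentrate around a single mode and an interval of this kind stops separating typical from unusual values.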

Requirements

System requirements

  • Linux operating system
  • at least 30 GB RAM
  • about 400 GB of free space for intermediate files and results
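
The requirements above can be checked before starting a multi-day run. The following is a rough sketch, assuming a Linux machine and that intermediate files are written under the directory you check. The function names are hypothetical and not part of this repository.

```python
import os
import shutil

MIN_RAM_GB = 30    # from the system requirements above
MIN_DISK_GB = 400  # from the system requirements above


def total_ram_gb():
    """Total physical memory in GB (Linux only, read via sysconf)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9


def free_disk_gb(path="."):
    """Free space in GB on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


def check_system(ram_gb, disk_gb):
    """Return a list of problems; the list is empty if both limits are met."""
    problems = []
    if ram_gb < MIN_RAM_GB:
        problems.append(f"only {ram_gb:.0f} GB RAM, need {MIN_RAM_GB} GB")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"only {disk_gb:.0f} GB free, need {MIN_DISK_GB} GB")
    return problems


if __name__ == "__main__":
    for problem in check_system(total_ram_gb(), free_disk_gb(".")):
        print(problem)
```

Run it from the directory where the analysis will write its output, because free space is reported per filesystem.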

Required software

R, python3, Cellranger, bamtofastq, GATK, VCFtools, IQtree, BEAST2
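
A quick way to see which of these tools are missing is to look them up on `PATH`. This is a small sketch. The executable names below are assumptions, since the names on your system depend on how each tool was installed (for example, the IQ-TREE binary may be called `iqtree` or `iqtree2`).

```python
import shutil

# Executable names are assumptions; adjust them to your installation.
TOOLS = [
    "Rscript", "python3", "cellranger", "bamtofastq",
    "gatk", "vcftools", "iqtree", "beast",
]


def missing_tools(names=TOOLS):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    for name in missing_tools():
        print(f"not found on PATH: {name}")
```

This only confirms that an executable with that name exists. It does not check versions or whether the tool runs correctly.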

R packages:

phyloRNA, beter, data.table, devtools

Python packages:

pysam

Required files:

Original data published in the GEO database under accession number GSE163210.

Human reference genome GRCh38v15, annotation and known variants.

Code from this repository.

Running the analysis

Once you have installed the required software and prepared your data, navigate to the analysis directory and type:

Rscript run.r

The analysis should finish after a few days.

Processed files

Pre-processed FASTA files, trees and tests of phylogenetic clustering are available in the processed_files branch. These files are tracked with the Git Large File Storage (LFS) extension.

Detailed instructions

Need help?

If anything is unclear or you need help with the analysis, raise an issue.
