Workflow leveraging the scanpy-scripts package to run Scanpy in a Nextflow workflow

ebi-gene-expression-group/scanpy-workflow

Run the steps of the Scanpy workflow

This is a Nextflow Workflow leveraging the scanpy-scripts package to run individual steps of the Scanpy workflow.

Setup

Conda/Bioconda

Workflow dependencies are managed via Conda and Bioconda, so you'll need to set those up first; see the instructions here.

NXF_CONDA_CACHEDIR

Nextflow reads this environment variable to decide where Conda environments are stored. You must set it before running the workflow:

export NXF_CONDA_CACHEDIR=/path/to/envs 

Nextflow

You'll also need Nextflow itself. If you don't have it already, you can install it via Conda:

conda install nextflow

You may well want to do this within a Conda environment you create for the purpose.

Run the workflow

Inputs

Expected inputs are:

  • A .zip file containing a single directory with the files matrix.mtx, barcodes.tsv and genes.tsv - i.e. the sparse MTX format.
  • A GTF file with gene IDs specifying the features you wish to filter by.
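The archive layout described in the first bullet can be checked before a run. The following is a minimal sketch, not part of the workflow itself: check_mtx_listing is a hypothetical helper name, and it assumes the archive listing is supplied on standard input, one path per line.

```shell
# Hypothetical helper: reads an archive listing (one path per line) on stdin
# and checks that it holds exactly one top-level directory containing
# matrix.mtx, barcodes.tsv and genes.tsv.
check_mtx_listing() {
    listing=$(cat)

    # Count distinct top-level path components.
    n_top=$(printf '%s\n' "$listing" | cut -d/ -f1 | sort -u | grep -c . || true)
    if [ "$n_top" -ne 1 ]; then
        echo "expected a single top-level directory, found $n_top entries" >&2
        return 1
    fi

    # Strip the top-level directory; -s drops paths that have no directory part.
    inner=$(printf '%s\n' "$listing" | cut -s -d/ -f2-)

    for f in matrix.mtx barcodes.tsv genes.tsv; do
        if ! printf '%s\n' "$inner" | grep -qxF "$f"; then
            echo "missing $f" >&2
            return 1
        fi
    done

    echo "ok"
    return 0
}
```

Assuming Info-ZIP's unzip is installed, a listing for a real archive could be piped in with `unzip -Z1 <mtx zip> | check_mtx_listing`.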

Parameters

By default, the workflow will run using parameters from the default configuration file and with the 'local' executor, i.e. one process at a time.

You can copy the default configuration, edit the Scanpy and other parameters, and provide it to Nextflow to override any of the settings. See the Nextflow documentation for executor settings.
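As an illustration of an executor override, the sketch below writes a small configuration file from the shell. The file name my.config and the choice of the 'slurm' executor are assumptions made for the example, and it assumes Nextflow layers a file passed with -config on top of the workflow's own settings. Scanpy parameter names are deliberately left out; take those from the default configuration file.

```shell
# Write a small override file. The name my.config is an arbitrary choice.
cat > my.config <<'EOF'
// Assumption: jobs are submitted to a SLURM cluster.
// Replace 'slurm' with whichever executor your site uses.
process {
    executor = 'slurm'
}
EOF
```

The resulting file can then be passed in place of <your nextflow.config> when running the workflow.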

Execution

The workflow can be run directly from the repository:

nextflow run -config <your nextflow.config> ebi-gene-expression-group/scanpy-workflow --matrix <mtx zip> --gtf <gtf> --resultsRoot <final results dir>

This will download the workflow, create any necessary environments, and run the workflow with the specified inputs. Future executions will use a cached copy of the pipeline. To update the code later, run:

nextflow pull ebi-gene-expression-group/scanpy-workflow
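Because the run depends on NXF_CONDA_CACHEDIR being set and on three input files, a short pre-flight check can catch mistakes before Nextflow starts. This is a sketch only: preflight is a hypothetical function name, and the argument order (config, matrix zip, GTF) is an arbitrary choice for the example.

```shell
# Hypothetical pre-flight check.
# Arguments: $1 = Nextflow config, $2 = MTX zip, $3 = GTF file.
preflight() {
    # The workflow expects this variable to be set (see Setup).
    if [ -z "${NXF_CONDA_CACHEDIR:-}" ]; then
        echo "NXF_CONDA_CACHEDIR is not set" >&2
        return 1
    fi

    # Each input must exist as a regular file.
    for f in "$1" "$2" "$3"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
    done

    return 0
}
```

It could be chained in front of the run command shown above, for example `preflight <your nextflow.config> <mtx zip> <gtf> && nextflow run ...`, so that the workflow only starts when the check passes.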

Outputs

Outputs will be placed in the directory defined as WORKFLOW_RESULTS_DIR under 'env' in nextflow.config ('results' by default). Outputs include:

  • Cluster definitions
  • t-SNE embeddings
  • UMAP coordinates
  • Marker genes
