Gustave Roussy special launching steps

If you belong to Gustave Roussy and use the flamingo computing cluster, please use the following command lines and ignore the rest of this documentation.

# Activate conda environment
conda activate /mnt/beegfs/pipelines/unofficial-snakemake-wrappers/shared_install/snakemake_v8.4.8

# Deploy workflow with the version of your choice
snakedeploy deploy-workflow \
    https://github.com/tdayris/fair_genome_indexer . \
    --tag <version>

# Edit your configuration file if needed
vim config/config.yaml

# Use the shared genomes.csv file to avoid downloading all the available resources from the web
rsync -cvrhP /mnt/beegfs/pipelines/unofficial-snakemake-wrappers/genomes.csv config/genomes.csv

# Run snakemake command
snakemake --profile '/mnt/beegfs/pipelines/unofficial-snakemake-wrappers/profiles/slurm-web/'

Replace <version> with the latest available version of this pipeline. Available versions are listed in the workflow repository, https://github.com/tdayris/fair_genome_indexer.

Step 1 : Install Snakemake and Snakedeploy

Snakemake and Snakedeploy are best installed via the Mamba package manager (a drop-in replacement for Conda). If you have neither Conda nor Mamba, Mamba can be installed via Mambaforge. For other options, see mamba-org.

Given that Mamba is installed, run

mamba create -c conda-forge \
             -c bioconda \
             --name snakemake \
             snakemake \
             snakedeploy \
             mamba

to install both Snakemake and Snakedeploy in an isolated environment. For all following commands ensure that this environment is activated via the following command:

conda activate snakemake
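Before moving on, it can help to confirm that the activated environment really exposes both tools. The following is a minimal sketch and not part of the workflow itself; the helper name `check_tools` is hypothetical.

```shell
# Hypothetical helper (not shipped with the workflow): report which of the
# given commands can be found on the current PATH.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    # Non-zero exit status if at least one command was not found.
    return "$missing"
}

# Usage, once the environment is activated:
#   check_tools snakemake snakedeploy
```

A non-zero exit status usually means the environment is not activated or the installation did not complete.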

Step 2 : Deploy workflow

Given that Snakemake and Snakedeploy are installed and available (see Step 1), the workflow can be deployed as follows.

First, create an appropriate project working directory on your system and enter it:

mkdir -p path/to/project-workdir
cd path/to/project-workdir

In all following steps, we will assume that you are inside of that directory.

Second, run:

snakedeploy deploy-workflow \
            https://github.com/tdayris/fair_genome_indexer . \
            --tag <version>

Here, <version> is the latest available version.
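If you prefer to look up the latest tag from the command line, one possible approach is to list the remote tags with git and keep the highest one. This is a sketch under assumptions: the helper name `latest_tag` is hypothetical, it relies on `sort -V` (available in GNU coreutils), and it assumes that tags are plain version numbers, so pre-release tags may sort unexpectedly.

```shell
# Hypothetical helper: read `git ls-remote --tags --refs` output on stdin
# and print the highest tag according to version ordering.
latest_tag() {
    # Each input line looks like "<commit-hash><TAB>refs/tags/<tag>";
    # strip everything up to the tag name, sort by version, keep the last.
    sed 's|.*refs/tags/||' | sort -V | tail -n 1
}

# Usage (requires git and network access):
#   git ls-remote --tags --refs https://github.com/tdayris/fair_genome_indexer | latest_tag
```

The printed value can then be passed to `--tag` in the snakedeploy command above.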

Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module; the latter contains the configuration files that you will modify in the next step to adapt the workflow to your needs. Later, when executing the workflow, Snakemake will automatically find the main Snakefile in the workflow subfolder.
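The layout described above can be checked with a few lines of shell. This is only an illustrative sketch; the helper name `check_deployment` is hypothetical, and it tests nothing beyond the presence of workflow/Snakefile and the config folder.

```shell
# Hypothetical helper: succeed only if the given directory (default: the
# current one) contains workflow/Snakefile and a config/ folder.
check_deployment() {
    local dir="${1:-.}"
    [ -f "$dir/workflow/Snakefile" ] && [ -d "$dir/config" ]
}

# Usage from inside the project working directory:
#   check_deployment . && echo "deployment looks complete"
```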

Third, consider putting this directory under version control, e.g. by managing it via a (private) GitHub repository.

Step 3 : Configure the workflow

Edit the files config/config.yaml and config/genomes.csv according to the description available in the config/README.md file.

Step 4 : Run workflow

Given that the workflow has been properly deployed and configured, it can be executed as follows.

To run the workflow while deploying any necessary software via conda (using the Mamba package manager by default), run Snakemake with:

snakemake --cores all --software-deployment-method conda

Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.
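Before launching a long run, it is often worth previewing the planned jobs with Snakemake's `--dry-run` option. The wrapper below is a sketch rather than part of the workflow: the function name `run_with_preview` is hypothetical, and it simply reuses the options shown above.

```shell
# Hypothetical wrapper: preview the planned jobs with --dry-run first, and
# start the real run only if the preview exits successfully.
# Any extra arguments are forwarded to both snakemake calls.
run_with_preview() {
    snakemake --cores all --software-deployment-method conda --dry-run "$@" \
        && snakemake --cores all --software-deployment-method conda "$@"
}

# Usage from inside the project working directory:
#   run_with_preview
```

If the dry run reports a problem, for example a missing input file or a configuration error, the real run is not started.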

For further options, e.g. for cluster and cloud execution, see the Snakemake documentation.

Step 5 : Generate report

After finalizing your data analysis, you can automatically generate an interactive visual HTML report for inspection of results together with parameters and code inside of the browser via:

snakemake --report report.zip

The resulting report.zip file can be passed on to collaborators, provided as a supplementary file in publications, or uploaded to a service like Zenodo in order to obtain a citable DOI.