Scripts and containers to run the variant callers originally used in ONCOLINER

EUCANCan/variant-callers

ONCOLINER - Variant callers

R. Martín et al., “ONCOLINER: A new solution for monitoring, improving, and harmonizing somatic variant calling across genomic oncology centers,” Cell Genomics, vol. 4, no. 9. Elsevier BV, p. 100639, Sep. 2024. doi: 10.1016/j.xgen.2024.100639

This repository contains the scripts to run the variant callers originally used in ONCOLINER. Each variant caller is run by a Bash script that executes inside a Singularity container. The scripts are located in the executable_scripts/ folder of this repository, and the container references are given in the variant callers list below.

The scripts can be executed directly from the command line on almost any Unix-like system. The only dependency is Singularity (singularity-ce version 3.9.0 or later). The scripts are optimized for running in HPC environments without root privileges.
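As a quick way to confirm the dependency, the following sketch compares the installed Singularity version against the 3.9.0 minimum. The helper version_ge is hypothetical and not part of this repository. It relies on `sort -V` (version sort, available in GNU coreutils) and assumes the version number is the last field printed by `singularity --version`.

```shell
#!/usr/bin/env bash
# Sketch only: version_ge is a hypothetical helper, not part of this repository.

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

MIN_VERSION=3.9.0

if command -v singularity >/dev/null 2>&1; then
    # Assumption: the version number is the last field of the output,
    # as in "singularity-ce version 3.9.0".
    installed=$(singularity --version | awk '{print $NF}')
    if version_ge "$installed" "$MIN_VERSION"; then
        echo "Singularity $installed is recent enough"
    else
        echo "Singularity $installed is older than $MIN_VERSION" >&2
    fi
else
    echo "singularity not found in PATH" >&2
fi
```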

Variant callers list

| Variant caller | Variant types | Version | Singularity container | License | Notes |
|---|---|---|---|---|---|
| cgpCaVEManWrapper | SNV | 1.6.0 | oncoliner_cgpcavemanwrapper:1.16.0 | AGPL-3.0 | cgpPindel must be executed first |
| MuSE | SNV | 2.0 | oncoliner_muse:2.0 | GPL-2.0 | Does not support CRAM |
| Shimmer | SNV | | oncoliner_shimmer:latest | Custom | Does not support CRAM |
| Mutect2 (from GATK) | SNV/Indel | 4.2.6.1 | oncoliner_gatk:4.2.6.1 | Apache 2.0 | |
| SAGE | SNV/Indel | 3.0 | oncoliner_sage:3.0 | GPL-3.0 | |
| Strelka2 | SNV/Indel | 2.9.10 | oncoliner_strelka:2.9.10 | GPL-3.0 | |
| cgpPindel | Indel | 3.9.0 | oncoliner_cgppindel:3.9.0 | AGPL-3.0 | |
| SvABA | Indel/SV | 1.1.0 | oncoliner_svaba:1.1.0 | GPL-3.0 | |
| BRASS | SV | 6.3.4 | oncoliner_brass:6.3.4 | AGPL-3.0 | |
| Delly | SV | 1.1.6 | oncoliner_delly:1.1.6 | BSD-3 | |
| GRIDSS2 (with GRIPSS) | SV | 2.13.2 | oncoliner_gridss:2.13.2 / GRIPSS JAR | GPL-3.0 | |
| Manta | SV | 1.6.0 | oncoliner_manta:1.6.0 | GPL-3.0 | |

Downloading the variant callers

Downloading Singularity containers (using ORAS) does not require root privileges. To download any of the Singularity containers provided in this repository, use the following command:

singularity pull <variant_caller_name_version>.sif oras://ghcr.io/eucancan/<container_name:tag>

It is important that the container is named after the script that executes it. For example, the script executable_scripts/muse_2_0.sh requires the singularity container to be named muse_2_0.sif.
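The naming rule can be sketched as follows, using MuSE as the example. The helper sif_name_for_script is hypothetical. The container reference is assembled from the pull template above and the oncoliner_muse:2.0 entry in the variant callers list, so check it against the registry before relying on it. The sketch prints the pull command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the expected container file name from a script path.
sif_name_for_script() {
    printf '%s.sif\n' "$(basename "$1" .sh)"
}

script=executable_scripts/muse_2_0.sh
sif_name=$(sif_name_for_script "$script")

# Print the pull command instead of running it, so it can be reviewed first.
# Assumption: the container lives at ghcr.io/eucancan/oncoliner_muse:2.0.
echo "singularity pull $sif_name oras://ghcr.io/eucancan/oncoliner_muse:2.0"
```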

WARNING: Your institution may not allow you to download files directly from computing nodes. If that is the case, download the container on a different machine and then copy it to the computing node. For example, you could download the container on your local machine and then copy it to the computing node using scp:

scp <variant_caller_name_version>.sif <username>@<hostname>:<path_to_singularity_containers_storage_dir>

Executing the variant callers

Running Singularity containers does not require root privileges. All the scripts to execute the variant callers are located in the executable_scripts/ folder of this repository. Each script is named after the variant caller it executes and its version. For example, the script to execute MuSE v2.0 is executable_scripts/muse_2_0.sh.

Parameters

All the scripts take the following positional parameters, in this order:

$WORKING_DIR # path to working directory
$OUTPUT_DIR # path to output directory
$EXTRA_DATA_DIR # path to extra data directory
$REF_VERSION # reference version (e.g. 37)
$NORMAL_SAMPLE # path to normal sample SAM/BAM/CRAM file
$TUMOR_SAMPLE # path to tumor sample SAM/BAM/CRAM file
$FASTA_REF # path to reference FASTA file
$NUM_CORES # number of cores to use
$MAX_MEMORY # maximum memory to use (in GB) (e.g. 8)
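Because the parameters are positional, a swapped or missing argument is easy to overlook. The following is a minimal pre-flight sketch that could be run before launching a caller. The function check_params is hypothetical and not part of this repository, and the checks it performs (argument count, input files present, numeric core and memory values) are assumptions about what is worth validating.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of this repository.
# Usage: check_params WORKING_DIR OUTPUT_DIR EXTRA_DATA_DIR REF_VERSION \
#                     NORMAL_SAMPLE TUMOR_SAMPLE FASTA_REF NUM_CORES MAX_MEMORY
check_params() {
    if [ "$#" -ne 9 ]; then
        echo "expected 9 parameters, got $#" >&2
        return 1
    fi
    # NORMAL_SAMPLE, TUMOR_SAMPLE and FASTA_REF must be existing files.
    local f
    for f in "$5" "$6" "$7"; do
        if [ ! -f "$f" ]; then
            echo "file not found: $f" >&2
            return 1
        fi
    done
    # NUM_CORES and MAX_MEMORY must contain only digits and must not be 0.
    local n
    for n in "$8" "$9"; do
        case "$n" in
            ''|*[!0-9]*|0) echo "not a positive integer: $n" >&2; return 1 ;;
        esac
    done
}
```

It would be called with the same nine values that are later passed to the variant caller script, for example `check_params "$WORKING_DIR" "$OUTPUT_DIR" "$EXTRA_DATA_DIR" "$REF_VERSION" "$NORMAL_SAMPLE" "$TUMOR_SAMPLE" "$FASTA_REF" "$NUM_CORES" "$MAX_MEMORY"`.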

Extra data

Some variant callers require extra data to be executed. The extra data required by each variant caller is available in the required_extra_data/ folder of this repository. If you run the variant caller from the root of this repository, you can set the $EXTRA_DATA_DIR environment variable with the following command:

export EXTRA_DATA_DIR=required_extra_data

Note: Due to size limitations, some files are not available in this repository and need to be downloaded from external sources. For these cases, a file with the same name but ending with .download will be present instead. This file contains the instructions and links to download the file.
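To see which files still need to be fetched, the placeholders can be listed with a short sketch like the one below. The function list_pending_downloads is hypothetical. It assumes that the real file belongs next to its placeholder under the same name with the .download suffix removed, which may not match the actual layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list extra-data files that still need downloading.
# Prints each .download placeholder whose target file is not yet present.
# Assumption: the target is the placeholder path minus the .download suffix.
list_pending_downloads() {
    local dir=$1 placeholder
    find "$dir" -type f -name '*.download' | sort | while IFS= read -r placeholder; do
        if [ ! -e "${placeholder%.download}" ]; then
            printf '%s\n' "$placeholder"
        fi
    done
}

extra_dir=${EXTRA_DATA_DIR:-required_extra_data}
if [ -d "$extra_dir" ]; then
    list_pending_downloads "$extra_dir"
fi
```

Each listed file can then be opened to read its download instructions.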

Example of execution

The following example shows how to execute any of the variant callers from the root of this repository:

WORKING_DIR=/path/to/working/directory
OUTPUT_DIR=/path/to/output/directory
EXTRA_DATA_DIR=./required_extra_data
REF_VERSION=37
NORMAL_SAMPLE=/path/to/normal/sample.bam
TUMOR_SAMPLE=/path/to/tumor/sample.bam
FASTA_REF=/path/to/reference.fasta
NUM_CORES=8
MAX_MEMORY=32

singularity exec -e <SINGULARITY_CONTAINER> bash ./executable_scripts/variant_caller_X_X_X.sh "$WORKING_DIR" "$OUTPUT_DIR" "$EXTRA_DATA_DIR" "$REF_VERSION" "$NORMAL_SAMPLE" "$TUMOR_SAMPLE" "$FASTA_REF" "$NUM_CORES" "$MAX_MEMORY"

# The above command might not work in some HPC environments. In that case, you can use the following command instead, which explicitly binds every directory the script needs:
singularity exec -c --bind "$WORKING_DIR,$OUTPUT_DIR,$EXTRA_DATA_DIR,$(dirname "$NORMAL_SAMPLE"),$(dirname "$TUMOR_SAMPLE"),$(dirname "$FASTA_REF")" <SINGULARITY_CONTAINER> bash ./executable_scripts/variant_caller_X_X_X.sh "$WORKING_DIR" "$OUTPUT_DIR" "$EXTRA_DATA_DIR" "$REF_VERSION" "$NORMAL_SAMPLE" "$TUMOR_SAMPLE" "$FASTA_REF" "$NUM_CORES" "$MAX_MEMORY"
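When the normal and tumor samples share a directory, the bind list in the second command repeats that path. The sketch below shows one way to build the comma-separated list without repeats. The function build_bind_list is hypothetical and not part of this repository, and the sample paths are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join paths into a comma-separated list without repeats,
# suitable as the value of singularity's --bind option.
build_bind_list() {
    local list="" path
    for path in "$@"; do
        case ",$list," in
            *",$path,"*) ;;                       # already listed, skip it
            *) list="${list:+$list,}$path" ;;     # append, adding a comma if needed
        esac
    done
    printf '%s\n' "$list"
}

# Placeholder paths for illustration only.
NORMAL_SAMPLE=/path/to/samples/normal.bam
TUMOR_SAMPLE=/path/to/samples/tumor.bam
FASTA_REF=/path/to/reference.fasta

build_bind_list \
    "$(dirname "$NORMAL_SAMPLE")" \
    "$(dirname "$TUMOR_SAMPLE")" \
    "$(dirname "$FASTA_REF")"
# prints: /path/to/samples,/path/to
```

The result can be stored in a variable and passed as `--bind "$BIND_PATHS"` in place of the inline list.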
