Nextflow script for base quality score recalibration of bam files using GATK


Nextflow pipeline for base quality score recalibration with GATK processing

Workflow representation


Nextflow pipeline for base quality score recalibration (BQSR) and quality control of BAM files using GATK


  1. Nextflow: for common installation procedures, see the IARC-nf repository.

  2. MultiQC

  3. GATK4, which must be available in the PATH variable

  4. GATK bundle VCF files with lists of known indels and SNVs (recommended: 1000 Genomes indels, dbSNP VCF)
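
Before launching, it can help to confirm that the command-line tools listed above are reachable from PATH. The following is a minimal sketch and not part of the pipeline: the function name `missing_tools` is hypothetical, and the executable names `nextflow`, `multiqc`, and `gatk` are assumed to be the ones your installation provides.

```shell
# Print the name of every tool that cannot be found in PATH.
# missing_tools is a hypothetical helper, not something shipped with BQSR-nf.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed executable names for the dependencies listed above.
missing=$(missing_tools nextflow multiqc gatk)
if [ -n "$missing" ]; then
  echo "Not found in PATH:" $missing >&2
fi
```

The check only verifies that the executables exist; it does not verify versions or the presence of the GATK bundle VCF files.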

You can provide a config file to customize the multiqc report (see


| Type | Description |
|------|-------------|
| --input_folder | a folder with bam files |


  • Mandatory

| Name | Example value | Description |
|------|---------------|-------------|
| --ref | ref.fa | reference genome fasta file for GATK |
  • Optional

| Name | Default value | Description |
|------|---------------|-------------|
| --cpu | 2 | number of CPUs |
| --mem | 32 | memory for mapping |
| --output_folder | . | output folder for aligned BAMs |
| --snp_vcf | dbsnp.vcf | VCF file with known variants for GATK BQSR |
| --indel_vcf | Mills_100G_indels.vcf | VCF file with known indels for GATK BQSR |
| --multiqc_config | null | config yaml file for multiqc |
  • Flags

| Name | Description |
|------|-------------|
| --help | print usage and optional parameters |


To run the pipeline on a set of BAM files in the folder bam, with an indexed reference genome at ref.fa and known SNPs and indels from the GATK bundle, type:

nextflow run iarcbioinfo/BQSR-nf --input_folder bam --ref ref.fa --snp_vcf GATK_bundle/dbsnp_146.hg38.vcf.gz --indel_vcf GATK_bundle/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
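
The same invocation can be wrapped so that the optional parameters from the tables above are set explicitly. This is a sketch only: the `run_bqsr` function and the `DRY_RUN`, `CPU`, and `MEM` environment variables are hypothetical conveniences, not features of the pipeline. Only the `--` options themselves come from the parameter tables, and the fallback values mirror the documented defaults.

```shell
# Hypothetical wrapper around the documented command line.
# Arguments: $1 input folder, $2 reference fasta, $3 SNP VCF, $4 indel VCF,
#            $5 output folder (optional, defaults to "." as in the table).
# CPU and MEM fall back to the documented defaults (2 and 32).
run_bqsr() {
  set -- nextflow run iarcbioinfo/BQSR-nf \
    --input_folder "$1" --ref "$2" \
    --snp_vcf "$3" --indel_vcf "$4" \
    --output_folder "${5:-.}" --cpu "${CPU:-2}" --mem "${MEM:-32}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"   # print the assembled command instead of running it
  else
    "$@"        # launch the pipeline
  fi
}
```

Setting `DRY_RUN=1` before calling `run_bqsr bam ref.fa snps.vcf.gz indels.vcf.gz results` prints the assembled command, which is a cheap way to review the arguments before committing compute time.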


| Type | Description |
|------|-------------|
| BAM/file.bam | BAM files of alignments or realignments |
| BAM/file.bam.bai | BAI files of alignments or realignments |
| QC/multiqc_BQSR_report.html | multiqc report |
| QC/multiqc_BQSR_report_data | folder with data used to compute multiqc report |
| QC/BAM/BQSR/file_recal.table | table of scores before recalibration |
| QC/BAM/BQSR/file_BQSRecalibrated_recal.table | table of scores after recalibration |
| QC/BAM/BQSR/file_recalibration_plots.pdf | before/after recalibration plots |

The output_folder directory contains two subfolders: BAM and QC.
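
After a run, the layout described above can be checked with a few lines of shell. The sketch below assumes that `file` in the output table stands for the stem of each input BAM file name; that reading is an assumption, and both function names are hypothetical.

```shell
# List the outputs named in the table for one sample.
# $1: output folder, $2: BAM file stem (assumed to replace "file" in the table).
expected_outputs() {
  printf '%s\n' \
    "$1/BAM/${2}.bam" \
    "$1/BAM/${2}.bam.bai" \
    "$1/QC/multiqc_BQSR_report.html" \
    "$1/QC/multiqc_BQSR_report_data" \
    "$1/QC/BAM/BQSR/${2}_recal.table" \
    "$1/QC/BAM/BQSR/${2}_BQSRecalibrated_recal.table" \
    "$1/QC/BAM/BQSR/${2}_recalibration_plots.pdf"
}

# Report every expected path that does not exist; prints nothing when all are present.
check_outputs() {
  expected_outputs "$1" "$2" | while IFS= read -r path; do
    [ -e "$path" ] || echo "missing: $path"
  done
}
```

For example, `check_outputs . sample1` would report any of the seven paths that are absent for a hypothetical input named sample1.bam written to the default output folder.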

Directed Acyclic Graph



| Name | Email | Description |
|------|-------|-------------|
| Nicolas Alcala* | [email protected] | Developer to contact for support |