[VS-693] Add support for VQSR Lite to GvsCreateFilterSet #8157

Merged
merged 24 commits into from
Feb 2, 2023
Changes from 20 commits
Commits
ccc68f8
Added a new suite of tools for variant filtering based on site-level …
samuelklee Aug 9, 2022
78fdd8d
Added toggle for selecting resource-matching strategies and miscellan…
samuelklee Oct 11, 2022
c95945c
Adding use_allele_specific_annotation arg and fixing task with empty …
meganshand Sep 22, 2022
3b61d7c
first stab
rsasch Oct 17, 2022
fc1c11e
wire through WDL changes
rsasch Oct 18, 2022
18c33de
Merge branch 'ah_var_store' into rsa_vqsr_lite_poc
rsasch Oct 18, 2022
19fa6c3
fixed typo
rsasch Oct 18, 2022
971d82f
set model_backend input value
rsasch Oct 18, 2022
b888432
add gatk_override to JointVcfFiltering call
rsasch Oct 18, 2022
86d63c0
typo in indel_annotations
rsasch Oct 18, 2022
1b1ebf5
make model_backend optional
rsasch Oct 19, 2022
6b5c879
tabs and spaces
rsasch Oct 19, 2022
72a9d93
make all model_backends optional
rsasch Oct 19, 2022
d525fca
use gatk 4.3.0
rsasch Oct 19, 2022
917d1b7
no point in changing the table names as this is a POC
rsasch Oct 20, 2022
3c19b50
Merge branch 'ah_var_store' into rsa_vqsr_lite_poc
rsasch Oct 24, 2022
7e7c74f
adding new branch to dockstore
koncheto-broad Jan 4, 2023
b178fe5
adding in branching logic for classic VQSR vs VQSR-Lite
koncheto-broad Jan 6, 2023
95cad2e
implementing the separate schemas for the VQSR vs VQSR-Lite branches,…
koncheto-broad Jan 9, 2023
9c39f73
passing classic flag to indel run of CreateFilteringFiles
koncheto-broad Jan 10, 2023
3320837
Update GvsCreateFilterSet.wdl
koncheto-broad Jan 17, 2023
bfb1a28
Removed mapping error rate from estimate of denoised copy ratios outp…
samuelklee Jun 11, 2021
fac9ef0
Merge branch 'ah_var_store' into VS-693_VQSR_lite
koncheto-broad Jan 24, 2023
4b9fa62
cleanup up sloppy comment
koncheto-broad Jan 30, 2023
2 changes: 2 additions & 0 deletions .dockstore.yml
@@ -95,6 +95,8 @@ workflows:
       branches:
         - master
         - ah_var_store
+        - rsa_vqsr_lite_poc
+        - VS-693_VQSR_lite
   - name: GvsPopulateAltAllele
     subclass: WDL
     primaryDescriptorPath: /scripts/variantstore/wdl/GvsPopulateAltAllele.wdl
8 changes: 7 additions & 1 deletion .github/workflows/gatk-tests.yml
@@ -291,7 +291,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        wdlTest: [ 'RUN_CNV_GERMLINE_COHORT_WDL', 'RUN_CNV_GERMLINE_CASE_WDL', 'RUN_CNV_SOMATIC_WDL', 'RUN_M2_WDL', 'RUN_CNN_WDL' ]
+        wdlTest: [ 'RUN_CNV_GERMLINE_COHORT_WDL', 'RUN_CNV_GERMLINE_CASE_WDL', 'RUN_CNV_SOMATIC_WDL', 'RUN_M2_WDL', 'RUN_CNN_WDL', 'RUN_VCF_SITE_LEVEL_FILTERING_WDL' ]
     continue-on-error: true
     name: WDL test ${{ matrix.wdlTest }} on cromwell
     steps:
@@ -349,3 +349,9 @@ jobs:
         run: |
           echo "Running CNN WDL";
           bash scripts/cnn_variant_cromwell_tests/run_cnn_variant_wdl.sh;
+
+      - name: "VCF_SITE_LEVEL_FILTERING_WDL_TEST"
+        if: ${{ matrix.wdlTest == 'RUN_VCF_SITE_LEVEL_FILTERING_WDL' }}
+        run: |
+          echo "Running VCF Site Level Filtering WDL";
+          bash scripts/vcf_site_level_filtering_cromwell_tests/run_vcf_site_level_filtering_wdl.sh;
1 change: 1 addition & 0 deletions build.gradle
@@ -293,6 +293,7 @@ dependencies {

     implementation 'org.apache.commons:commons-lang3:3.5'
     implementation 'org.apache.commons:commons-math3:3.5'
+    implementation 'org.hipparchus:hipparchus-stat:2.0'
     implementation 'org.apache.commons:commons-collections4:4.1'
     implementation 'org.apache.commons:commons-vfs2:2.0'
     implementation 'org.apache.commons:commons-configuration2:2.4'
1 change: 1 addition & 0 deletions scripts/gatkcondaenv.yml.template
@@ -42,6 +42,7 @@ dependencies:
 - conda-forge::matplotlib=3.2.1
 - conda-forge::pandas=1.0.3
 - conda-forge::typing_extensions=4.1.1 # see https://github.com/broadinstitute/gatk/issues/7800 and linked PRs
+- conda-forge::dill=0.3.4 # used for pickling lambdas in TrainVariantAnnotationsModel

 # core R dependencies; these should only be used for plotting and do not take precedence over core python dependencies!
 - r-base=3.6.2
294 changes: 185 additions & 109 deletions scripts/variantstore/wdl/GvsCreateFilterSet.wdl

Large diffs are not rendered by default.

9 changes: 9 additions & 0 deletions scripts/vcf_site_level_filtering_cromwell_tests/README.md
@@ -0,0 +1,9 @@
# Filtering Automated Tests for WDL

**This directory is for GATK devs only**

This directory contains scripts for running Variant Site Level WDL tests in the automated travis build environment.

Please note that this only tests whether the WDL will complete successfully.

Test data is a "plumbing test" using a small portion of a 10 sample callset.
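
A plumbing test in this sense passes or fails on exit status alone. The following is a rough sketch of that idea, not part of the PR: `run_workflow` is a hypothetical stand-in for the Cromwell invocation, and `plumbing_test` is an illustrative helper name.

```shell
#!/bin/bash
set -eu

# Hypothetical stand-in for the workflow run: exits with the status it is given.
run_workflow() {
    return "$1"
}

# Hypothetical helper: a plumbing test checks only whether the run completed,
# i.e. whether the exit status was zero; it never inspects the outputs.
plumbing_test() {
    if run_workflow "$1"; then
        echo "PASS"
    else
        echo "FAIL"
    fi
}

ok_result=$(plumbing_test 0)
bad_result=$(plumbing_test 1)
echo "$ok_result $bad_result"
```

Because the failing call sits inside an `if` condition, `set -e` does not abort the sketch; in the real test script there is no such guard, so a non-zero exit from Cromwell fails the job.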
38 changes: 38 additions & 0 deletions scripts/vcf_site_level_filtering_cromwell_tests/run_vcf_site_level_filtering_wdl.sh
@@ -0,0 +1,38 @@
#!/bin/bash -l
set -e
#cd in the directory of the script in order to use relative paths
script_path=$( cd "$(dirname "${BASH_SOURCE}")" ; pwd -P )
cd "$script_path"

WORKING_DIR=/home/runner/work/gatk

set -e
echo "Building docker image for VCF Site Level Filtering WDL tests (skipping unit tests)..."

#assume Dockerfile is in root
echo "Building docker without running unit tests... ========="
cd $WORKING_DIR/gatk

# IMPORTANT: This code is duplicated in the cnv and M2 WDL test.
if [ ! -z "$CI_PULL_REQUEST" ]; then
HASH_TO_USE=FETCH_HEAD
sudo bash build_docker.sh -e ${HASH_TO_USE} -s -u -d $PWD/temp_staging/ -t ${CI_PULL_REQUEST};
echo "using fetch head:"$HASH_TO_USE
else
HASH_TO_USE=${CI_COMMIT}
sudo bash build_docker.sh -e ${HASH_TO_USE} -s -u -d $PWD/temp_staging/;
echo "using travis commit:"$HASH_TO_USE
fi
echo "Docker build done =========="

cd $WORKING_DIR/gatk/scripts/
sed -r "s/__GATK_DOCKER__/broadinstitute\/gatk\:$HASH_TO_USE/g" vcf_site_level_filtering_cromwell_tests/vcf_site_level_filtering_travis.json >$WORKING_DIR/vcf_site_level_filtering_travis.json
echo "JSON FILES (modified) ======="
cat $WORKING_DIR/vcf_site_level_filtering_travis.json
echo "=================="


echo "Running Filtering WDL through cromwell"
ln -fs $WORKING_DIR/gatk/scripts/vcf_site_level_filtering_wdl/JointVcfFiltering.wdl
cd $WORKING_DIR/gatk/scripts/vcf_site_level_filtering_wdl/
java -jar $CROMWELL_JAR run JointVcfFiltering.wdl -i $WORKING_DIR/vcf_site_level_filtering_travis.json
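
The script above picks an image tag from the CI environment and substitutes it for the `__GATK_DOCKER__` placeholder in the inputs JSON. A minimal, self-contained sketch of that pattern follows. The helper names `select_hash` and `render_inputs`, the one-line template, and the sample values `8157` and `abc123` are hypothetical, and the sketch uses `|` as the sed delimiter, which differs from the script's own escaped-slash expression.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: pick the image tag the way the script above does,
# FETCH_HEAD for pull-request builds, otherwise the CI commit hash.
select_hash() {
    if [ -n "${CI_PULL_REQUEST:-}" ]; then
        echo "FETCH_HEAD"
    else
        echo "${CI_COMMIT:-}"
    fi
}

# Hypothetical helper: fill the __GATK_DOCKER__ placeholder in a template file.
# '|' is the sed delimiter, so the '/' in the image name needs no escaping.
render_inputs() {
    sed "s|__GATK_DOCKER__|broadinstitute/gatk:$2|g" "$1"
}

# Demo on a throwaway one-line template, not the real test JSON.
template=$(mktemp)
printf '{"JointVcfFiltering.gatk_docker": "__GATK_DOCKER__"}\n' > "$template"

# Each command substitution runs in a subshell, so the assignments stay local.
pr_hash=$(CI_PULL_REQUEST=8157; CI_COMMIT=abc123; select_hash)
commit_hash=$(CI_PULL_REQUEST=; CI_COMMIT=abc123; select_hash)
rendered=$(render_inputs "$template" "$commit_hash")
rm -f "$template"

echo "$pr_hash"
echo "$rendered"
```

The `-n` test is equivalent to the script's `! -z`, and the `${VAR:-}` defaults keep the sketch safe under `set -u` when a CI variable is unset.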
14 changes: 14 additions & 0 deletions scripts/vcf_site_level_filtering_cromwell_tests/vcf_site_level_filtering_travis.json
@@ -0,0 +1,14 @@
{
"JointVcfFiltering.gatk_docker": "__GATK_DOCKER__",
"JointVcfFiltering.vcf": ["/home/runner/work/gatk/gatk/src/test/resources/large/filteringJointVcf/test_10_samples.22.avg.vcf.gz",
"/home/runner/work/gatk/gatk/src/test/resources/large/filteringJointVcf/test_10_samples.23.avg.vcf.gz"],
"JointVcfFiltering.vcf_index": ["/home/runner/work/gatk/gatk/src/test/resources/large/filteringJointVcf/test_10_samples.22.avg.vcf.gz.tbi",
"/home/runner/work/gatk/gatk/src/test/resources/large/filteringJointVcf/test_10_samples.23.avg.vcf.gz.tbi"],
"JointVcfFiltering.sites_only_vcf": "/home/runner/work/gatk/gatk/src/test/resources/large/filteringJointVcf/test_10_samples.sites_only.vcf.gz",
"JointVcfFiltering.sites_only_vcf_index": "/home/runner/work/gatk/gatk/src/test/resources/large/filteringJointVcf/test_10_samples.sites_only.vcf.gz.tbi",
"JointVcfFiltering.basename": "test_10_samples",
"JointVcfFiltering.snp_annotations": "-A ReadPosRankSum -A FS -A SOR -A QD -A AVERAGE_TREE_SCORE -A AVERAGE_ASSEMBLED_HAPS -A AVERAGE_FILTERED_HAPS",
"JointVcfFiltering.indel_annotations": "-A MQRankSum -A ReadPosRankSum -A FS -A SOR -A QD -A AVERAGE_TREE_SCORE",
"JointVcfFiltering.model_backend": "PYTHON_IFOREST",
"JointVcfFiltering.use_allele_specific_annotations": false
}