Restore 3 Gnarly tests #8892

Closed

wants to merge 910 commits
Conversation

ldgauthier
Contributor

Accidentally turned off 3 Gnarly tests, which haven't been run since #8741

RoriCremer and others added 30 commits March 13, 2023 14:30
* deleted VDS

* only one left
…tion of Delta (#8205)

* Lees name

* add vds validation script written by Tim

* fix rd tim typo

* make sure temp dir is set and not default for validate()

* swap to consistent kebab case

Co-authored-by: Miguel Covarrubias <[email protected]>

* clean up validation

* put init in the right place

* add proper example to notes

* update code formatting

* update review

---------

Co-authored-by: Miguel Covarrubias <[email protected]>
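
The validation commits above mention adding a VDS validation script, pinning Hail's temp dir rather than relying on the default, and moving the CLI flags to kebab case. A minimal sketch of what such a script might look like using the public Hail VDS API; the function name and checks are illustrative, not the real script:

```python
# Hypothetical sketch of a VDS validation entry point; the real script
# in scripts/variantstore is more thorough.
import argparse
import hail as hl

def validate_vds(vds_path: str, temp_path: str) -> None:
    # Pin Hail to an explicit temp dir -- per the commit, relying on the
    # default temp dir for validate() caused problems.
    hl.init(tmp_dir=temp_path)
    vds = hl.vds.read_vds(vds_path)
    # Basic sanity checks only; stand-ins for the real validation logic.
    n_samples = vds.variant_data.count_cols()
    if n_samples == 0:
        raise ValueError(f"VDS at {vds_path} contains no samples")
    print(f"VDS at {vds_path} has {n_samples} samples")

if __name__ == "__main__":
    # Kebab-case flags, per the "swap to consistent kebab case" commit.
    parser = argparse.ArgumentParser()
    parser.add_argument("--vds-path", required=True)
    parser.add_argument("--temp-path", required=True)
    args = parser.parse_args()
    validate_vds(args.vds_path, args.temp_path)
```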
* Don't run gatk tests when the only changes in a commit are in the scripts/variantstore directory.
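
One way a CI step could implement that skip, sketched in Python; the repository's actual check lives in its CI configuration, and the base ref here is an assumption:

```python
# Illustrative only: skip GATK tests when a commit touches nothing
# outside scripts/variantstore.
import subprocess

def only_variantstore_changes(base_ref: str = "origin/master") -> bool:
    changed = subprocess.run(
        ["git", "diff", "--name-only", base_ref, "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    return bool(changed) and all(
        path.startswith("scripts/variantstore/") for path in changed
    )

if __name__ == "__main__":
    if only_variantstore_changes():
        print("Only scripts/variantstore changed; GATK tests can be skipped.")
```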
* laying framework for FOFN bulk import code

* adding in terra notebook utils code

* updating wdl

* updating environment variables to make this work better

* quotey McBetterQuotes

* extra environment variables

* normalizing variable name with other wdls that require it

* gotta explicitly set WORKSPACE_NAMESPACE to the google project as well.  Apparently.

* typoooooooooooooooooo

* Didn't pipe the output files the entire way up

* whoopsie

* typo

* three updates after testing (see the sketch after this commit block):
1. We do NOT want to assume that the sample ids we want are in the name field. Pass that through as a parameter.
2. We want to explicitly pause every 500 samples, since that's our page size. It slows our requests down enough to avoid spamming the backend server and hitting 503 errors, although it also slows the rate at which we can write the files if the dataset is very big. That shouldn't be a concern: as long as it doesn't cause errors, it's still a hands-off process.
3. We want to account for heterogeneous data. In AoU Delta, for instance, the control samples keep their vcf and vcf_index data in a different field. This would cause the whole thing to fail if we weren't accounting for it explicitly; we now generate an errors.txt file that holds any row for which we couldn't find the correct columns, so they can be examined later.

* silly mistake copying the functioning code over from the workbook

* making script more robust against specifying imaginary columns in the data table and being slightly more informative in the output of the python script

* increasing the size of the disk this is running on for the sake of efficiency (and handling larger callsets)

* Passing errors up

* update params

* short term testing (rate lim)

* make it only 25 shards!

* add workspace id scraping

* add workspace id scraping fixup

* this is not functioning--need to curl in the wdl

* clean up vcfs so we don't run out of space

* add duplicates test to the shard loading

* clean up namespace prep
---------

Co-authored-by: Aaron Hatcher <[email protected]>
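
The bulk-import commits above describe paging through the workspace data table 500 samples at a time, pausing between pages to avoid 503s, and routing rows whose vcf/vcf_index live in unexpected fields to an errors.txt file. A rough sketch of that control flow; field names, the sleep interval, and the helper itself are guesses, not the real script:

```python
# Rough sketch of the FOFN-building loop described above; the real code
# drives terra-notebook-utils / the Rawls API and differs in detail.
import time

PAGE_SIZE = 500  # the page size called out in the commit message

def build_fofns(rows, sample_id_field, vcf_fields=("vcf", "control_vcf")):
    """rows: iterable of dicts from the workspace data table."""
    sample_ids, vcf_paths, errors = [], [], []
    for i, row in enumerate(rows):
        # Heterogeneous data: e.g. AoU Delta control samples keep their
        # vcf data in a different field, so try each candidate column.
        vcf = next((row[f] for f in vcf_fields if row.get(f)), None)
        if vcf is None or not row.get(sample_id_field):
            errors.append(str(row))  # kept for later examination
            continue
        sample_ids.append(row[sample_id_field])
        vcf_paths.append(vcf)
        if (i + 1) % PAGE_SIZE == 0:
            time.sleep(5)  # slow down enough to avoid 503s from the backend
    with open("errors.txt", "w") as fh:
        fh.write("\n".join(errors))
    return sample_ids, vcf_paths
```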
* Use the annotation 'AS_MQ' for indels.
Remove the unneeded SCORE field from the filter_set_info_vqsr table (#8278)

* Remove the unneeded SCORE field from the filter_set_info_vqsr table
* Updated the docker images.
* add queries for testing mismatched sites and variants across possible duplicates

* still need to wire these through

* plumb thru dup validation

* dockstore for testing

* update docker

* add xtrace

* better bool logic

* clean up bash

* okay, let's try ripping shit out to get this to work

* okay, let's put a few lines back

* ok, that worked; let's swap for better errors

* short term remove clinvar

* review changes

* update docker

* explain removal of clinvar test
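
The duplicate-validation commits above wire "possible duplicates" queries through the WDL. A sketch of what such a query could look like with the BigQuery client; the project, dataset, and table names are placeholders rather than the real GVS schema:

```python
# Illustrative duplicate-row check; table layout is assumed, not the
# actual GVS schema.
from google.cloud import bigquery

def find_duplicate_rows(project: str, dataset: str, table: str):
    client = bigquery.Client(project=project)
    query = f"""
        SELECT sample_id, location, COUNT(*) AS n
        FROM `{project}.{dataset}.{table}`
        GROUP BY sample_id, location
        HAVING COUNT(*) > 1
    """
    return list(client.query(query).result())
```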
* Adding tests for ExtractCohortLite.
* Simple fix to have the header of the VAT tsv to use tab characters.
* Updated to latest version of VQSR Lite (from Master)
* Ported tests and files for VQSR Lite over
* Refactored VQSR Classic code into its own WDL

* Add support for VQSR Lite to GvsExtractCohortFromSampleNames.wdl
* Remove obsolete gatk override jar
* add python script to our repo

* use the new python script!

* remove whl from integration test

* move script location for testing

* remove the damn wheel!

* add the replacement hail script

* proper renaming

* update docker
* Refactoring of ExtractCohortLite into ExtractCohort.
Update override jar to fix support issue. (#8312)

* Update override jar to fix support issue.
* Fix bug in VCF Integration test
* add Aaron's changes

* put terra token in python

* id not bucket

* hardcode for testing

* do we need a new docker image?

* set workspace info

* pull in name from rawls

* pass output locations

* add back prepare

* add GvsImportGenomes back

* update python for grabbing cols

* split methods for easier testing

* set defaults, but allow optional overrides for sample table and id

* add unit test for python column guessing

* clean up python for testing

* add proper docker

* is this where the loop is coming from?

* better names

* remove testing artifact

* add back problem lines to the test

* throw out columns with values other than strings

* set defaults in the right place
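
The "column guessing" commits above set defaults for the sample table and id while allowing overrides, and throw out columns whose values aren't strings. A hedged sketch of that heuristic; the function name, suffixes, and error handling are illustrative:

```python
# Hypothetical column-guessing helper; the real script's heuristics and
# defaults differ.
def guess_vcf_columns(rows, vcf_suffix=".vcf.gz", index_suffix=".tbi"):
    if not rows:
        raise ValueError("no rows to inspect")
    # Throw out columns with values other than strings, per the commits.
    string_cols = {k: v for k, v in rows[0].items() if isinstance(v, str)}
    vcf_col = next(
        (k for k, v in string_cols.items() if v.endswith(vcf_suffix)), None)
    index_col = next(
        (k for k, v in string_cols.items() if v.endswith(index_suffix)), None)
    if vcf_col is None or index_col is None:
        raise ValueError("could not guess vcf/vcf_index columns")
    return vcf_col, index_col
```

A unit test of the kind the commits mention could feed it a fake row, e.g. guess_vcf_columns([{"sample_id": "s1", "vcf": "a.vcf.gz", "vcf_index": "a.vcf.gz.tbi", "count": 3}]) and assert it returns ("vcf", "vcf_index").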
gbggrant and others added 27 commits May 8, 2024 06:49
* Optionally extract to bgz format.
* Set bgzipping to be off (everywhere) by default.
* Update assert_identical_outputs to handle bgzipped outputs.
* Have GvsAssignIds.wdl validate that input sample names (in the provided input file) are unique.
* Compressing the tarball saves a bit.
* Remove unused contigs from interval_list files by grepping.
---------

Co-authored-by: Miguel Covarrubias <[email protected]>
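
The GvsAssignIds change above validates that the sample names in the provided input file are unique. A minimal sketch of such a check; the real validation happens inside the WDL task and may report errors differently:

```python
# Minimal uniqueness check over a sample-names file, one name per line.
from collections import Counter

def assert_unique_sample_names(path: str) -> None:
    with open(path) as fh:
        names = [line.strip() for line in fh if line.strip()]
    dupes = [name for name, n in Counter(names).items() if n > 1]
    if dupes:
        raise ValueError(f"duplicate sample names in {path}: {dupes}")
```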
* Change extract so that when we filter at the genotype level (with FT) the VCF header has the filter definition in the FORMAT field.
* Also minor renaming of ExtractCohort argument.
* Point to updated truth.
* Add ValidateVariants to our tests.
* Bringing in Rori's change to add EXCESS_ALLELES to VCF Headers.
* Updated truth path.
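
The FT change above means the genotype-level filter's definition now has to be declared in the VCF header as a FORMAT field. A sketch of what that header line looks like, using pysam for illustration (the actual extract writes its headers in GATK's Java code, and the exact Description text is assumed):

```python
# Sketch: declaring a genotype-level FT field in a VCF header.
import pysam

header = pysam.VariantHeader()
header.add_line(
    '##FORMAT=<ID=FT,Number=1,Type=String,'
    'Description="Genotype-level filter">'
)
print(str(header))  # includes the FORMAT=<ID=FT,...> definition
```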
* remove the field 'yng_status' from the variant_data as_vqsr status dict of structs.
* Have GvsCreateVATfromVDS.wdl take sites-only-vcf as an optional input.
* Added logic to allow/disallow CopyFile to overwrite.
* checkpointing here to switch branches

* locally working first pass at adding in the ploidy info.  Still needs to have the arguments passed through so it works in the WDLs

* Propagating changes up through the wdl

* Stupid WDL substitution mistake

* On a roll with WDL today wheeeeee

* Cleaning up slightly

* PR feedback

* PR feedback v2: Ploidy Boogaloo
* Fix Chromosome Encoding used in pgen merge.
@ldgauthier ldgauthier closed this Jun 25, 2024
@ldgauthier ldgauthier deleted the ldg_turnOnGnarlyTests branch June 25, 2024 17:19

gatk-bot commented Jun 25, 2024

GitHub Actions tests reported job failures from Actions build 9666825011.
Failures in the following jobs:

Test Type   JDK         Job ID          Logs
unit        17.0.6+10   9666825011.12   logs
unit        17.0.6+10   9666825011.1    logs
