Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create WDL to validate VAT and add first test #7352

Merged
merged 11 commits into from
Jul 20, 2021
Merged

Conversation

rsasch
Copy link

@rsasch rsasch commented Jul 15, 2021

The idea is that this WDL will run all the checks for each release of the VAT table, one call for each validation. The first validation rule ("Validation Check confirms that data is put into the VAT table after completing without an error.") is included as a model for subsequent calls.

  • workflow should succeed if it's able to try all tests
  • workflow output validation_results will contain details of each test result in an array of {"testName": "result details"}:
    Example 1 — fail
    { "EnsureVatTableHasVariants": "FAIL: The VAT table spec-ops-aou.rsa_gvs_quickstart.rsa_scratch has no variants in it." }
    Example 2 — pass
    { "EnsureVatTableHasVariants": "PASS: The VAT table spec-ops-aou.anvil_100_for_testing.aou_shard_223_vat has 294821 variants in it." },
    Example 3 — the test wasn't able to run
    { "EnsureVatTableHasVariants": "Something went wrong. The attempt to count the variants returned: Error in query string: Error processing job 'spec-ops- aou:bqjob_r357c4b6fe6b0c6fb_0000017aac301de7_1': Unrecognized name: vid at [1:24]" }

Closes https://github.com/broadinstitute/dsp-spec-ops/issues/364

@kcibul
Copy link
Contributor

kcibul commented Jul 20, 2021

Hey @rsasch -- Just thinking out loud here… but does WDL seem like the right "language" to implement these sort of checks? It just feels like there is a ton of boilerplate, the WDL/bash constructs are a bit hard to follow, and I'm sure the development cycles were pretty slow waiting for cromwell to spin up all those VMs, etc.

An alternative would be to write all these validations as a python script, which could make better use of structure (ie authenticating once, etc) and I bet would be quite a bit more readable. You could also run it locally if needed for development, debugging and then ultimately wrap it into a single-task WDL so it could be run easily from Terra.

Happy to chat more

@rsasch
Copy link
Author

rsasch commented Jul 20, 2021

@kcibul My reasoning for doing it in WDL is to better integrate with the process that creates the VAT table, and therefore make sure that the person running it has access (and knows the location of) not only to the VAT table but also the intermediary steps (e.g. the annotation JSON files that are output from NIRVANA). Not all of the validation steps need to be all bash; the first one was because it's literally just a call to make sure a table exists and has rows with vid values in it. Other rules (e.g. rule #2) will most likely need either python or jq to run.

@rsasch
Copy link
Author

rsasch commented Jul 20, 2021

Discussed with @kcibul (and others) offline. Conclusion: the majority of the validation code might end up being non-WDL with just one or two WDL tasks that call pthyon....etc., but it will be divided up into separate WDL tasks for now for development, and once it's done (or mostly) done, it can always be restructured to have the code in whatever place/structure makes the most sense going forward.

@rsasch rsasch merged commit 1592e65 into ah_var_store Jul 20, 2021
@rsasch rsasch deleted the rsa_add_vat_val_1 branch July 20, 2021 17:48
This was referenced Mar 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants