Add VAT validation rule #5 [VS-16] #7365

Merged
merged 8 commits into from
Jul 28, 2021

@rsasch rsasch commented Jul 23, 2021

Closes https://broadworkbench.atlassian.net/browse/VS-16 by adding SQL for validation rule #5:

There is a non-zero number of transcript fields with null values in the VAT.

@@ -245,7 +258,7 @@ task SchemaOnlyOneRowPerNullTranscript {
 transcript_source is NULL AND
 transcript is NULL
 GROUP BY vid
-HAVING num_rows = 1' > bq_variant_count.csv
+HAVING num_rows > 1' > bq_variant_count.csv

nice catch!

nit: while you are here, there's a tiny typo below:
"# if the result of the query has any rows, that means there were vids will null transcripts and multiple"
I think "will" should be "with"?
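
For readers following the thread, here is a minimal sketch of why the `HAVING` condition in the hunk above matters. It is not the task's actual code: it uses Python's built-in `sqlite3` instead of BigQuery, and the table name `vat`, the toy rows, and the `vid` values are all hypothetical. It assumes, as the quoted comment suggests, that any row returned by the query is meant to signal a vid with null transcripts and multiple rows.

```python
import sqlite3

# Hypothetical toy rows: (vid, transcript_source, transcript).
ROWS = [
    ("1-100-A-T", None, None),                    # one null-transcript row: fine
    ("1-200-G-C", None, None),                    # two null-transcript rows: a violation
    ("1-200-G-C", None, None),
    ("1-300-T-G", "Ensembl", "ENST00000000001"),  # has a transcript: not considered
]

# Same shape as the query in the hunk; only the HAVING condition varies.
QUERY = """
SELECT vid, COUNT(*) AS num_rows
FROM vat
WHERE transcript_source IS NULL AND transcript IS NULL
GROUP BY vid
HAVING num_rows {condition}
"""


def flagged_vids(condition):
    """Return the vids the query reports for a given HAVING condition."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vat (vid TEXT, transcript_source TEXT, transcript TEXT)"
    )
    conn.executemany("INSERT INTO vat VALUES (?, ?, ?)", ROWS)
    vids = [vid for vid, _ in conn.execute(QUERY.format(condition=condition))]
    conn.close()
    return sorted(vids)


old_result = flagged_vids("= 1")  # condition before the change
new_result = flagged_vids("> 1")  # condition after the change
print(old_result)
print(new_result)
```

With `= 1` the query reports the well-formed vid and misses the duplicated one; with `> 1` it reports only the vid that has more than one null-transcript row, which is what a "fail if any rows come back" check needs.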

@RoriCremer RoriCremer left a comment

What does the 'import "GvsWarpTasks.wdl" as Tasks' do?

@rsasch rsasch marked this pull request as draft July 24, 2021 01:51

rsasch commented Jul 24, 2021

converting to draft because SchemaNonzeroAcAn is failing

@RoriCremer

Ah, this is what I get for making an example inputs file when we really didn't need one, and for choosing the full AoU 1k release (vat_kc_vat_1) as the default. I wanted to run the validations here because it is the largest dataset and it is the AoU data (not just Anvil data), but it has no values for gvs_all_ac or gvs_all_an yet because that step wasn't implemented when the table was created. (Validation #9 was added by Lee recently.) The table vat_jul18 does have those values, since it was created just last week, but it may get cleaned up. So this might be a good best-practices question: what should we run this on in the future, if there is ever an automated version?

@rsasch rsasch marked this pull request as ready for review July 28, 2021 15:09
@rsasch rsasch merged commit 3b066a4 into ah_var_store Jul 28, 2021
@rsasch rsasch deleted the rsa_add_vat_val_5 branch July 28, 2021 17:31
This was referenced Mar 17, 2023