Document AoU SOP (up to the VAT) [VS-63] #7807

Merged
merged 6 commits into from
Apr 26, 2022

@rsasch rsasch commented Apr 25, 2022

No description provided.

4. the "cohort_extract_table_prefix" input from `GvsExtractCallset` step
5. the "filter_set_name" input from `GvsCreateFilterSet` step
## Prerequisites
- If this is the first time running the GVS pipeline in a particular Google billing project, use your GCP account team to create a support ticket for the BigQuery team that includes "enabling cluster metadata pruning support for the BQ Read API." This enables a pre-GA feature that dramatically reduces the amount of data scanned, reducing both cost and runtime.

Collaborator

"enabling cluster metadata pruning support for the BQ Read API" means autopoking?

Author

Actually, this is a separate issue (otherwise referred to as "whitelisting"). I need to confirm that this has been resolved.

Collaborator

What is 'pre-GA'?

Author

It's a term that Google uses; I believe "GA" means "General Availability", so "pre-GA" refers to a feature that has not yet been generally released.


codecov bot commented Apr 25, 2022

Codecov Report

❗ No coverage uploaded for pull request base (ah_var_store@614a0f7).
The diff coverage is n/a.

@@               Coverage Diff                @@
##             ah_var_store     #7807   +/-   ##
================================================
  Coverage                ?   86.295%           
  Complexity              ?     35191           
================================================
  Files                   ?      2170           
  Lines                   ?    164837           
  Branches                ?     17775           
================================================
  Hits                    ?    142246           
  Misses                  ?     16265           
  Partials                ?      6326           

4. GvsCreateAltAllele
5. GvsCreateFilterSet (see [naming conventions doc](https://docs.google.com/document/d/1pNtuv7uDoiOFPbwe4zx5sAGH7MyxwKqXkyrpNmBxeow) for guidance on what to name the filter set, which you will need to keep track of for the `GvsExtractCallset` WDL).
6. GvsPrepareRangesCallset needs to be run twice, once with `control_samples` set to "true" (see [naming conventions doc](https://docs.google.com/document/d/1pNtuv7uDoiOFPbwe4zx5sAGH7MyxwKqXkyrpNmBxeow) for guidance on what to use for `extract_table_prefix` or cohort prefix, which you will need to keep track of for the `GvsExtractCallset` WDL).
7. GvsExtractCallset needs to be run twice, once with `control_samples` set to "true", and with the `filter_set_name` and `extract_table_prefix` from step 5 & 6. Include a valid (and secure) "output_gcs_dir" parameter, which is where the VCF and interval list files will go.

Collaborator

Suggested change
7. GvsExtractCallset needs to be run twice, once with `control_samples` set to "true", and with the `filter_set_name` and `extract_table_prefix` from step 5 & 6. Include a valid (and secure) "output_gcs_dir" parameter, which is where the VCF and interval list files will go.
7. GvsExtractCallset needs to be run twice, once with `control_samples` set to "true", and with the `filter_set_name` and `extract_table_prefix` from step 5 & 6. Include a valid (and secure) "output_gcs_dir" parameter, which is where the VCF and interval list files will go.
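
For reference, the two `GvsExtractCallset` runs described in step 7 could be prepared as a pair of inputs dictionaries, roughly as sketched below. This is only an illustration under several assumptions: the `WorkflowName.input_name` key format follows the usual WDL inputs JSON convention, the second run is assumed to use `control_samples` set to false, both runs are assumed to reuse the same `filter_set_name` and `extract_table_prefix`, and the helper names and example values (`build_extract_inputs`, `build_both_runs`, `example_filter_set`, `gs://example-bucket/...`) are hypothetical and not taken from the SOP.

```python
import json

# Workflow name as it appears in the SOP; used to qualify each input key.
WORKFLOW = "GvsExtractCallset"


def build_extract_inputs(filter_set_name, extract_table_prefix,
                         output_gcs_dir, control_samples):
    """Build one inputs dictionary for a single GvsExtractCallset run.

    Hypothetical helper: only the four input names mentioned in the SOP
    are set here; a real run may need additional inputs.
    """
    # The SOP asks for a valid GCS location, so reject anything else early.
    if not output_gcs_dir.startswith("gs://"):
        raise ValueError("output_gcs_dir must be a gs:// path")
    return {
        f"{WORKFLOW}.filter_set_name": filter_set_name,
        f"{WORKFLOW}.extract_table_prefix": extract_table_prefix,
        f"{WORKFLOW}.output_gcs_dir": output_gcs_dir,
        f"{WORKFLOW}.control_samples": control_samples,
    }


def build_both_runs(filter_set_name, extract_table_prefix, output_gcs_dir):
    """Return inputs for both runs: control_samples False, then True.

    Assumption: the run that is not the control-sample run uses False.
    """
    return [
        build_extract_inputs(filter_set_name, extract_table_prefix,
                             output_gcs_dir, control_samples=flag)
        for flag in (False, True)
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for inputs in build_both_runs("example_filter_set",
                                  "example_prefix",
                                  "gs://example-bucket/extract-output"):
        print(json.dumps(inputs, indent=2))
```

Generating both dictionaries from one function keeps the `filter_set_name` and `extract_table_prefix` from steps 5 and 6 identical across the two runs, which is the part that is easy to get wrong when editing inputs by hand.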

@rsasch rsasch merged commit ba7a26c into ah_var_store Apr 26, 2022
@rsasch rsasch deleted the rsa_aou_sop branch April 26, 2022 18:43
This was referenced Mar 17, 2023
4 participants