Vs 629 failure to retrieve job information during ingest #8047

koncheto-broad

Hardening the standard quickstart pipeline against errors when a location is specified.

codecov bot commented Oct 6, 2022

Codecov Report

❗ No coverage uploaded for pull request base (ah_var_store@01b2880).
The diff coverage is n/a.

❗ Current head 38e90c3 differs from pull request most recent head 07c6b83. Consider uploading reports for the commit 07c6b83 to get more accurate results.

Additional details and impacted files
@@               Coverage Diff                @@
##             ah_var_store     #8047   +/-   ##
================================================
  Coverage                ?   16.953%           
  Complexity              ?      4702           
================================================
  Files                   ?      1375           
  Lines                   ?     82247           
  Branches                ?     13121           
================================================
  Hits                    ?     13943           
  Misses                  ?     66245           
  Partials                ?      2059           

@koncheto-broad (Author)

This PR contains several distinct modifications needed to protect our pipeline against errors when a location is specified on a BQ dataset:

  1. Modifying several WDLs whose BQ CLI calls specified "--location=US". It turns out the flag is unnecessary and breaks things when the location is not US.
  2. Modifying two paths through BigQueryUtils to harden them against non-"US" locations, including explicitly passing the dataset id into getQueryCostBytesProcessedEstimate so that the dataset's location can be looked up and passed to the dry-run job.
  3. Cutting our reliance on bqutil being installed in the location where our queries run, by supplying a local version of "median" as a UDF (udf_media.sql, pulled in through changes to BigQueryUtils.java and referenced in feature_extract.sl).
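
For anyone checking the local "median" UDF against expected values, here is a minimal reference sketch in Python. It assumes the conventional definition (sort the values, take the middle one, or the mean of the two middle ones when the count is even). That definition is an assumption on my part, not a transcription of the SQL in this PR, and `median_of` is a hypothetical name used only for illustration.

```python
from typing import Sequence


def median_of(values: Sequence[float]) -> float:
    """Reference median (hypothetical helper, not part of the PR).

    Sorts the input, then returns the middle value for an odd count
    or the mean of the two middle values for an even count.
    """
    if not values:
        raise ValueError("median of an empty sequence is undefined")
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return float(ordered[mid])
    return (ordered[mid - 1] + ordered[mid]) / 2
```

Comparing a handful of small arrays through both this helper and the UDF is a cheap way to confirm the local version behaves the same regardless of the dataset's location.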

Only partially related: this PR also creates the script/variantstore/utils directory to hold useful scripts, along with the pushGATKtoGCS script for pushing jars to an easily referenced location for WDLs (h/t to Miguel).

The entire tragic history of successes and failures can be seen in the job history of the workspace: https://app.terra.bio/#workspaces/gvs-dev/GVS%20Tiny%20Quickstart%20hatcher/job_history

Every stage of the quickstart can be verified there to have passed (eventually, and only after the gods deemed my suffering sufficient).

@mcovarr (Collaborator) left a comment

Looks great! Only a couple of very minor nits.

@mcovarr (Collaborator) left a comment

One minor update requested.

scripts/variantstore/wdl/GvsAssignIds.wdl (outdated, resolved)
@koncheto-broad koncheto-broad merged commit 5f1f998 into ah_var_store Oct 11, 2022
@koncheto-broad koncheto-broad deleted the VS-629-failure-to-retrieve-job-information-during-ingest branch October 11, 2022 21:50
This was referenced Mar 17, 2023