WDL to extract Avro files for Hail import [VS-579] #7981

Merged
mcovarr merged 24 commits into ah_var_store from vs_579_vds_avro_wdl
Aug 10, 2022

@mcovarr mcovarr commented Aug 9, 2022

Successful run here.

codecov bot commented Aug 9, 2022

Codecov Report

❗ No coverage uploaded for pull request base (ah_var_store@798d4e8).
The diff coverage is n/a.

@@               Coverage Diff                @@
##             ah_var_store     #7981   +/-   ##
================================================
  Coverage                ?   86.247%           
  Complexity              ?     35205           
================================================
  Files                   ?      2173           
  Lines                   ?    165016           
  Branches                ?     17792           
================================================
  Hits                    ?    142321           
  Misses                  ?     16368           
  Partials                ?      6327           

@gbggrant gbggrant left a comment

Looks great to me. Example run anywhere?

mcovarr commented Aug 9, 2022

Successful run linked in the description.

gbggrant commented Aug 9, 2022

Thanks - missed that.

mcovarr commented Aug 9, 2022

I missed exporting the tranche data 🙈, fixes incoming

bq query --nouse_legacy_sql --project_id=~{project_id} "
EXPORT DATA OPTIONS(
uri='${avro_prefix}/vqsr_tranche/vqsr_tranche_*.avro', format='AVRO', compression='SNAPPY') AS
SELECT model, truth_sensitivity, min_vqslod, filter_name

I think we're going to want to drop filter_name here, but I'm adding it to the questions for Tim this afternoon

}

# Superpartitions have max size 4000. The inner '- 1' is so the 4000th (and multiples of 4000) sample lands in the
# appropriate partition, the outer '+ 1' is to iterate over the correct number of partitions.

This is the only thing that I'd want to see tested (before the tie out), and I'm not sure of the best way to do it. Maybe this can wait until we do this with 10k samples.
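
For reference, the arithmetic described in that code comment can be checked outside the workflow. Below is a minimal Python sketch, not the WDL from this PR. It assumes sample ids are 1-based and that the expression has the shape `(max_sample_id - 1) / 4000 + 1` with integer division; the function names are hypothetical and chosen only for illustration.

```python
# Sketch of the superpartition arithmetic from the comment under review.
# Assumptions (not confirmed by the PR): sample ids start at 1, and the
# workflow uses integer division. Function names here are hypothetical.

SUPERPARTITION_SIZE = 4000  # max samples per superpartition, per the comment


def superpartition_for_sample(sample_id: int) -> int:
    """Return the 1-based superpartition holding a 1-based sample id."""
    if sample_id < 1:
        raise ValueError("sample ids are assumed to start at 1")
    # The inner '- 1' keeps sample 4000 (and 8000, 12000, ...) in the
    # partition it fills, rather than spilling into the next one.
    return (sample_id - 1) // SUPERPARTITION_SIZE + 1


def superpartition_count(max_sample_id: int) -> int:
    """Return how many superpartitions must be iterated over."""
    # The outer '+ 1' converts the 0-based quotient into a count.
    return superpartition_for_sample(max_sample_id)


if __name__ == "__main__":
    for n in (1, 3999, 4000, 4001, 8000, 8001, 10000):
        print(f"max_sample_id={n:>5} -> {superpartition_count(n)} superpartition(s)")
```

The boundary cases (4000, 4001, 8000, 8001) are the ones worth checking: without the inner `- 1`, a cohort of exactly 4000 samples would be counted as needing two superpartitions. Under these assumptions a 10k-sample run would iterate over three.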

@mcovarr mcovarr merged commit 42a9382 into ah_var_store Aug 10, 2022
@mcovarr mcovarr deleted the vs_579_vds_avro_wdl branch August 10, 2022 21:06
This was referenced Mar 17, 2023
3 participants