Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VS-816 Keeping ingestion under quota #8193

Merged
merged 3 commits into from
Feb 9, 2023
Merged

VS-816 Keeping ingestion under quota #8193

merged 3 commits into from
Feb 9, 2023

Conversation

koncheto-broad
Copy link

@koncheto-broad koncheto-broad commented Feb 8, 2023

Limiting scattering size in ingest to keep beta customers under quota.

Successful run on NHGRI AnVIL dataset: https://app.terra.bio/#workspaces/gvs-dev/NHGRI_AnVIL_3K%20hatcher/job_history/a9c2a81b-d81c-4f7a-a433-4096ccc7b579

Quota behavior during successful run:
Screenshot 2023-02-08 at 3 37 36 PM

Next successful run after minor refactoring: https://app.terra.bio/#workspaces/gvs-dev/GVS%20Tiny%20Quickstart%20hatcher/job_history/8a34477f-af3b-4e19-8184-862ed1c2cba3

…t with different numbers (if quotas are increased for a customer) without having to modify our code
Copy link
Collaborator

@mcovarr mcovarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome, thanks for taking this on! 🎉 👍

scripts/variantstore/wdl/GvsImportGenomes.wdl Outdated Show resolved Hide resolved
else if num_samples < 1000 then 1
else num_samples / 1000
else if beta_customer_rate_limit then beta_effective_load_data_batch_size
else if num_samples < 1000 then 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the Broad limits can be structured similarly

  # Broad users enjoy higher quotas and can scatter more widely than beta users
  Int broad_max_scatter = 1000
  Int broad_effective_load_data_batch_size = if num_samples < broad_max_scatter then 1
                                             else num_samples / broad_max_scatter
  .
  .
  .
  Int effective_load_data_batch_size = if (defined(load_data_batch_size)) then select_first([load_data_batch_size])
                                       else if beta_customer_rate_limit then beta_effective_load_data_batch_size
                                            else broad_effective_load_data_batch_size

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a really good suggestion. To be honest, I was frustrated with my inability to have an if (beta_customer_rate_limit) {} block that selectively overwrote the computed value of effective_load_data_batch_size, so this is the cleanest way I could think to do it with my limited WDL experience. I was half expecting one of you to show me an easier way to handle the beta case

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've restructured the code in that portion to unify the two cases a lot more, and made the variable name for beta behavior a little more self-explanatory. I'm running it again (just in case some silly typo or something caused an issue), so I'll post the successful run tomorrow when I have it.

@codecov
Copy link

codecov bot commented Feb 8, 2023

Codecov Report

❗ No coverage uploaded for pull request base (ah_var_store@d07e773). Click here to learn what that means.
The diff coverage is n/a.

Additional details and impacted files
@@               Coverage Diff                @@
##             ah_var_store     #8193   +/-   ##
================================================
  Coverage                ?   86.219%           
  Complexity              ?     35519           
================================================
  Files                   ?      2191           
  Lines                   ?    166339           
  Branches                ?     17900           
================================================
  Hits                    ?    143416           
  Misses                  ?     16543           
  Partials                ?      6380           

@koncheto-broad koncheto-broad merged commit 0c48d6d into ah_var_store Feb 9, 2023
@koncheto-broad koncheto-broad deleted the VS-816 branch February 9, 2023 19:47
This was referenced Mar 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants