X/Y chromosome reweighting for better extract shard runtime balance [VS-389] #7868

Merged: mcovarr merged 22 commits into ah_var_store from vs_389_xy_reweighting on May 26, 2022

@mcovarr (Collaborator) commented May 24, 2022

No description provided.

@mcovarr changed the title from "Vs 389 xy reweighting" to "X/Y chromosome reweighting for better shard balance [VS-389]" on May 24, 2022
@mcovarr changed the title from "X/Y chromosome reweighting for better shard balance [VS-389]" to "X/Y chromosome reweighting for better extract shard runtime balance [VS-389]" on May 24, 2022
codecov bot commented May 24, 2022

Codecov Report

❗ No coverage uploaded for pull request base (ah_var_store@a4ac264).
The diff coverage is n/a.

@@               Coverage Diff                @@
##             ah_var_store     #7868   +/-   ##
================================================
  Coverage                ?   86.296%           
  Complexity              ?     35191           
================================================
  Files                   ?      2170           
  Lines                   ?    164876           
  Branches                ?     17784           
================================================
  Hits                    ?    142281           
  Misses                  ?     16271           
  Partials                ?      6324           

Float y_bed_weight_scaling
}
command <<<
python3 <<FIN
Collaborator Author

I should possibly make this a separate script with tests and add it to the Docker image.

@mcovarr mcovarr closed this May 24, 2022
@mcovarr mcovarr reopened this May 24, 2022
@@ -404,16 +404,6 @@ private static long getQueryCostBytesProcessedEstimate(String queryString, Strin
return bytesProcessed;
}

public static StorageAPIAvroReader executeQueryWithStorageAPI(final String queryString,
Collaborator Author

Drive-by removal of an unneeded method.

Collaborator

👍

scripts/variantstore/wdl/extract/scale_xy_bed_values.py (outdated)

if line.startswith('chrX') or line.startswith('chrY'):
    scale_factor = x_scale_factor if line.startswith('chrX') else y_scale_factor
    fields = line.split('\t')
Collaborator

Might want to validate that the BED line has five fields here, and throw an exception if not.

(I'm not sure whether a BED file can have optional columns; if it can, your fields[-1] might be updating the wrong field.)
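
A minimal sketch of what the suggested check could look like. It is not the code from this PR: the function name `scale_xy_bed_line`, the five-column layout with the weight in the last column, and rounding the scaled weight to an integer are all assumptions made for illustration.

```python
# Hypothetical helper; names and column layout are assumptions, not the PR's code.
EXPECTED_FIELDS = 5  # assumed layout: chrom, start, end, name, weight


def scale_xy_bed_line(line: str, x_scale_factor: float, y_scale_factor: float) -> str:
    """Scale the weight column of a chrX/chrY BED line; return other lines unchanged."""
    fields = line.rstrip("\n").split("\t")

    # Fail loudly instead of silently rewriting the wrong column.
    if len(fields) != EXPECTED_FIELDS:
        raise ValueError(
            f"expected {EXPECTED_FIELDS} tab-separated fields, "
            f"got {len(fields)}: {line!r}"
        )

    chrom = fields[0]
    if chrom == "chrX":
        scale_factor = x_scale_factor
    elif chrom == "chrY":
        scale_factor = y_scale_factor
    else:
        return "\t".join(fields)

    # Assumption: weights are numeric and are written back as rounded integers.
    fields[4] = str(round(float(fields[4]) * scale_factor))
    return "\t".join(fields)
```

Because the field count is checked first, the weight can be addressed by explicit index rather than `fields[-1]`. Note that this sketch compares the chromosome column exactly, whereas `line.startswith('chrX')` would also match contig names that merely begin with `chrX`; which behavior is wanted depends on the BED files actually in use.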

- name: GvsIngestTieout
  subclass: WDL
  primaryDescriptorPath: /scripts/variantstore/wdl/GvsIngestTieout.wdl
  filters:
    branches:
      - master
      - ah_var_store
      - vs_261_ingest_errors
@RoriCremer (Contributor) commented May 26, 2022

If I had looked at this PR earlier, I would have seen that you already did this! Thank you!!!!!

@@ -21,13 +21,11 @@ workflow GvsCreateFilterSet {
String? service_account_json_path
Int? SNP_VQSR_max_gaussians_override = 6
Int? SNP_VQSR_mem_gb_override
# this is the minimum number of samples where the SNP model will be created and applied in separate tasks
# (SNPsVariantRecalibratorClassic vs. SNPsVariantRecalibratorCreateModel and SNPsVariantRecalibratorScattered)
Int snps_variant_recalibration_threshold = 5000
Contributor

I agree with the change, but it might be worth noting that in WARP classic this is done with 20k. Probably not necessary since it's purely an implementation detail, but I know we've been trying to stay as true to WARP as possible.

@mcovarr mcovarr merged commit 91c33df into ah_var_store May 26, 2022
@mcovarr mcovarr deleted the vs_389_xy_reweighting branch May 26, 2022 22:38
This was referenced Mar 17, 2023
3 participants