Tool for arrays QC metrics calculations #6812

Merged
merged 2 commits into from
Sep 17, 2020
Conversation

meganshand
Contributor

Pulls down a temp table of genotype counts, calculates excess het and call rate and writes them to a tsv for future upload.

"SELECT * FROM `" + genotypeCountsTable + "`";

//Execute Query
final TableResult result = BigQueryUtils.executeQuery(genotypeCountQueryString);

It turns out that the Storage API is MUCH faster here (roughly 10-20 times), since you're pulling down the entire table. You can see how I do this for probe_info in ArrayExtractCohort.

thisRow.add(String.valueOf(excessHetPval));

Double callRate = 1.0 - ((double) noCalls / sampleCount);
thisRow.add(String.valueOf(callRate));

Do you want to format these to a fixed precision? I know some of the GATK tools go to 3 decimal places. Same for all the doubles.
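For illustration, here is a minimal sketch of what fixed-precision output could look like, assuming three decimal places. The class `MetricFormat` and the helpers `formatMetric` and `formatScientific` are hypothetical names, not part of this PR or of GATK.

```java
import java.util.Locale;

// Hypothetical helper class for illustration only; not part of the PR.
class MetricFormat {
    // Formats a metric with exactly three digits after the decimal point.
    // Locale.ROOT keeps '.' as the decimal separator regardless of the JVM default locale.
    static String formatMetric(double value) {
        return String.format(Locale.ROOT, "%.3f", value);
    }

    // Alternative for very small values such as p-values: three digits in scientific notation.
    static String formatScientific(double value) {
        return String.format(Locale.ROOT, "%.3e", value);
    }

    public static void main(String[] args) {
        System.out.println(formatMetric(0.5));        // prints "0.500"
        System.out.println(formatMetric(1e-10));      // prints "0.000"
        System.out.println(formatScientific(1e-10));  // prints "1.000e-10"
    }
}
```

One thing to weigh: with `%.3f`, a very small p-value is written as `0.000`, so a scientific format may suit `excessHetPval` better than it suits `callRate`.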

Double excessHetPval = ExcessHet.calculateEH(genotypeCounts, sampleCount).getRight();
thisRow.add(String.valueOf(excessHetPval));

Double callRate = 1.0 - ((double) noCalls / sampleCount);

Double or double?

Same for other types. If they can be null, use the object (Double) and be sure to handle the null case when you use them. If not, use the primitive (double) and you don't have to worry!
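A small sketch of the distinction, under stated assumptions: the class `NullableMetric`, the helper names, and the `"NA"` placeholder for a missing value are all hypothetical and are not taken from this PR.

```java
// Hypothetical example class for illustration only; not part of the PR.
class NullableMetric {
    // The call rate is always computable from two counts, so a primitive double is enough
    // and callers never need a null check.
    static double callRate(long noCalls, long sampleCount) {
        return 1.0 - ((double) noCalls / sampleCount);
    }

    // A boxed Double can be null (for example, a metric that could not be computed),
    // so the null case is handled explicitly before converting to text.
    static String render(Double value) {
        return value == null ? "NA" : String.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(render(callRate(0, 10))); // prints "1.0"
        System.out.println(render(null));            // prints "NA"
    }
}
```

Auto-unboxing a null `Double` into a `double` throws a `NullPointerException` at runtime, which is why the primitive is the safer default when null is not a meaningful value.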

@meganshand meganshand merged commit 9b83e93 into ah_var_store Sep 17, 2020
@meganshand meganshand deleted the ms_qc_tool branch September 17, 2020 18:55
kcibul pushed a commit that referenced this pull request Jan 29, 2021
kcibul pushed a commit that referenced this pull request Feb 1, 2021
kcibul pushed a commit that referenced this pull request Mar 9, 2021
This was referenced Mar 17, 2023