Calculate and Store site-level QCs #7197

Merged
merged 10 commits into ah_var_store from kc_site_qc
Apr 12, 2021

Contributor

@kcibul kcibul commented Apr 10, 2021

Addresses 219

Major changes

  • calculate site-level metrics in feature_extract.sql
  • extract metrics, apply thresholds, and set the filter field in ExtractFeature
  • add CreateSiteFilteringFiles to translate the input VCF with filter fields into a format for BQ loading, especially the location fields
  • update WDL to call CreateSiteFilteringFiles and upload results to BQ

Minor changes

  • added call_GQ to alt_allele creation
  • reduced memory requirements in WDL

gatk-bot commented Apr 10, 2021

Travis reported job failures from build 33655
Failures in the following jobs:

Test Type | JDK       | Job ID   | Logs
unit      | openjdk11 | 33655.13 | logs
unit      | openjdk8  | 33655.3  | logs

CreateFilteringFiles \
--ref-version 38 \
--filter-set-name ~{filter_set_name} \
-mode SNP \
-V ~{snp_recal_file} \
-O ~{filter_set_name}.snps.recal.tsv

gatk --java-options "-Xmx4g" \
gatk --java-options "-Xmx1g" \
Member

nice catch!

omitFromCommandLine = true
)
public final class CreateSiteFilteringFiles extends VariantWalker {
static final Logger logger = LogManager.getLogger(CreateVariantIngestFiles.class);
Member

should this use CreateSiteFilteringFiles.class instead?

Contributor Author

yup
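
The direct fix is to pass CreateSiteFilteringFiles.class to the logger factory. As a side note, here is a minimal sketch of one way to make such a declaration safe to copy between classes: derive the class from the declaration site instead of naming it. This sketch uses java.util.logging as a stand-in so that it runs without dependencies (the tool itself uses LogManager), and the class name LoggerNameSketch is hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.util.logging.Logger;

// Hypothetical stand-in class; not part of the PR.
final class LoggerNameSketch {
    // lookupClass() resolves to the class that contains this line,
    // so the declaration stays correct when pasted into another class.
    static final Logger logger =
            Logger.getLogger(MethodHandles.lookup().lookupClass().getName());

    public static void main(String[] args) {
        // Prints the name of the enclosing class, whatever it is.
        System.out.println(logger.getName());
    }
}
```

Whether this is worth the extra indirection is a style call; naming the class explicitly is more readable but relies on review to catch a stale name, as happened here.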


private SimpleXSVWriter writer;

private List<String> HEADER =
Member

for consistency with other tools, move this to SchemaUtils ?

Contributor Author

my future self thanks you

Member

@mmorgantaylor mmorgantaylor left a comment

looks great, only minor comments/questions

@@ -234,6 +252,14 @@ private void processVQSRRecordForPosition(ExtractFeaturesRecord rec) {
}
builder.attribute(GATKVCFConstants.EXCESS_HET_KEY, String.format("%.3f", excessHetApprox));

if (rec.getDistinctAlleles() > 6) {
Member

is this a constant defined somewhere that we can use? (or should we make it one?)

@@ -75,6 +75,9 @@ public static VCFFormatHeaderLine getEquivalentFormatHeaderLine(final String inf
addFilterLine(new VCFFilterHeaderLine(VCFConstants.PASSES_FILTERS_v4, "Site contains at least one allele that passes filters"));

addFilterLine(new VCFFilterHeaderLine(NAY_FROM_YNG, "Considered a NAY in the Yay, Nay, Grey table"));
addFilterLine(new VCFFilterHeaderLine(EXCESS_ALLELES, "Site has an excess of alternate alleles"));
Member

if the excess alleles number (6) was a constant somewhere, you could put that into the header. (maybe users don't care, but i would personally find it helpful!)
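
A minimal sketch of how the two suggestions about the literal 6 could fit together, assuming a single named constant that drives both the site check and the filter description. The names EXCESS_ALLELES_THRESHOLD, EXCESS_ALLELES_DESCRIPTION, siteFilters, and SiteFilterSketch are hypothetical, and the filter ID is modeled as a plain string here; the real code uses VCFFilterHeaderLine and ExtractFeaturesRecord, which are not reproduced.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; not the code merged in this PR.
final class SiteFilterSketch {
    // Assumed name for the threshold that appears as a literal 6 in the diff.
    static final int EXCESS_ALLELES_THRESHOLD = 6;

    // Filter ID, modeled as a plain string for this sketch.
    static final String EXCESS_ALLELES = "EXCESS_ALLELES";

    // The description is built from the constant, so the header text
    // and the check below cannot drift apart.
    static final String EXCESS_ALLELES_DESCRIPTION = String.format(
            "Site has an excess of alternate alleles (distinct alleles > %d)",
            EXCESS_ALLELES_THRESHOLD);

    // Mirrors the condition in the diff: rec.getDistinctAlleles() > 6.
    static Set<String> siteFilters(int distinctAlleles) {
        Set<String> filters = new LinkedHashSet<>();
        if (distinctAlleles > EXCESS_ALLELES_THRESHOLD) {
            filters.add(EXCESS_ALLELES);
        }
        return filters;
    }

    public static void main(String[] args) {
        System.out.println(EXCESS_ALLELES_DESCRIPTION);
        System.out.println(siteFilters(6)); // at the threshold: not filtered
        System.out.println(siteFilters(7)); // above the threshold: filtered
    }
}
```

With this shape, changing the threshold in one place updates both the filtering behavior and what users read in the VCF header.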

@kcibul kcibul merged commit 4fbcc6f into ah_var_store Apr 12, 2021
@kcibul kcibul deleted the kc_site_qc branch April 12, 2021 18:53
This was referenced Mar 17, 2023