Mutect2 genotype-germline-sites filtering discrepancy #7391

Closed
GATKSupportTeam opened this issue Aug 3, 2021 · 1 comment · Fixed by #8717
Comments

@GATKSupportTeam
Collaborator

A user running Mutect2 with --genotype-germline-sites found that the clustered_events and haplotype filters were producing false negatives.

This request was created from a contribution made by TA on July 26, 2021 18:24 UTC.

Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/4404184803227-Mutect2-genotype-germline-sites-filtering-discrepancy-

--

Hi, I have a few technical questions about changes in which variants are filtered when running Mutect2 with --genotype-germline-sites.

I ran Mutect2 on matched tumor-normal data with and without --genotype-germline-sites. Everything else about these runs was the same.

When I compared the output VCFs, I noticed differences in which variants pass all filters. Each run had variants that passed only in that run: some variants were marked PASS with --genotype-germline-sites but failed with standard settings, and vice versa.
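A comparison like the one described above can be sketched in plain Python. This is only an illustration, not the poster's actual method: the function names and the toy records are hypothetical, and the input is assumed to be the filtered VCF text (real files could be read with `open(path)` or `gzip.open(path, "rt")`).

```python
from typing import Dict, Iterable, Set, Tuple

# A variant is identified here by (CHROM, POS, REF, ALT); this key is an assumption.
Site = Tuple[str, int, str, str]


def pass_sites(vcf_lines: Iterable[str]) -> Set[Site]:
    """Collect the sites whose FILTER column is exactly PASS."""
    sites: Set[Site] = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, filt = fields[:7]
        if filt == "PASS":
            sites.add((chrom, int(pos), ref, alt))
    return sites


def compare_pass_sets(standard_lines: Iterable[str],
                      germline_lines: Iterable[str]) -> Dict[str, Set[Site]]:
    """Split PASS sites into shared and run-specific sets."""
    standard = pass_sites(standard_lines)
    germline = pass_sites(germline_lines)
    return {
        "shared": standard & germline,
        "only_standard": standard - germline,
        "only_germline": germline - standard,
    }


# Toy records (made up for illustration, not taken from the original data).
HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
standard_vcf = [
    HEADER,
    "chr1\t100\t.\tA\tT\t.\tPASS\t.\n",
    "chr1\t200\t.\tG\tC\t.\tstrand_bias\t.\n",
    "chr1\t300\t.\tC\tT\t.\tPASS\t.\n",
]
germline_vcf = [
    HEADER,
    "chr1\t100\t.\tA\tT\t.\tPASS\t.\n",
    "chr1\t200\t.\tG\tC\t.\tPASS\t.\n",
    "chr1\t300\t.\tC\tT\t.\tclustered_events;haplotype\t.\n",
]

result = compare_pass_sets(standard_vcf, germline_vcf)
```

In the toy data, the site at position 200 passes only with germline genotyping and the site at position 300 passes only in the standard run, mirroring the two patterns described in this post.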

When I looked through these variants I noticed two different patterns of unique variants:

PASS only with --genotype-germline-sites: these variants passed in the --genotype-germline-sites run but failed in the standard run because of the "strand_bias" filter, which flags more variants in the standard analysis than in the --genotype-germline-sites analysis. In IGV they look like false positives, so it appears that running Mutect2 with --genotype-germline-sites somehow keeps this filter from working accurately.

PASS only in the standard run: these variants passed in standard Mutect2 but failed in the --genotype-germline-sites run because of the "haplotype" or "clustered_events" filters. I believe this is a potential problem with --genotype-germline-sites: bona fide somatic variants that happen to lie close to germline sites get filtered. With germline genotyping enabled, an active region is created around the germline variant as well as the somatic variant, so the germline variant is more likely to fall inside the active region of a somatic variant. It seems that running with --genotype-germline-sites will therefore produce false negatives, because these somatic variants get filtered.

This is not an insubstantial number of variants: the --genotype-germline-sites run returned 3910 PASS variants, and 123 variants that passed in standard Mutect2 failed in the --genotype-germline-sites run solely because of the haplotype or clustered_events filters (likely false negatives).
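To see which filters account for such discordant calls, the FILTER values of the affected sites can be tallied. The sketch below is illustrative only: `tally_failing_filters` and the toy sites are hypothetical, and the final line simply relates the two counts quoted above (123 and 3910).

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Site = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT), an assumed key


def tally_failing_filters(filters_by_site: Dict[Site, str],
                          sites: Iterable[Site]) -> Counter:
    """Count how often each filter name appears among the given sites."""
    tally: Counter = Counter()
    for site in sites:
        filt = filters_by_site.get(site)
        if filt is None or filt in ("PASS", "."):
            continue  # absent, passing, or unfiltered records add nothing
        tally.update(filt.split(";"))  # FILTER lists names separated by ';'
    return tally


# Toy FILTER values from a hypothetical --genotype-germline-sites run.
germline_run_filters = {
    ("chr1", 300, "C", "T"): "clustered_events;haplotype",
    ("chr1", 450, "G", "A"): "haplotype",
    ("chr1", 900, "T", "C"): "PASS",
}
# Sites that passed in the standard run but not with germline genotyping.
passed_standard_only = [("chr1", 300, "C", "T"), ("chr1", 450, "G", "A")]

tally = tally_failing_filters(germline_run_filters, passed_standard_only)

# Discordant calls relative to the PASS count reported in the post.
discordant_percent = round(100 * 123 / 3910, 1)
```

With the numbers from the post, the 123 discordant variants correspond to roughly 3% of the 3910 PASS calls.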

Do you have any suggestions for how to get around these two issues? One way I can think of to get around the second issue is to ignore the haplotype and clustered_events filters when running with --genotype-germline-sites, but this would introduce false positives into the variant calls. Is there a way to increase the number of nearby events needed to trigger the haplotype/clustered_events filters? Changing this could also restore the false negatives. I am not sure how to solve the first issue, in which the strand_bias filter stops working as well when running with --genotype-germline-sites.
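The first workaround mentioned above (ignoring haplotype and clustered_events after the fact) could be prototyped as a post-hoc check on the FILTER column. This is a rough sketch, not a GATK feature: `passes_ignoring` and `IGNORED_FILTERS` are hypothetical names, and, as noted above, any variant rescued this way may be a false positive.

```python
from typing import FrozenSet

# Filters to disregard when re-evaluating a record (an assumption for this sketch).
IGNORED_FILTERS: FrozenSet[str] = frozenset({"haplotype", "clustered_events"})


def passes_ignoring(filter_field: str,
                    ignored: FrozenSet[str] = IGNORED_FILTERS) -> bool:
    """Return True if the record passes once the ignored filters are disregarded.

    A record qualifies when it is already PASS, or when every filter it
    failed is in the ignored set. An unfiltered record (".") never qualifies.
    """
    if filter_field == "PASS":
        return True
    failed = set(filter_field.split(";"))
    return failed <= ignored


rescued = passes_ignoring("haplotype;clustered_events")
still_filtered = passes_ignoring("haplotype;strand_bias")
```

A record that also fails another filter, such as strand_bias, stays filtered, so only calls rejected purely for proximity to other events are affected.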

Thank you!

(created from Zendesk ticket #171745)
gz#171745

@lculibrk

Hi GATK Team,

We're encountering this issue as well: our experimental design requires post-processing germline-flagged variants, so we need to emit and genotype all germline variants. Has there been any progress on this front?
