A user running Mutect2 with --genotype-germline-sites found that the clustered events and haplotype filters were causing false negative results.
This request was created from a contribution made by TA on July 26, 2021 18:24 UTC.
Hi, I have a few technical questions about changes in filtered variants when running Mutect2 with --genotype-germline-sites.
I ran Mutect2 on matched tumor-normal data with and without --genotype-germline-sites. Everything else about these runs was the same.
When I compared the output VCFs I noticed differences in which variants pass all filters between the two runs. Each run had variants that passed only in that run, i.e. some variants were marked PASS when Mutect2 was run with --genotype-germline-sites but failed with standard settings, and vice versa.
When I looked through these variants I noticed two different patterns of unique variants:
Unique PASS variants with --genotype-germline-sites: the variants that passed with --genotype-germline-sites but were rejected in the standard analysis failed in the standard run because of the "strand_bias" filter. The "strand_bias" filter marks more variants in the standard analysis than in the --genotype-germline-sites analysis. Looking at these variants in IGV, they look like false positives, and for some reason running Mutect2 with --genotype-germline-sites prevents this filter from catching them.
Unique PASS variants in the standard run: these variants all passed in standard Mutect2 but were rejected with --genotype-germline-sites because of the "haplotype" or "clustered_events" filters. I believe this is a potential problem with --genotype-germline-sites, because when you include germline sites, bona fide somatic variants that happen to be close to germline sites get filtered (when you genotype germline sites you are more likely to include a germline variant in the active region of a somatic variant, because an active region is created around the germline variant in addition to the somatic variant). It seems that running with --genotype-germline-sites produces false negatives, because these somatic variants get filtered and are missed.
This is not an insubstantial number of variants: the --genotype-germline-sites run returned 3910 PASS variants, and 123 variants passed in standard Mutect2 but failed with --genotype-germline-sites solely because of the haplotype or clustered_events filters (likely false negatives).
Do you have any suggestions for how to get around these two issues? One way I can think of to get around the second issue is to ignore the haplotype and clustered_events filters when running with --genotype-germline-sites, but this would introduce false positives into the variant calls. Is there a way to increase the number of nearby events needed to trigger the haplotype/clustered_events filters? Changing this could also restore the false negatives. I am not sure how to solve the first issue, in which the strand_bias filter stops working as well when running with --genotype-germline-sites.
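For anyone who wants to reproduce the comparison described above, here is a minimal sketch in plain Python. It assumes both filtered VCFs are available as uncompressed text lines with the standard tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, ...). The function names `read_filters` and `compare_pass_sets` are hypothetical helpers, not part of GATK, and the records in the example are toy data rather than real calls.

```python
from collections import Counter


def read_filters(vcf_lines):
    """Map (chrom, pos, ref, alt) to the set of FILTER values of each record."""
    records = {}
    for line in vcf_lines:
        # Skip blank lines and header lines ("##..." and "#CHROM...").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, filt = fields[:7]
        records[(chrom, int(pos), ref, alt)] = set(filt.split(";"))
    return records


def compare_pass_sets(standard, germline):
    """Find variants that PASS in only one run and tally why they failed in the other."""
    pass_std = {key for key, filters in standard.items() if filters == {"PASS"}}
    pass_gl = {key for key, filters in germline.items() if filters == {"PASS"}}
    only_std = pass_std - pass_gl
    only_gl = pass_gl - pass_std
    # A variant missing from the other VCF entirely is counted as "absent".
    reasons_in_germline = Counter(
        f for key in only_std for f in germline.get(key, {"absent"})
    )
    reasons_in_standard = Counter(
        f for key in only_gl for f in standard.get(key, {"absent"})
    )
    return only_std, only_gl, reasons_in_germline, reasons_in_standard


if __name__ == "__main__":
    # Toy records (hypothetical), one list per run.
    standard_run = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr1\t100\t.\tA\tT\t.\tPASS\t.",
        "chr1\t200\t.\tG\tC\t.\tstrand_bias\t.",
    ]
    germline_run = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr1\t100\t.\tA\tT\t.\tclustered_events;haplotype\t.",
        "chr1\t200\t.\tG\tC\t.\tPASS\t.",
    ]
    result = compare_pass_sets(read_filters(standard_run), read_filters(germline_run))
    only_std, only_gl, why_failed_germline, why_failed_standard = result
    print("PASS only in standard run:", sorted(only_std))
    print("  filters applied in germline run:", dict(why_failed_germline))
    print("PASS only in germline run:", sorted(only_gl))
    print("  filters applied in standard run:", dict(why_failed_standard))
```

For bgzipped output, the lines could be read with `gzip.open(path, "rt")` from the standard library. Note that this sketch keys records on the full ALT string, so multi-allelic sites that are represented differently between the two runs would show up as "absent" rather than as a filter difference.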
We're encountering this issue (because of experimental design we need to postprocess germline-flagged variants and therefore emit/genotype all germline variants). Has there been any progress on this front?
Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/4404184803227-Mutect2-genotype-germline-sites-filtering-discrepancy-
Thank you!
(created from Zendesk ticket #171745)
gz#171745