[BUG] Bad input: This tool assumes diploid genotype #7690

Open
1 of 3 tasks
LiviaMoura opened this issue Feb 23, 2022 · 4 comments
@LiviaMoura

LiviaMoura commented Feb 23, 2022

Bug Report

Affected tool(s) or class(es)

GnarlyGenotyper

Affected version(s)

Description

WDL joint genotyping using GnarlyGenotyper after ReblockGVCF (fixed on the snapshot above)

Steps to reproduce

Joint genotyping WDL pipeline with "GatkJointGenotyping.useGnarlyGenotyper": true, using samples from DRAGEN 3.8+

Expected behavior

Complete the pipeline

Actual behavior

Fails with a diploid-genotype error on the sex chromosomes

Hello again everyone.
First of all, thank you @ldgauthier for sending us that snapshot docker. It kind of solved the reblock problem. As feedback, I also tried the newest GATK version (4.2.5), since it modified ReblockGVCF, but it didn't work.
Anyway, I have another issue here...
While I was using only one or a few chromosomes, the pipeline with reblock + gnarly was working fine. Once I added all chromosomes, I started to get this type of error (GnarlyGenotyper):

A USER ERROR has occurred: Bad input: This tool assumes diploid genotypes, but sample NA18668 has ploidy 1 at position chrY:2789135.

or


A USER ERROR has occurred: Bad input: This tool assumes diploid genotypes, but sample NA14734 has ploidy 1 at position chrX:36667858.

I checked every failed log, and the failures are all related to the sex chromosomes. Any thoughts or tips about that?
P.S.: chr1 through chr22 worked fine!
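
For anyone triaging the same failure across many shards, checking the failed logs can be scripted. This is a minimal sketch, assuming the logs contain the exact message format quoted above; the function name `summarize_ploidy_errors` and the regular expression are illustrative and not part of GATK.

```python
import re
from collections import Counter

# Matches the tail of the GnarlyGenotyper message quoted in this report.
# Assumption: every failing log uses this exact wording.
PLOIDY_ERROR = re.compile(
    r"sample (?P<sample>\S+) has ploidy (?P<ploidy>\d+) "
    r"at position (?P<contig>[^:\s]+):(?P<pos>\d+)"
)


def summarize_ploidy_errors(log_lines):
    """Count ploidy errors per contig across an iterable of log lines."""
    counts = Counter()
    for line in log_lines:
        match = PLOIDY_ERROR.search(line)
        if match:
            counts[match.group("contig")] += 1
    return counts


# The two messages from this report, used as sample input.
logs = [
    "A USER ERROR has occurred: Bad input: This tool assumes diploid genotypes, "
    "but sample NA18668 has ploidy 1 at position chrY:2789135.",
    "A USER ERROR has occurred: Bad input: This tool assumes diploid genotypes, "
    "but sample NA14734 has ploidy 1 at position chrX:36667858.",
]
summary = summarize_ploidy_errors(logs)
print(dict(summary))
```

If every key in the summary is chrX or chrY, the failures are confined to the sex chromosomes, as observed here.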

@ldgauthier
Contributor

Hi @LiviaMoura

We don't have any haploid calling in the Broad production pipeline, so we never included that feature in Gnarly. (For chrY people typically filter out hets and then treat 0/0 as 0 and 1/1 as 1. chrX on males admittedly requires a little more finesse.) I can probably take a look next week. I'm not sure how much effort a fix would entail, but hopefully the haploid case is just a simpler version of the diploid case right? :-)
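
The chrY convention described above (drop hets, treat 0/0 as 0 and 1/1 as 1) could be sketched roughly as follows. This is an illustration under stated assumptions, not GATK code: the helper `diploid_to_haploid` is hypothetical, it works on plain GT strings, and it extends the rule to any homozygous allele index (for example 2/2 becomes 2), which the comment above does not explicitly state.

```python
def diploid_to_haploid(gt):
    """Map a GT string to a haploid call, or None if the call should be dropped.

    Assumed rules: homozygous diploid calls collapse to a single allele,
    heterozygous calls and no-calls are filtered out, and calls that are
    already haploid pass through unchanged.
    """
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None  # no-call (full or partial)
    if len(alleles) == 1:
        return alleles[0]  # already haploid
    if len(alleles) != 2:
        return None  # unexpected ploidy, leave for manual review
    if alleles[0] != alleles[1]:
        return None  # het on a haploid chromosome: filter out
    return alleles[0]


for gt in ["0/0", "1/1", "0/1", "./.", "1"]:
    print(gt, "->", diploid_to_haploid(gt))
```

chrX in males would need extra handling outside the pseudoautosomal regions, so this sketch only covers the chrY case.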

@LiviaMoura
Author

LiviaMoura commented Feb 23, 2022

Hello @ldgauthier,
Thank you for your answer.
I ran into this problem because we have many patient samples from the DRAGEN pipeline that we want to analyze. Right now we can use Hail to analyze all 22 autosomes from the Gnarly pipeline, but the sex chromosomes are missing, and they are important for many syndromes. We can try the manual filtering you mentioned, but I'm just pointing out this bug we came across. If you have any other tips, FAQs, or forum discussions to share, I'd be thankful.

Anyway, I'll wait for any news regarding this topic. Let us know if something new pops up.

Best :)

@ldgauthier
Contributor

Hi @LiviaMoura,
I'm working on this now and the results look good, but I've found I don't have a lot of haploid test data. Can you provide some haploid and diploid chrX data? Or you could try GnarlyGenotyper with the new docker: us.gcr.io/broad-dsde-methods/gatk_gnarly_for_haploids@sha256:64cb8745dfe617c70d99354a3d68d89aed15c973a694f1d1c656e04fdcfdc997

Also what are you doing for chrY? Are the females no-call all the way across the chromosome or are you only combining Y for males?

@LiviaMoura
Author

LiviaMoura commented Apr 22, 2022

Hi @ldgauthier,
I'm sorry for my delay on this topic. I was preparing to defend my Ph.D. (successfully done) and I wasn't looking at GitHub these days...
Anyway... we were using public human samples as input. There are 22 of them available on the web, and we have the gVCFs on our servers (these were generated by DRAGEN 3.6.3), but if you need them, I'll have to ask permission to share them (let me know if that would be easier for you; I would have to share them by email).

Some of them:

NA02718
NA07891
NA08618
NA09834
NA11661
NA12217
NA12878
NA14234
NA14626
NA14734
NA17819
NA18668
NA18949
NA20381

Regarding chrY, in samples with XX karyotype, the DRAGEN pipeline makes a diploid variant call, but applies a "PloidyConflict" hard filter (and apparently all calls are either 0/0 or ./.), whereas the DRAGEN pipeline makes a haploid variant call for XY karyotype samples as expected.
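
To check this behavior in a given gVCF, one could inspect the FILTER column and the ploidy of each sample's GT field. The sketch below assumes standard tab-separated VCF data lines and that the filter appears literally as `PloidyConflict`; the function names and the example record are invented for illustration and are not taken from real DRAGEN output.

```python
def genotype_ploidy(gt):
    """Number of alleles in a GT string such as '0/1', '1|1', or '1'."""
    return len(gt.replace("|", "/").split("/"))


def describe_record(vcf_line):
    """Report contig, position, PloidyConflict status, and per-sample ploidy."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Standard VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT samples...
    gt_index = fields[8].split(":").index("GT")
    return {
        "contig": fields[0],
        "pos": int(fields[1]),
        "ploidy_conflict": "PloidyConflict" in fields[6].split(";"),
        "ploidies": [
            genotype_ploidy(sample.split(":")[gt_index]) for sample in fields[9:]
        ],
    }


# Invented single-sample record resembling the XX-on-chrY case described above.
example = "chrY\t1000\t.\tA\t<NON_REF>\t.\tPloidyConflict\t.\tGT:DP\t0/0:12"
record = describe_record(example)
print(record)
```

A record with ploidy 2 and the filter set would correspond to the XX case, while ploidy 1 without the filter would correspond to the XY case.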


3 participants