Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug in dynamic read disqualification fixed #8171

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -109,12 +109,13 @@ public AlleleLikelihoods<GATKRead, Haplotype> computeReadLikelihoods(final List<
@Override
public ToDoubleFunction<GATKRead> log10MinTrueLikelihood(final double expectedErrorRate, final boolean capLikelihoods) {
final double log10ErrorRate = Math.log10(expectedErrorRate);
final double catastrophicErrorRate = Math.log10(fbargs.fillingValue);

final double catastrophicErrorRate = fbargs.fillingValue;
final double log10catastrophicErrorRate = Math.log10(fbargs.fillingValue);
return read -> {
final double maxErrorsForRead = capLikelihoods ? Math.max(MAX_ERRORS_FOR_READ_CAP, Math.ceil(read.getLength() * expectedErrorRate)) : Math.ceil(read.getLength() * expectedErrorRate);
final double maxCatastrophicErrorsForRead = capLikelihoods ? Math.max(MAX_CATASTROPHIC_ERRORS_FOR_READ_CAP, Math.ceil(read.getLength() * catastrophicErrorRate)) : Math.ceil(read.getLength() * catastrophicErrorRate);
return maxErrorsForRead * log10ErrorRate + maxCatastrophicErrorsForRead * catastrophicErrorRate;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that looks like a bug doesn't it...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make sure to apply this to the FlowBasedHMMEngine as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, looks like FlowBasedHMMEngine was good to begin with!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like FlowBasedHMMEngine was good

public ToDoubleFunction<GATKRead> log10MinTrueLikelihood(final double expectedErrorRate, final boolean capLikelihoods) {
        final double log10ErrorRate = Math.log10(expectedErrorRate);
        final double catastrophicErrorRate = Math.log10(fbargs.fillingValue);

        return read -> {
            final double maxErrorsForRead = Math.max(3.0, Math.ceil(read.getLength() * expectedErrorRate));
            final double maxCatastrophicErrorsForRead = Math.max(2.0, Math.ceil(read.getLength() * catastrophicErrorRate));
            return maxErrorsForRead * log10ErrorRate + maxCatastrophicErrorsForRead*catastrophicErrorRate;
        };
    }

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure thats fixed... you are still multiplying Math.ceil(read.getLength() * catastrophicErrorRate)) where catestrophicErrorRate is in log10. I think you have to make the same log change you made for the FlowBasedAligner (indeed we should proablby have factored this better in the first place to reuse the code better but thats neither here nor there)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, will do!

final double maxCatastrophicErrorsForRead = capLikelihoods ? Math.max(MAX_CATASTROPHIC_ERRORS_FOR_READ_CAP, Math.ceil(read.getLength() * fbargs.fillingValue)) :
Math.ceil(read.getLength() * fbargs.fillingValue);
return maxErrorsForRead * log10ErrorRate + maxCatastrophicErrorsForRead * log10catastrophicErrorRate;
};
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -124,7 +124,7 @@ public ToDoubleFunction<GATKRead> log10MinTrueLikelihood(final double expectedEr

return read -> {
final double maxErrorsForRead = Math.max(3.0, Math.ceil(read.getLength() * expectedErrorRate));
final double maxCatastrophicErrorsForRead = Math.max(2.0, Math.ceil(read.getLength() * catastrophicErrorRate));
final double maxCatastrophicErrorsForRead = Math.max(2.0, Math.ceil(read.getLength() * fbargs.fillingValue));
return maxErrorsForRead * log10ErrorRate + maxCatastrophicErrorsForRead*catastrophicErrorRate;
};
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3859,7 +3859,7 @@ chr9 81151898 . G <NON_REF> . . END=81151898 GT:DP:GQ:MIN_DP:PL 0/0:46:94:46:0,9
chr9 81151899 . G <NON_REF> . . END=81151974 GT:DP:GQ:MIN_DP:PL 0/0:50:99:45:0,106,1800
chr9 81151975 . G <NON_REF> . . END=81151975 GT:DP:GQ:MIN_DP:PL 0/0:49:85:49:0,85,1477
chr9 81151976 . G <NON_REF> . . END=81152018 GT:DP:GQ:MIN_DP:PL 0/0:51:99:48:0,118,1800
chr9 81152019 . T A,<NON_REF> 67.64 . ASSEMBLED_HAPS=2;BaseQRankSum=-0.939;DP=49;ExcessHet=0.0000;FILTERED_HAPS=0;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;RAW_MQandDP=176400,49;ReadPosRankSum=-1.630 GT:AD:DP:GQ:PL:SB 0/1:24,25,0:49:40:75,0,40,1050,1008,1083:14,10,19,6
chr9 81152019 . T A,<NON_REF> 67.64 . ASSEMBLED_HAPS=2;BaseQRankSum=-0.939;DP=50;ExcessHet=0.0000;FILTERED_HAPS=0;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;RAW_MQandDP=180000,50;ReadPosRankSum=-1.630 GT:AD:DP:GQ:PL:SB 0/1:24,25,0:49:40:75,0,40,1050,1008,1083:14,10,19,6
chr9 81152020 . G <NON_REF> . . END=81152065 GT:DP:GQ:MIN_DP:PL 0/0:51:99:49:0,112,1800
chr9 81152066 . C <NON_REF> . . END=81152066 GT:DP:GQ:MIN_DP:PL 0/0:51:64:51:0,64,2104
chr9 81152067 . T <NON_REF> . . END=81152098 GT:DP:GQ:MIN_DP:PL 0/0:45:99:42:0,117,1724
Expand Down Expand Up @@ -5430,10 +5430,8 @@ chr9 81162397 . C <NON_REF> . . END=81162397 GT:DP:GQ:MIN_DP:PL 0/0:44:98:44:0,9
chr9 81162398 . A <NON_REF> . . END=81162403 GT:DP:GQ:MIN_DP:PL 0/0:45:99:44:0,107,1800
chr9 81162404 . A <NON_REF> . . END=81162404 GT:DP:GQ:MIN_DP:PL 0/0:45:91:45:0,91,1882
chr9 81162405 . C <NON_REF> . . END=81162430 GT:DP:GQ:MIN_DP:PL 0/0:47:99:45:0,120,1800
chr9 81162431 . G C,<NON_REF> 0 . ASSEMBLED_HAPS=14;BaseQRankSum=-1.085;DP=47;ExcessHet=0.0000;FILTERED_HAPS=11;MLEAC=0,0;MLEAF=0.00,0.00;MQRankSum=0.000;RAW_MQandDP=169200,47;ReadPosRankSum=-1.179;SUSP_NOISY_ADJACENT_TP_VARIANT GT:AD:DP:GQ:PGT:PID:PL:PS:SB 0|0:43,3,0:46:42:0|1:81162431_G_C:0,42,82,129,1802,169:81162431:22,21,0,3
chr9 81162432 . G <NON_REF> . . END=81162432 GT:DP:GQ:MIN_DP:PL 0/0:45:99:45:0,120,1800
chr9 81162433 . A AG,<NON_REF> 0 . ASSEMBLED_HAPS=14;BaseQRankSum=-2.096;DP=47;ExcessHet=0.0000;FILTERED_HAPS=11;MLEAC=0,0;MLEAF=0.00,0.00;MQRankSum=0.000;RAW_MQandDP=169200,47;ReadPosRankSum=-1.248;SUSP_NOISY_ADJACENT_TP_VARIANT GT:AD:DP:GQ:PGT:PID:PL:PS:SB 0|0:42,4,0:46:13:0|1:81162431_G_C:0,13,53,122,918,162:81162431:22,20,0,4
chr9 81162434 . G <NON_REF> . . END=81162477 GT:DP:GQ:MIN_DP:PL 0/0:45:99:43:0,101,1731
chr9 81162431 . G <NON_REF> . . END=81162431 GT:DP:GQ:MIN_DP:PL 0/0:51:59:51:0,59,1437
chr9 81162432 . G <NON_REF> . . END=81162477 GT:DP:GQ:MIN_DP:PL 0/0:49:99:47:0,105,1800
chr9 81162478 . C <NON_REF> . . END=81162479 GT:DP:GQ:MIN_DP:PL 0/0:47:97:47:0,97,1928
chr9 81162480 . A <NON_REF> . . END=81162487 GT:DP:GQ:MIN_DP:PL 0/0:46:99:45:0,105,1494
chr9 81162488 . T <NON_REF> . . END=81162488 GT:DP:GQ:MIN_DP:PL 0/0:45:91:45:0,91,1496
Expand Down Expand Up @@ -6896,7 +6894,7 @@ chr9 81173332 . T <NON_REF> . . END=81173332 GT:DP:GQ:MIN_DP:PL 0/0:27:62:27:0,6
chr9 81173333 . C <NON_REF> . . END=81173338 GT:DP:GQ:MIN_DP:PL 0/0:27:75:26:0,75,1125
chr9 81173339 . A <NON_REF> . . END=81173339 GT:DP:GQ:MIN_DP:PL 0/0:27:36:27:0,36,1094
chr9 81173340 . T <NON_REF> . . END=81173368 GT:DP:GQ:MIN_DP:PL 0/0:29:81:27:0,81,1209
chr9 81173369 . A G,<NON_REF> 67.64 . ASSEMBLED_HAPS=2;BaseQRankSum=-0.472;DP=28;ExcessHet=0.0000;FILTERED_HAPS=0;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;RAW_MQandDP=100800,28;ReadPosRankSum=0.645 GT:AD:DP:GQ:PL:SB 0/1:13,15,0:28:40:75,0,40,623,546,621:3,10,6,9
chr9 81173369 . A G,<NON_REF> 67.64 . ASSEMBLED_HAPS=2;BaseQRankSum=-0.537;DP=29;ExcessHet=0.0000;FILTERED_HAPS=0;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;RAW_MQandDP=104400,29;ReadPosRankSum=0.939 GT:AD:DP:GQ:PL:SB 0/1:14,15,0:29:40:75,0,40,623,588,663:3,11,6,9
chr9 81173370 . G <NON_REF> . . END=81173384 GT:DP:GQ:MIN_DP:PL 0/0:28:81:27:0,81,1190
chr9 81173385 . C <NON_REF> . . END=81173389 GT:DP:GQ:MIN_DP:PL 0/0:27:78:27:0,78,1170
chr9 81173390 . A <NON_REF> . . END=81173396 GT:DP:GQ:MIN_DP:PL 0/0:28:81:27:0,81,1124
Expand Down
Binary file not shown.
Loading