GATK Tutorial#11682 reproduce different results #8884

Closed
yuweibao15 opened this issue Jun 21, 2024 · 2 comments

Brief issue description:
When following the tutorial https://gatk.broadinstitute.org/hc/en-us/articles/360035531092--How-to-part-I-Sensitively-detect-copy-ratio-alterations-and-allelic-segments, step #4 (Plot standardized and denoised copy ratios with PlotDenoisedCopyRatios) gives different results than the tutorial shows. Testing one variable at a time suggests that the samples used in step #2 to generate the CNV PoN in the tutorial differ from the files stored with the tutorial.
Results:
Following steps 1 to 4 produces these plots:
[Image: hcc1143_T_clean denoised]
[Image: hcc1143_T_clean denoisedLimit4]
The values reported in my plots differ from those in the tutorial, which are 0.134 and 0.125.
Tests
I used the files provided with the tutorial and a script-generated cnvponC.pon.hdf5, which seems to cause the inconsistent result.
Using:
gatk --java-options "-Xmx6500m" CreateReadCountPanelOfNormals \
    -I HG00133.alt_bwamem_GRCh38DH.20150826.GBR.exome.counts.hdf5 \
    -I HG00733.alt_bwamem_GRCh38DH.20150826.PUR.exome.counts.hdf5 \
    -I NA19654.alt_bwamem_GRCh38DH.20150826.MXL.exome.counts.hdf5 \
    --minimum-interval-median-percentile 5.0 \
    -O sandbox/cnvponC.pon.hdf5
Files
The script used to generate this result is attached.
gatk_tutorial11682_issue.zip

Please help me understand why my results differ from the tutorial's. This would be extremely helpful when I apply the pipeline to our lab-generated data. Thank you very much!

yuweibao15 (Author) commented Jun 21, 2024

On further inspection, step #3 (DenoiseReadCounts) produced different values in hcc1143_T_clean.standardizedCR.tsv and hcc1143_T_clean.denoisedCR.tsv than the results stored with the tutorial. This points to the file hcc1143_T_clean.counts.hdf5 provided in the tutorial as a possible cause.
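One way to quantify such a difference is to compare the two TSVs interval by interval. The following is a minimal sketch, not part of the tutorial: the function name compare_cr is hypothetical, and it assumes both files use the usual copy-ratio layout (SAM-style header lines starting with "@", then a CONTIG / START / END / LOG2_COPY_RATIO column header, tab-separated). The third argument is an optional tolerance, assumed here to default to 1e-6.

```shell
# Compare LOG2_COPY_RATIO between two copy-ratio TSVs, matching rows by interval.
# Usage: compare_cr MY_RESULT.tsv REFERENCE.tsv [TOLERANCE]
compare_cr() {
    awk -F'\t' -v tol="${3:-1e-6}" '
        # Skip SAM-style header lines and the column header in both files.
        /^@/ || $1 == "CONTIG" { next }
        # First file on the command line (the reference): store value per interval.
        NR == FNR { ref[$1 FS $2 FS $3] = $4; next }
        # Second file (my result): look up the same interval and compare.
        {
            key = $1 FS $2 FS $3
            if (!(key in ref)) { missing++; next }
            d = $4 - ref[key]; if (d < 0) d = -d
            shared++
            if (d > tol + 0) differing++
            if (d > max) max = d
        }
        END {
            printf "shared=%d differing=%d missing=%d max_abs_diff=%.6f\n",
                   shared, differing, missing, max
        }
    ' "$2" "$1"
}
```

For example, compare_cr sandbox/hcc1143_T_clean.denoisedCR.tsv tutorial/hcc1143_T_clean.denoisedCR.tsv (both paths are placeholders) prints how many shared intervals differ beyond the tolerance, how many intervals in the first file are absent from the reference, and the largest absolute difference. A nonzero "missing" count would suggest the two runs did not use the same interval list.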

droazen (Collaborator) commented Jun 21, 2024

Hi @yuweibao15, could you please post this question on the GATK forum (https://gatk.broadinstitute.org/hc/en-us/community/topics)? We'll be able to assist you there.

droazen closed this as completed Jun 21, 2024