FilterAlignmentArtifacts Segfaulting #7162

Open
michael-weinstein opened this issue Mar 25, 2021 · 5 comments

michael-weinstein commented Mar 25, 2021

Bug Report

Affected tool(s) or class(es)

FilterAlignmentArtifacts

Affected version(s)

Docker container running 4.1.9.0

Description

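FilterAlignmentArtifacts appears to segfault on one input: the JVM reports a SIGSEGV in the native libgkl_smithwaterman library (smithWatermanBackTrack).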

Steps to reproduce

COMMAND

/gatk/gatk FilterAlignmentArtifacts --reference /home/gatk/references/Sars_cov_2.ASM985889v3.dna_sm.toplevel.fa.gz --variant /data/filteredVCF/in2510-8.orientationFilter.vcf --input /data/rawVCF/mutectBAM/in2510-8.mutect2.bam --bwa-mem-index-image /home/gatk/references/Sars_cov_2.ASM985889v3.dna_sm.toplevel.fa.img --output /data/alignmentArtifactFilteredVCF/in2510-8.orientationFilter.alignmentArtifactFilter.vcf

COPY OF SHELL SESSION

gatk@1ff04a9b2ba9:/home/gatk$ /gatk/gatk FilterAlignmentArtifacts --reference /home/gatk/references/Sars_cov_2.ASM985889v3.dna_sm.toplevel.fa.gz --variant /data/filteredVCF/in2510-8.orientationFilter.vcf --input /data/rawVCF/mutectBAM/in2510-8.mutect2.bam --bwa-mem-index-image /home/gatk/references/Sars_cov_2.ASM985889v3.dna_sm.toplevel.fa.img --output /data/alignmentArtifactFilteredVCF/in2510-8.orientationFilter.alignmentArtifactFilter.vcf
Using GATK jar /gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar FilterAlignmentArtifacts --reference /home/gatk/references/Sars_cov_2.ASM985889v3.dna_sm.toplevel.fa.gz --variant /data/filteredVCF/in2510-8.orientationFilter.vcf --input /data/rawVCF/mutectBAM/in2510-8.mutect2.bam --bwa-mem-index-image /home/gatk/references/Sars_cov_2.ASM985889v3.dna_sm.toplevel.fa.img --output /data/alignmentArtifactFilteredVCF/in2510-8.orientationFilter.alignmentArtifactFilter.vcf
08:33:36.572 INFO  NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar!/com/intel/gkl/native/libgkl_utils.so
08:33:36.591 INFO  NativeLibraryLoader - Loading libgkl_smithwaterman.so from jar:file:/gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar!/com/intel/gkl/native/libgkl_smithwaterman.so
08:33:36.592 INFO  SmithWatermanAligner - Using AVX accelerated SmithWaterman implementation
08:33:36.826 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar!/com/intel/gkl/native/libgkl_compression.so
Mar 25, 2021 8:33:37 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
08:33:37.130 INFO  FilterAlignmentArtifacts - ------------------------------------------------------------
08:33:37.130 INFO  FilterAlignmentArtifacts - The Genome Analysis Toolkit (GATK) v4.1.9.0-SNAPSHOT
08:33:37.130 INFO  FilterAlignmentArtifacts - For support and documentation go to https://software.broadinstitute.org/gatk/
08:33:37.131 INFO  FilterAlignmentArtifacts - Executing as gatk@1ff04a9b2ba9 on Linux v5.4.72-microsoft-standard-WSL2 amd64
08:33:37.131 INFO  FilterAlignmentArtifacts - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
08:33:37.131 INFO  FilterAlignmentArtifacts - Start Date/Time: March 25, 2021 8:33:36 AM GMT
08:33:37.131 INFO  FilterAlignmentArtifacts - ------------------------------------------------------------
08:33:37.132 INFO  FilterAlignmentArtifacts - ------------------------------------------------------------
08:33:37.133 INFO  FilterAlignmentArtifacts - HTSJDK Version: 2.23.0
08:33:37.133 INFO  FilterAlignmentArtifacts - Picard Version: 2.23.3
08:33:37.133 INFO  FilterAlignmentArtifacts - HTSJDK Defaults.COMPRESSION_LEVEL : 2
08:33:37.133 INFO  FilterAlignmentArtifacts - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
08:33:37.133 INFO  FilterAlignmentArtifacts - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
08:33:37.133 INFO  FilterAlignmentArtifacts - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
08:33:37.134 INFO  FilterAlignmentArtifacts - Deflater: IntelDeflater
08:33:37.134 INFO  FilterAlignmentArtifacts - Inflater: IntelInflater
08:33:37.135 INFO  FilterAlignmentArtifacts - GCS max retries/reopens: 20
08:33:37.135 INFO  FilterAlignmentArtifacts - Requester pays: disabled
08:33:37.136 WARN  FilterAlignmentArtifacts -

   !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

   Warning: FilterAlignmentArtifacts is an EXPERIMENTAL tool and should not be used for production

   !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


08:33:37.136 INFO  FilterAlignmentArtifacts - Initializing engine
08:33:37.531 INFO  FeatureManager - Using codec VCFCodec to read file file:///data/filteredVCF/in2510-8.orientationFilter.vcf
08:33:37.586 INFO  FilterAlignmentArtifacts - Done initializing engine
08:33:37.668 INFO  NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
08:33:37.706 INFO  IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
08:33:37.707 INFO  IntelPairHmm - Available threads: 8
08:33:37.707 INFO  IntelPairHmm - Requested threads: 4
08:33:37.707 INFO  PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
08:33:37.708 INFO  ProgressMeter - Starting traversal
08:33:37.708 INFO  ProgressMeter -        Current Locus  Elapsed Minutes    Variants Processed  Variants/Minute
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007ff7b7dfe32d, pid=849, tid=0x00007ff82e11d700
#
# JRE version: OpenJDK Runtime Environment (8.0_242-b08) (build 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.242-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libgkl_smithwaterman5951765478004985534.so+0x132d]  smithWatermanBackTrack(dnaSeqPair*, int, int, int, int, int*, int)+0x1bd
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/gatk/hs_err_pid849.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

RELEVANT FILES
hs_err_pid100.log
hs_err_pid164.log
hs_err_pid274.log
hs_err_pid400.log
hs_err_pid482.log
hs_err_pid711.log
hs_err_pid735.log
hs_err_pid801.log
hs_err_pid825.log
hs_err_pid849.log
otherFiles.zip
in2510-8.orientationFilter.vcf.txt
(.txt was appended to the VCF filename to satisfy GitHub's upload requirements)
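
With this many hs_err logs, it can help to confirm that they all point at the same native frame. The following is a minimal sketch, not part of GATK, that pulls the library and function out of the "Problematic frame" section of a HotSpot crash log. The function name `problematic_frame` and the regular expression are my own assumptions, based only on the frame line shown in the shell session above.

```python
import re
from typing import Optional, Tuple

# Matches a frame line such as:
#   # C  [libgkl_smithwaterman5951765478004985534.so+0x132d]  smithWatermanBackTrack(...)+0x1bd
# The symbol after the bracket is treated as optional.
FRAME_RE = re.compile(
    r"^#\s+[A-Za-z]\s+\[(?P<lib>[^\]+]+)\+0x[0-9a-fA-F]+\]"
    r"(?:\s+(?P<symbol>[^\s(]+))?"
)


def problematic_frame(log_text: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (library, symbol) from the line after '# Problematic frame:'."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines[:-1]):
        if line.startswith("# Problematic frame:"):
            match = FRAME_RE.match(lines[i + 1])
            if match is None:
                return None
            # Drop the numeric suffix that appears in the extracted
            # library's temporary file name, so logs can be compared.
            lib = re.sub(r"\d+(?=\.so$)", "", match.group("lib"))
            return lib, match.group("symbol")
    return None
```

Running this over each attached log (for example `problematic_frame(open(path).read())`) and comparing the results would show whether every crash lands in the same function.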

Expected behavior

The tool should finish without crashing, as it did on 7 other files generated with the same pipeline.

Actual behavior

Unsure why this last one is causing a segfault. The attached VCF is not the whole VCF I originally ran. I cut lines out of the original until I had isolated a minimal set required to reproduce the crash (I included all of the crash logs generated along the way in case they help). I was expecting to find a single line, or maybe two, that triggered the issue, but the whole remaining range appears to be needed: eliminating either the first or the last line of the range makes the program work again. I have not yet tried removing lines from the middle of the range to see whether they are necessary to cause the fault, but it's 2am and I should probably sleep.
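
For anyone who wants to automate the line-cutting described above, here is a rough sketch of a greedy reduction loop. It is not how the attached VCF was produced, and all names are hypothetical. The `still_fails` callback is an assumption: in practice it would write the VCF header plus the candidate records to a temporary file, rerun the command from "Steps to reproduce", and report whether the run crashed (for example by checking for a non-zero exit status).

```python
from typing import Callable, List, Sequence


def minimize(records: Sequence[str],
             still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop records while the failure still reproduces.

    Tries to remove large chunks first, then smaller ones, and stops
    once no single record can be removed without losing the failure.
    """
    current = list(records)
    if not still_fails(current):
        raise ValueError("the full record set does not reproduce the failure")

    chunk = max(len(current) // 2, 1)
    while True:
        i, removed = 0, False
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if candidate and still_fails(candidate):
                # The chunk was not needed; keep the smaller set.
                current, removed = candidate, True
            else:
                # The chunk is needed (or removing it leaves nothing).
                i += chunk
        if chunk > 1:
            chunk //= 2
        elif not removed:
            return current
```

Because the loop also tries records from the middle of the range, it would answer the open question of whether those lines are required. Each call to `still_fails` is a full tool run, so the number of runs grows with the size of the starting VCF.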

@michael-weinstein
Author

Update: I did some more cutting on the VCF and I can reproducibly cause a segfault with only these 4 variants in the VCF. I am attaching the associated VCF and log files here. The other files should remain the same as above.
in2510-8.orientationFilter.vcf.txt
hs_err_pid1358.log

@droazen
Collaborator

droazen commented Apr 12, 2021

@takutosato @davidbenjamin Could one of you comment on this one?

@michael-weinstein
Author

Please let me know if there is any further information I can provide.

@davidbenjamin
Contributor

@michael-weinstein This looks like an issue in our accelerated Smith-Waterman implementation from Intel (the crash is in smithWatermanBackTrack in libgkl_smithwaterman). PR #7105 is in code review and will offer a work-around once merged.

@tblewett

I am running into the same issue with v4.2.0.0. Has this been fixed yet?
