Parallel Align Split FASTQs Has No Checking Script for Failures #8

Open
DarioS opened this issue Jul 13, 2021 · 0 comments

DarioS commented Jul 13, 2021

It might be valuable to have a script that looks for failed alignments and creates a new input file from them (unless you decide to change everything to a one-whole-sample-per-compute-node design). I did a run yesterday afternoon in which many (but not all) of the tasks failed with error messages like:

[gadi-cpu-clx-1277:315845:0:315991] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x2837de0)
[1626069937.336272947] [gadi-cpu-clx-0731.gadi.nci.org.au] [00:S] Task 4 was terminated by signal 11: ./align.sh Fastq_split/101.OSCC_30-P_HC5GHDSX2_S1_L001,OSCC_30-P,RamaciottiCFG,1a,illumina,HC5GHDSX2,1,8

Copying and pasting the command onto the login node and running bwa there showed that the task above finishes successfully and creates a BAM file. Ben Menadue thought it was a case of "memory pressure" on the node at the time the analysis ran. Today, submitting the same job with identical settings worked fine; I only changed the names of the output and error log files.

Given how low the address is which it attempted to access (0x2837de0 = 42171872 ~ 40MiB), my immediate suspicion is that an attempt to allocate memory has failed but your program hasn't checked for that. For example, malloc returns an address of NULL (i.e. 0) on failure, which can easily result in low-address segmentation faults if the program just assumes that the malloc succeeded. In such a situation, the bug in the program is that it didn't consider the case where the call to malloc failed, and just blindly used the return value even though it wasn't a valid address.

It seems that memory allocation in bwa can sometimes fail, and neither bwa nor the pipeline guards against that.
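
As a starting point for the checking script suggested above, here is a minimal sketch in Python. It assumes that the job's error log contains lines in the same "Task N was terminated by signal S: command" format as the one quoted above, and that the parallel launcher's input file lists one command per line. The function names and the file names in the usage comment are hypothetical, and other kinds of failure (for example a non-zero exit status) are not handled because their log format isn't shown here.

```python
import re
import sys

# Pattern copied from the log line quoted in this issue. Other failure
# messages are not matched because their format is not known here.
FAILED_TASK = re.compile(r"Task (\d+) was terminated by signal (\d+): (.+)$")


def failed_commands(log_lines):
    """Return commands of tasks killed by a signal, in log order, without duplicates."""
    seen = set()
    commands = []
    for line in log_lines:
        match = FAILED_TASK.search(line.rstrip("\n"))
        if match is None:
            continue
        command = match.group(3).strip()
        if command not in seen:
            seen.add(command)
            commands.append(command)
    return commands


def write_rerun_file(log_path, rerun_path):
    """Write one failed command per line to rerun_path; return how many were written."""
    with open(log_path) as log:
        commands = failed_commands(log)
    with open(rerun_path, "w") as out:
        for command in commands:
            out.write(command + "\n")
    return len(commands)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python find_failed.py align.e12345 rerun.inputs
    count = write_rerun_file(sys.argv[1], sys.argv[2])
    print(f"{count} failed task(s) written to {sys.argv[2]}")
```

The resulting file could then be submitted as the input for a follow-up job. It may also be worth deleting any partial BAM files left behind by the killed tasks before rerunning, although whether align.sh leaves such files is not something I have checked.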
