Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

incomplete error handling #19

Open
brentp opened this issue Apr 18, 2017 · 5 comments
Open

incomplete error handling #19

brentp opened this issue Apr 18, 2017 · 5 comments

Comments

@brentp
Copy link

brentp commented Apr 18, 2017

hi, I'm using SeqLib via freebayes. I just unmounted the file system holding the bams that freebayes was accessing. but the program still had a 0 exit status.

I see this message:
[E::bgzf_read] bgzf_read_block error -1 after 1115 of 1202 bytes
which is correctly propagated by htslib (AFAICT).

My C++ is rusty/non-existent, but it looks like _Bam::load_read(BamRecord& r) in SeqLib is just using a non-zero return from sam_read1 as an indicator to end iteration without actually propagating the error.

So it appears that the iterator just stops after getting the -1 return value from sam_read1 and can never error. Am I mistaken?

Here is the code in question:

SeqLib/src/BamReader.cpp

Lines 305 to 355 in 770cd10

bool _Bam::load_read(BamRecord& r) {
// allocated the memory
bam1_t* b = bam_init1();
int32_t valid;
if (hts_itr.get() == NULL) {
valid = sam_read1(fp.get(), m_hdr.get_(), b);
if (valid < 0) {
#ifdef DEBUG_WALKER
std::cerr << "ended reading on null hts_itr" << std::endl;
#endif
//goto endloop;
bam_destroy1(b);
return false;
}
} else {
//changed to sam from hts_itr_next
// move to next region of bam
valid = sam_itr_next(fp.get(), hts_itr.get(), b);
}
if (valid < 0) { // read not found
do {
#ifdef DEBUG_WALKER
std::cerr << "Failed read, trying next region. Moving counter to " << m_region_idx << " of " << m_region.size() << " FP: " << fp_htsfile << " hts_itr " << std::endl;
#endif
// try next region, return if no others to try
++m_region_idx; // increment to next region
if (m_region_idx >= m_region->size()) {
bam_destroy1(b);
return false;
}
//goto endloop;
// next region exists, try it
SetRegion(m_region->at(m_region_idx));
valid = sam_itr_next(fp.get(), hts_itr.get(), b);
} while (valid <= 0); // keep trying regions until works
}
// if we got here, then we found a read in this BAM
empty = false;
next_read.assign(b); // assign the shared_ptr for the bam1_t
r = next_read;
return true;
}

@walaj
Copy link
Owner

walaj commented Apr 18, 2017

This is a good suggestion. The Open function for opening the BAM should return false if the BAM is unmounted, but I guess in your case if it's unmounted during the middle of the run, then I agree this should give a run-time error.

Looking over the htslib code, it looks like -1 is returned by sam_read1 for normal EOF, so really load_read should throw an error for values of sam_read1 of -2 or less. This should give a non-zero exit status in your situtation, unless the user wants to catch this error.

I can make the adjustment and check in. I think I still have a pending PR on Freebayes from November, so can add to that. I can re-ping Erik and see if he wants to incorporate.

@brentp
Copy link
Author

brentp commented Apr 18, 2017

I think you're right that -1 is EOF and other negative values are errors. That would be great to get this into SeqLib and then freebayes. We're using it in places where network failures can remove drive access and we need to know this happened.

@walaj
Copy link
Owner

walaj commented Apr 25, 2017

Hi Brent,

I fixed this in the latest SeqLib commit. I didn't update my FreeBayes PR, since there are already a number of adjustments and it was getting to be too much in one pass. If you want, you can just update the SeqLib sub-repos to the latest SeqLib commit in your freebayes directory, and rebuild to have this issue fixed.

After my latest freebayes PR goes through, I can issue another one to update the official freebayes version.

Testing after force-ejecting a volume, confirms an exit code of 134 with the updated SeqLib

[E::bgzf_read] bgzf_read_block error -1 after 0 of 4 bytes
status -2 
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: sam_read1 return status: -2 file: /Volumes/jwala/tmptmp.bam
Abort trap: 6

wmbb0-9c3:freebayes jwala$ echo $?
134

@brentp
Copy link
Author

brentp commented Apr 25, 2017

sweet! thanks for the quick fix.

@brentp
Copy link
Author

brentp commented May 14, 2017

Is the pending change to freebayes this one: freebayes/freebayes#338

If so, once that PR is up-to-date, we should ping that issue as this is fairly critical.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants