Reinjected sequences on matches containing inserts are invalid according to samtools #109

Closed
unode opened this issue Jun 6, 2019 · 0 comments · Fixed by #111

unode commented Jun 6, 2019

The issue is captured in the test-case added in 1068e29.

In short, samtools counts `I` (insertion) operations in a CIGAR string towards the length of the read sequence. A CIGAR of `5M1I5M` is therefore expected to come with a sequence of length 11, whereas the current code computes 10 because it ignores `I` in:

matchSize' includeSoft cigar
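
To make the length mismatch concrete, here is a minimal standalone sketch of how the two lengths differ under the SAM convention, where `M`, `I`, `S`, `=` and `X` consume bases of the read and `M`, `D`, `N`, `=` and `X` consume positions of the reference. The names `parseCigar`, `queryLength` and `referenceLength` are hypothetical and chosen for illustration only; this is not the code behind `matchSize'` and not necessarily how the fix was implemented.

```haskell
import Data.Char (isDigit)

-- | Parse a CIGAR string such as "5M1I5M" into (length, operation) pairs.
-- Returns Nothing on malformed input.
-- Hypothetical helper for illustration, not taken from the project.
parseCigar :: String -> Maybe [(Int, Char)]
parseCigar "" = Just []
parseCigar s =
    case span isDigit s of
        (ds@(_:_), op:rest)
            | op `elem` "MIDNSHP=X" -> ((read ds, op) :) <$> parseCigar rest
        _ -> Nothing

-- | Operations that consume bases of the read (query): M, I, S, =, X.
consumesQuery :: Char -> Bool
consumesQuery = (`elem` "MIS=X")

-- | Operations that consume positions of the reference: M, D, N, =, X.
consumesReference :: Char -> Bool
consumesReference = (`elem` "MDN=X")

-- | Length the read sequence must have for the CIGAR to be valid.
queryLength :: [(Int, Char)] -> Int
queryLength ops = sum [n | (n, op) <- ops, consumesQuery op]

-- | Number of reference positions spanned by the alignment.
referenceLength :: [(Int, Char)] -> Int
referenceLength ops = sum [n | (n, op) <- ops, consumesReference op]
```

Loaded in GHCi, `queryLength <$> parseCigar "5M1I5M"` evaluates to `Just 11`, while `referenceLength <$> parseCigar "5M1I5M"` evaluates to `Just 10`. The second value matches the 10 reported above, which is what a calculation that skips `I` would produce.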

unode added a commit that referenced this issue Jun 7, 2019
luispedro added a commit that referenced this issue Jan 20, 2020
Many changes to the language and the internal code warrant a new
release:

Full ChangeLog
	* Fix CIGAR interpretation (#109) occurring when I is present
	* Call bwa mem so that it behaves in a deterministic way (independently of
	the number of threads used)
	* Add `include_fragments` option to orf_find
	* Add early check for column headers in `count()`
	* Add ``sense`` argument to `count()`
	* Add line numbers to FastQ parsing errors
	* Fix __extra_args argument in map()
	* Add `discard_singles` function
	* Add `interleaved` option to fastq()
	* `load_mocat_sample` now fails if `pair.2` exists but `pair.1` doesn't
	* Reintroduce zstd compression (after fixes upstream)
luispedro added a commit that referenced this issue Jan 22, 2020
luispedro added a commit that referenced this issue Feb 1, 2020