Couldn't anchor the Sanger trace in the selected reference genome. error in tracy but not in Indigo #79

Open
jacorvar opened this issue May 4, 2023 · 3 comments
@jacorvar

jacorvar commented May 4, 2023

Hi,

I'm experiencing an issue very similar to #34.

We've sequenced a ~250 nt fragment of the human MBL2 gene. However, the machine reads out almost 1000 nucleotides, so most of the sequence in the ab1 file is just noise.
For that reason I set -q 50 -u 750, so that the first 50 low-quality bases and the last 750 (spurious) bases are excluded.
I first tried the whole GATK GRCh38 genome and got the error "Couldn't anchor the Sanger trace in the selected reference genome." both in Indigo (with the left and right trim sizes set to 50 and 750, respectively) and with tracy on the command line, as follows:

tracy decompose -o forward -r Homo_sapiens_assembly38.fasta.gz -q 50 -u 750 MF-102_MBL2.ab1
[2023-May-04 12:10:37] tracy decompose -o forward -a homo_sapiens -r Homo_sapiens_assembly38.fasta.gz -q 50 -u 750 MF-102_2MBL2.ab1 
[2023-May-04 12:10:37] Load ab1 file
[2023-May-04 12:10:37] Find Reference Match
[2023-May-04 12:10:37] Load FM-Index
Couldn't anchor the Sanger trace in the selected reference genome.

As you pointed out here, that issue can be circumvented by using a shorter sequence as the reference file.

So I downloaded and indexed the FASTA file for the MBL2 gene and repeated the process with the same parameters. It now works well in Indigo, but tracy still fails with the same error message on the command line.

I'm using the tracy v0.7.5 Singularity container on CentOS 7.9.

@tobiasrausch
Member

Indigo just runs tracy in the backend:

tracy decompose -v -r MBL2.gene.slice.fa -q 50 -u 750 -p 0.3 MF-102_MBL2.ab1

@jacorvar
Author

jacorvar commented May 5, 2023

OK, that works now.

BTW, since I have to provide the gene sequence rather than the whole genome/chromosome sequence in order to anchor the trace, which strategy would you suggest for annotating the BCF with RS identifiers, given that its positions are relative to the gene and not to the genome?
I know I could do that using awk and some scripting, but I'd like to know if there's a more elegant and (possibly) better alternative.

Thanks a lot

@tobiasrausch
Member

I'm afraid we don't have a better solution for annotating with RS identifiers when a gene sequence is used as input.
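
For reference, the scripted approach mentioned in the question (shifting gene-relative positions back to genome coordinates before annotating) could look roughly like the sketch below. It is only an illustration under stated assumptions, not something tracy provides: the function name `lift_vcf_line`, the chromosome name and the slice start are all hypothetical, the input is assumed to be uncompressed VCF text (for example, the BCF converted with bcftools view), and the gene slice is assumed to have been cut from the forward strand of the reference. A reverse-complemented slice would need extra handling of both positions and alleles.

```python
def lift_vcf_line(line: str, genome_chrom: str, slice_start: int) -> str:
    """Shift one VCF record from slice-relative to genome coordinates.

    genome_chrom: chromosome name used in the genome-wide reference (assumption).
    slice_start:  1-based genome position of the first base of the gene slice
                  (assumption; must be supplied by the user).
    """
    if line.startswith("#"):
        # Header lines pass through unchanged in this sketch.
        return line
    fields = line.rstrip("\n").split("\t")
    fields[0] = genome_chrom                            # CHROM
    fields[1] = str(int(fields[1]) + slice_start - 1)   # POS, still 1-based
    return "\t".join(fields) + "\n"


if __name__ == "__main__":
    # Made-up record: position 10 within a slice that starts at genome position 1000.
    record = "MBL2\t10\t.\tA\tG\t.\tPASS\t.\n"
    print(lift_vcf_line(record, "chr10", 1000), end="")
```

Note that the sketch leaves the header untouched, so any ##contig line would still name the gene slice and would have to be replaced before the file can be matched against a genome-wide resource. Once the coordinates and contig names agree with the genome, a general-purpose tool such as bcftools annotate with a dbSNP VCF should be able to fill in the ID column.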

@tobiasrausch tobiasrausch self-assigned this May 12, 2023
@tobiasrausch tobiasrausch added the question Further information is requested label May 12, 2023