tb-profiler update_tbdb --match_ref fails #337

Closed
schorlton-bugseq opened this issue Mar 26, 2024 · 11 comments

@schorlton-bugseq

On v6.2.0, I run:
tb-profiler update_tbdb --match_ref myref.fna

I get:

ValueError: Command Failed:
/bin/bash -c set -o pipefail; tb-profiler create_db --prefix tbdb --csv mutations.csv --watchlist watchlist.csv --rules rules.txt --match_ref /test/tbdb/myref.fna --load

...

File "pysam/libcfaidx.pyx", line 121, in pysam.libcfaidx.FastaFile.__cinit__
  File "pysam/libcfaidx.pyx", line 153, in pysam.libcfaidx.FastaFile._open
OSError: file `/test/tbdb/myref.fna` not found

It looks like it's looking for the ref in the tbdb directory and not in the parent directory where it is actually located.

Thanks for your help and tool!

@jodyphelan
Owner

Hi @schorlton-bugseq

Ah, I think you found a bug there. Try adding the full path to your reference file and it should work. I'll patch this in the next release.
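
As a rough illustration of the workaround above, the relative path can be turned into an absolute one before it is passed to --match_ref, so that it still resolves after the tool switches into the tbdb directory. This is only a sketch: `resolve_reference` is a hypothetical helper name and is not part of tb-profiler.

```python
import os


def resolve_reference(ref_path: str) -> str:
    """Return an absolute path for ref_path, failing early if the file is missing.

    Hypothetical helper for illustration; not part of tb-profiler.
    """
    # Expand "~" and resolve against the current working directory,
    # so the path stays valid even if a later step changes directory.
    abs_path = os.path.abspath(os.path.expanduser(ref_path))
    if not os.path.isfile(abs_path):
        raise FileNotFoundError(f"Reference FASTA not found: {abs_path}")
    return abs_path
```

The returned string is what would be given to --match_ref in place of the bare file name.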

@WhalleyT

WhalleyT commented May 1, 2024

Hi! When I supply the full path, it does fix that particular error, but I get an error downstream: a KeyError raised when the process looks up the header of my reference and does not find it. For example, this FASTA:

>test
ATGGC

gives the error:

Traceback (most recent call last):
  File "/home/tom/micromamba/bin/tb-profiler", line 583, in <module>
    args.func(args)
  File "/home/tom/micromamba/bin/tb-profiler", line 242, in main_create_db
    pp.create_db(args,extra_files=extra_files)
  File "/home/tom/micromamba/lib/python3.10/site-packages/pathogenprofiler/db.py", line 505, in create_db
    write_bed(
  File "/home/tom/micromamba/lib/python3.10/site-packages/pathogenprofiler/db.py", line 120, in write_bed
    if genome_end > chrom_lengths[gene_info[gene].chrom]:
KeyError: 'test'

This can be rectified by renaming the header to match the original tbprofiler reference (>chromosome).
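
A minimal sketch of that header rename, assuming a single-sequence reference. `rename_single_contig` is a hypothetical helper, not something tb-profiler provides, and the target name is taken from the comment above; the exact spelling and capitalisation should be checked against the reference shipped in the tbdb directory.

```python
def rename_single_contig(fasta_in: str, fasta_out: str, new_name: str) -> None:
    """Copy a single-sequence FASTA, replacing its header with '>new_name'.

    Hypothetical helper for illustration; not part of tb-profiler.
    """
    n_headers = 0
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        for line in src:
            if line.startswith(">"):
                n_headers += 1
                if n_headers > 1:
                    # The rename only makes sense for a one-contig reference.
                    raise ValueError("expected exactly one sequence in the reference")
                dst.write(f">{new_name}\n")
            else:
                # Sequence lines are copied through unchanged.
                dst.write(line)
    if n_headers == 0:
        raise ValueError("no FASTA header found")
```

For the example above, calling it with new_name="chromosome" would turn the >test header into >chromosome and leave the sequence untouched.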

Thank you!

@jodyphelan
Owner

Just checking - are you using this reference genome: https://www.ncbi.nlm.nih.gov/nuccore/NC_000962.3?

@WhalleyT

WhalleyT commented May 1, 2024

Yes, it is. It also has the same number of bp as the original TBProfiler reference.

@vrennie

vrennie commented May 6, 2024

Hi @jodyphelan, I get the same KeyError when trying to use the --match_ref flag.

@jodyphelan
Owner

OK, it looks like the issue arises when tb-profiler update_tbdb is run first without --match_ref and then with it. Try removing the tbdb directory that was downloaded, then run tb-profiler update_tbdb --match_ref /path/to/ref.fa and see if that works.

@vrennie

vrennie commented May 7, 2024

Hi @jodyphelan, I gave this a try but I ran into the same error.

Thanks

jodyphelan added a commit to jodyphelan/pathogen-profiler that referenced this issue May 8, 2024

@WhalleyT

WhalleyT commented May 8, 2024

It still caused a (different) error when I ran it, but I was able to get it to work.

If my reference is in ~/reference.fa, I run tb-profiler update_tbdb --match_ref reference.fa --commit <tbdb_commit>, which then goes on to create ~/tbdb. I get an OSError: FileNotFound, so I mv reference.fa into tbdb and run it again, and it seems to work. If I remember correctly, this workaround didn't work when I tried it previously.

@jodyphelan
Owner

Oh yeah, it requires the full path to the reference file in the release version, but this is fixed in 1e4c872.

@WhalleyT

WhalleyT commented May 8, 2024

Sorry yes you're right, I tried that originally and forgot when I tried it with the new update. It seems to be working now. Thanks 😄

@jodyphelan
Owner

Great! I will close this now, but if there are any more related issues, feel free to reopen.
