slimming step issue #196

Closed
bishopia opened this issue Sep 15, 2022 · 15 comments

@bishopia

I'm having trouble getting the slimming step to work, so I'm not getting any assembly graph output. I get similar results whether I run GetOrganelle from a Singularity container or from a conda environment. stdout looks like this:

GetOrganelle v1.7.6.1

get_organelle_from_reads.py assembles organelle genomes from genome skimming data.
Find updates in https://github.com/Kinggerm/GetOrganelle and see README.md for more information.

Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:23:11)  [GCC 9.4.0]
PLATFORM: Linux node1640.oscar.ccv.brown.edu 3.10.0-1062.el7.x86_64 #1 SMP Thu Jul 18 20:25:13 UTC 2019 x86_64 x86_64
PYTHON LIBS: GetOrganelleLib 1.7.6.1; numpy 1.21.2; sympy 1.9; scipy 1.7.1
DEPENDENCIES: Bowtie2 2.4.4; SPAdes 3.15.3; Blast 2.12.0
GETORG_PATH=/users/ibishop/scratch/mitohifi/Aact_MW413902_ref/collect/getorganelle
LABEL DB: embplant_mt 0.0.1; embplant_pt 0.0.1
WORKING DIR: /gpfs/scratch/ibishop/mitohifi/Aact_MW413902_ref/collect/getorganelle
/gpfs/home/ibishop/data/ibishop/condas/getOrganelle/bin/get_organelle_from_reads.py -s Aact_MW413902.1.fasta -1 all_R1.fastq -2 all_R2.fastq -o mitome_v2 -F embplant_mt -t 20 --config-dir /users/ibishop/scratch/mitohifi/Aact_MW413902_ref/collect/getorganelle --continue

2022-09-15 00:03:52,937 - INFO: Extending ... skipped.

2022-09-15 00:03:52,939 - INFO: Separating extended fastq file ... skipped.

2022-09-15 00:03:52,940 - INFO: Assembling using SPAdes ... skipped.

2022-09-15 00:06:55,231 - ERROR: Slimming mitome_v2/extended_spades/K115/assembly_graph.fastg failed. Please check mitome_v2/extended_spades/K115/slim.log.txt for details. 
2022-09-15 00:06:55,258 - ERROR: No valid assembly graph found!

Total cost 192.27 s
Thank you!

and the slim log looks like this:

(miniconda3)[ibishop@login006 getorganelle]$ cat mitome_v2/extended_spades/K115/slim.log.txt
2022-09-14 22:39:36,362 - INFO: Slimming file 1/1: /mnt/mitome_v2/extended_spades/K115/assembly_graph.fastg
2022-09-14 22:41:02,462 - INFO: Parsing input finished.
2022-09-14 22:41:02,670 - INFO: Preparing fasta file finished.
2022-09-14 22:41:02,671 - INFO: Executing BLAST to /mnt/LabelDatabase/embplant_mt ...
2022-09-14 22:41:02,671 - INFO: Executing BLAST ...
2022-09-14 22:41:11,336 - INFO: Executing BLAST finished.
2022-09-14 22:41:11,336 - INFO: Executing BLAST to /mnt/LabelDatabase/embplant_mt finished.
2022-09-14 22:41:11,337 - INFO: Parsing blast result finished.
2022-09-14 22:41:11,337 - INFO: Executing BLAST to /mnt/LabelDatabase/embplant_pt ...
2022-09-14 22:41:11,337 - INFO: Executing BLAST ...
2022-09-14 22:41:20,011 - INFO: Executing BLAST finished.
2022-09-14 22:41:20,012 - INFO: Executing BLAST to /mnt/LabelDatabase/embplant_pt finished.
2022-09-14 22:41:20,012 - INFO: Parsing blast result finished.
2022-09-14 22:41:20,012 - INFO: No enough coverage information found.
2022-09-14 22:41:20,012 - INFO: Mapping names ...
2022-09-14 22:42:46,184 - INFO: Mapping names finished.
2022-09-14 22:42:46,185 - INFO: Generating slimmed file to /mnt/mitome_v2/extended_spades/K115/assembly_graph.fastg.extend-embplant_mt-embplant_pt.fastg
2022-09-14 22:42:46,185 - ERROR: 
'NoneType' object has no attribute 'write_fasta'
2022-09-14 22:42:46,185 - ERROR: Slimming file 1/1: /mnt/mitome_v2/extended_spades/K115/assembly_graph.fastg failed!

2022-09-14 23:00:08,919 - INFO: Slimming file 1/1: /mnt/mitome_v2/extended_spades/K115/assembly_graph.fastg
2022-09-14 23:01:36,166 - INFO: Parsing input finished.
2022-09-14 23:01:36,375 - INFO: Preparing fasta file finished.
2022-09-14 23:01:36,376 - INFO: Executing BLAST to /mnt/LabelDatabase/embplant_mt ...
2022-09-14 23:01:36,376 - INFO: Executing BLAST ...
2022-09-14 23:01:45,652 - INFO: Executing BLAST finished.
2022-09-14 23:01:45,652 - INFO: Executing BLAST to /mnt/LabelDatabase/embplant_mt finished.
2022-09-14 23:01:45,652 - INFO: Parsing blast result finished.
2022-09-14 23:01:45,652 - INFO: Executing BLAST to /mnt/LabelDatabase/embplant_pt ...
2022-09-14 23:01:45,653 - INFO: Executing BLAST ...
2022-09-14 23:01:55,518 - INFO: Executing BLAST finished.
2022-09-14 23:01:55,519 - INFO: Executing BLAST to /mnt/LabelDatabase/embplant_pt finished.
2022-09-14 23:01:55,519 - INFO: Parsing blast result finished.
2022-09-14 23:01:55,519 - INFO: No enough coverage information found.
2022-09-14 23:01:55,519 - INFO: Mapping names ...
2022-09-14 23:03:22,255 - INFO: Mapping names finished.
2022-09-14 23:03:22,255 - INFO: Generating slimmed file to /mnt/mitome_v2/extended_spades/K115/assembly_graph.fastg.extend-embplant_mt-embplant_pt.fastg
2022-09-14 23:03:22,255 - ERROR: 
'NoneType' object has no attribute 'write_fasta'
2022-09-14 23:03:22,255 - ERROR: Slimming file 1/1: /mnt/mitome_v2/extended_spades/K115/assembly_graph.fastg failed!

2022-09-15 00:03:58,386 - INFO: Slimming file 1/1: mitome_v2/extended_spades/K115/assembly_graph.fastg
2022-09-15 00:05:19,811 - INFO: Parsing input finished.
2022-09-15 00:05:19,997 - INFO: Preparing fasta file finished.
2022-09-15 00:05:19,997 - INFO: Executing BLAST to /users/ibishop/scratch/mitohifi/Aact_MW413902_ref/collect/getorganelle/LabelDatabase/embplant_mt ...
2022-09-15 00:05:19,998 - INFO: Executing BLAST ...
2022-09-15 00:05:27,700 - INFO: Executing BLAST finished.
2022-09-15 00:05:27,701 - INFO: Executing BLAST to /users/ibishop/scratch/mitohifi/Aact_MW413902_ref/collect/getorganelle/LabelDatabase/embplant_mt finished.
2022-09-15 00:05:27,701 - INFO: Parsing blast result finished.
2022-09-15 00:05:27,701 - INFO: Executing BLAST to /users/ibishop/scratch/mitohifi/Aact_MW413902_ref/collect/getorganelle/LabelDatabase/embplant_pt ...
2022-09-15 00:05:27,701 - INFO: Executing BLAST ...
2022-09-15 00:05:35,518 - INFO: Executing BLAST finished.
2022-09-15 00:05:35,518 - INFO: Executing BLAST to /users/ibishop/scratch/mitohifi/Aact_MW413902_ref/collect/getorganelle/LabelDatabase/embplant_pt finished.
2022-09-15 00:05:35,520 - INFO: Parsing blast result finished.
2022-09-15 00:05:35,520 - INFO: No enough coverage information found.
2022-09-15 00:05:35,520 - INFO: Mapping names ...
2022-09-15 00:06:55,082 - INFO: Mapping names finished.
2022-09-15 00:06:55,094 - INFO: Generating slimmed file to /gpfs/scratch/ibishop/mitohifi/Aact_MW413902_ref/collect/getorganelle/mitome_v2/extended_spades/K115/assembly_graph.fastg.extend-embplant_mt-embplant_pt.fastg
2022-09-15 00:06:55,094 - ERROR: 
'NoneType' object has no attribute 'write_fasta'
2022-09-15 00:06:55,094 - ERROR: Slimming file 1/1: mitome_v2/extended_spades/K115/assembly_graph.fastg failed!

2022-09-15 00:55:00,075 - INFO: Slimming file 1/1: mitome_v2/extended_spades/K115/assembly_graph.fastg
2022-09-15 00:56:26,784 - INFO: Parsing input finished.
2022-09-15 00:56:26,994 - INFO: Preparing fasta file finished.
2022-09-15 00:56:26,995 - INFO: Executing BLAST to /users/ibishop/scratch/mitohifi/Aact_MW413902_ref/collect/getorganelle/LabelDatabase/embplant_mt ...
2022-09-15 00:56:26,995 - INFO: Executing BLAST ...
2022-09-15 00:56:35,071 - INFO: Executing BLAST finished.
2022-09-15 00:56:35,071 - INFO: Executing BLAST to /users/ibishop/scratch/mitohifi/Aact_MW413902_ref/collect/getorganelle/LabelDatabase/embplant_mt finished.
2022-09-15 00:56:35,072 - INFO: Parsing blast result finished.
2022-09-15 00:56:35,072 - INFO: Executing BLAST to /users/ibishop/scratch/mitohifi/Aact_MW413902_ref/collect/getorganelle/LabelDatabase/embplant_pt ...
2022-09-15 00:56:35,072 - INFO: Executing BLAST ...
2022-09-15 00:56:43,210 - INFO: Executing BLAST finished.
2022-09-15 00:56:43,210 - INFO: Executing BLAST to /users/ibishop/scratch/mitohifi/Aact_MW413902_ref/collect/getorganelle/LabelDatabase/embplant_pt finished.
2022-09-15 00:56:43,210 - INFO: Parsing blast result finished.
2022-09-15 00:56:43,211 - INFO: No enough coverage information found.
2022-09-15 00:56:43,211 - INFO: Mapping names ...
2022-09-15 00:58:10,261 - INFO: Mapping names finished.
2022-09-15 00:58:10,262 - INFO: Generating slimmed file to /gpfs/scratch/ibishop/mitohifi/Aact_MW413902_ref/collect/getorganelle/mitome_v2/extended_spades/K115/assembly_graph.fastg.extend-embplant_mt-embplant_pt.fastg
2022-09-15 00:58:10,262 - ERROR: 
'NoneType' object has no attribute 'write_fasta'
2022-09-15 00:58:10,262 - ERROR: Slimming file 1/1: mitome_v2/extended_spades/K115/assembly_graph.fastg failed!

Any ideas why this might be happening? Thanks!
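
For anyone triaging a similar slim.log.txt, a minimal Python sketch like the one below pulls out each ERROR entry together with any message that continues on the following line, since the log above puts the exception text on its own line. The function name `collect_errors` and the assumed `YYYY-MM-DD HH:MM:SS,mmm - LEVEL:` entry prefix are illustrative, inferred from the excerpts in this thread rather than taken from GetOrganelle.

```python
import re
from typing import List

# Assumed entry prefix, inferred from the log excerpts in this thread.
ENTRY = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} - (\w+): ?(.*)$")


def collect_errors(log_text: str) -> List[str]:
    """Return ERROR messages, joining continuation lines onto their entry."""
    errors: List[str] = []
    in_error = False
    for line in log_text.splitlines():
        match = ENTRY.match(line)
        if match:
            level, message = match.groups()
            in_error = level == "ERROR"
            if in_error:
                errors.append(message.strip())
        elif in_error and line.strip():
            # Continuation line (e.g. the exception text) of the current ERROR entry.
            errors[-1] = (errors[-1] + " " + line.strip()).strip()
    return errors


sample = (
    "2022-09-14 22:42:46,185 - INFO: Generating slimmed file to out.fastg\n"
    "2022-09-14 22:42:46,185 - ERROR: \n"
    "'NoneType' object has no attribute 'write_fasta'\n"
    "2022-09-14 22:42:46,185 - ERROR: Slimming file 1/1: assembly_graph.fastg failed!\n"
)
for message in collect_errors(sample):
    print(message)
```

To use it on a real run, read the file first, for example `collect_errors(open("mitome_v2/extended_spades/K115/slim.log.txt").read())`.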

@Kinggerm
Owner

Kinggerm commented Sep 15, 2022

Could you please send the fastg file to jianjun.jin AT columbia.edu?
I want to take a detailed look at what happened.

@VaninaTonzo

Hi! I am having the same issue. Any update??

@Kinggerm
Owner

Kinggerm commented Nov 11, 2022

@bishopia Sorry for the slow reply.

The slimming step is failing abnormally (a bug triggered when slimming results in an empty graph; it will be fixed in an upcoming update), but the result is correct in a biological sense. The assembly_graph.fastg does not hit any sequence in the default embplant_mt database, so "No valid assembly graph found" is the expected outcome. I noticed that you are assembling the mitochondrial genome of an alga, which is outside the scope of GetOrganelle's default embryophyte mitochondrial database (embplant_mt); that is the reason for the lack of hits. Currently, we don't have the time or resources to build all related databases, even though they would be useful to many potential users. The most wanted ones may include non-embryophyte mt and animal nr (#136).
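
As an illustration of this class of bug only, here is a hypothetical Python sketch. None of the names below (`Graph`, `keep_hits`, `slim`) come from GetOrganelle's source; they are assumptions made for the example. A filtering step returns `None` when no contig hits the label database, and calling a method on that `None` produces the same kind of message seen in slim.log.txt unless the caller checks for it first.

```python
from typing import Dict, Optional, Set


class Graph:
    """Hypothetical stand-in for an assembly graph; not GetOrganelle's class."""

    def __init__(self, contigs: Dict[str, str]) -> None:
        self.contigs = contigs

    def write_fasta(self) -> str:
        return "".join(f">{name}\n{seq}\n" for name, seq in self.contigs.items())


def keep_hits(graph: Graph, hit_names: Set[str]) -> Optional[Graph]:
    """Keep only contigs with a database hit; None signals an empty result."""
    kept = {n: s for n, s in graph.contigs.items() if n in hit_names}
    return Graph(kept) if kept else None


def slim(graph: Graph, hit_names: Set[str]) -> str:
    slimmed = keep_hits(graph, hit_names)
    if slimmed is None:
        # Without this guard, slimmed.write_fasta() raises
        # AttributeError: 'NoneType' object has no attribute 'write_fasta'.
        raise ValueError("no contig hit the label database")
    return slimmed.write_fasta()
```

With the guard in place, an empty result surfaces as an explicit "no hit" condition instead of an attribute error.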

However, there is a workaround that does not rely on a default database: use a custom one. For your convenience, I have built a label database from MW413902.1.gb (the same GenBank record as your seed database). Please find it in the email. In case you want to create your own databases in the future, see: https://github.com/Kinggerm/GetOrganelle/wiki/FAQ#how-to-assemble-a-target-organelle-genome-using-my-own-reference.

To finish your run:

get_organelle_from_assembly.py -g /mnt/mitome_v2/extended_spades/K115/assembly_graph.fastg -F embplant_mt --genes Aact_MW413902.1.cds.fasta -o output_dir
# -F embplant_mt: we still use the embplant_mt mode because some of its parameter settings may be similar
# --genes Aact_MW413902.1.cds.fasta: the key to input a custom database

The result is still not circularized. You may use the current output together with Aact_MW413902.fasta as a new seed database and fine-tune the parameters in get_organelle_from_reads.py with a new run.

Please let me know your updates.

@Kinggerm
Owner

Kinggerm commented Nov 11, 2022

(Quoting @VaninaTonzo) "Hi! I am having the same issue. Any update??"

@VaninaTonzo Please provide the log files. Thanks!

@VaninaTonzo

Hi @Kinggerm, here I attach my log files. Thanks for your help!
spades.log
warnings.log
slim.log.txt
get_org.log.txt

@Kinggerm
Owner

@VaninaTonzo I'm afraid that this dataset may not contain enough target reads according to the log file.

However, as the basic recipe notes, if a run fails with the default database, rerun it with your own seed database (or the output of a first GetOrganelle run) and your own label database, passed with "-s" and "--genes". This is similar to my reply to bishopia above.

Please also let me know your updates.

@Kinggerm
Owner

The latest GetOrganelle update on GitHub fixes the bug that reported 'NoneType' object has no attribute 'write_fasta', but your result of No valid assembly graph will not change. @bishopia @VaninaTonzo You are welcome to check it out.

@RPIASRI

RPIASRI commented Dec 21, 2022

I am also getting an error at the slimming step; details are provided below. Please help me resolve it.

2022-12-20 15:39:14,265 - INFO: Pre-reading fastq ...
2022-12-20 15:39:14,265 - INFO: Estimating reads to use ... (to use all reads, set '--reduce-reads-for-coverage inf --max-reads inf')
2022-12-20 15:39:14,448 - INFO: Tasting 100000+100000 reads ...
2022-12-20 15:40:56,264 - INFO: Tasting 500000+500000 reads ...
2022-12-20 15:43:09,175 - INFO: Estimating reads to use finished.
2022-12-20 15:45:47,233 - INFO: Counting read qualities ...
2022-12-20 15:45:47,396 - INFO: Identified quality encoding format = Sanger
2022-12-20 15:45:47,396 - INFO: Phred offset = 33
2022-12-20 15:45:47,399 - INFO: Trimming bases with qualities (0.00%): 33..33 !
2022-12-20 15:45:47,457 - INFO: Mean error rate = 0.0021
2022-12-20 15:45:47,458 - INFO: Counting read lengths ...
2022-12-20 15:48:25,800 - INFO: Mean = 155.1 bp, maximum = 161 bp.
2022-12-20 15:48:25,800 - INFO: Reads used = 56598288+56394756
2022-12-20 15:48:25,800 - INFO: Pre-reading fastq finished.

2022-12-20 15:48:25,800 - INFO: Making seed reads ...
2022-12-20 15:48:25,801 - INFO: Seed bowtie2 index existed!
2022-12-20 15:48:25,801 - INFO: Mapping reads to seed bowtie2 index ...
2022-12-20 16:37:03,206 - INFO: Mapping finished.
2022-12-20 16:37:03,208 - INFO: Seed reads made: KC.mito/seed/animal_mt.initial.fq (3255713 bytes)
2022-12-20 16:37:03,209 - INFO: Making seed reads finished.

2022-12-20 16:37:03,209 - INFO: Checking seed reads and parameters ...
2022-12-20 16:37:03,209 - INFO: The automatically-estimated parameter(s) do not ensure the best choice(s).
2022-12-20 16:37:03,209 - INFO: If the result graph is not a circular organelle genome,
2022-12-20 16:37:03,209 - INFO: you could adjust the value(s) of '-w'/'-R' for another new run.
2022-12-20 16:37:06,653 - INFO: Pre-assembling mapped reads ...
2022-12-20 16:37:12,126 - ERROR: slimming the pre-assembled graph failed.
2022-12-20 16:37:12,146 - INFO: Pre-assembling mapped reads finished.
2022-12-20 16:37:12,146 - INFO: Estimated animal_mt-hitting base-coverage = 11.09
2022-12-20 16:37:12,751 - WARNING: Guessing that you are using too few data for assembling animal_mt!
2022-12-20 16:37:12,751 - WARNING: GetOrganelle is still trying ...
2022-12-20 16:37:12,752 - INFO: Estimated word size(s): 41
2022-12-20 16:37:12,752 - INFO: Setting '-w 41'
2022-12-20 16:37:12,752 - INFO: Setting '--max-extending-len inf'
2022-12-20 16:37:12,860 - INFO: Checking seed reads and parameters finished.

2022-12-20 16:37:12,860 - INFO: Making read index ...
2022-12-20 16:53:51,721 - INFO: 102894874 candidates in all 112993044 reads
2022-12-20 16:53:52,275 - INFO: Pre-grouping reads ...
2022-12-20 16:53:52,275 - INFO: Setting '--pre-w 41'
2022-12-20 16:54:04,695 - INFO: 200000/8491666 used/duplicated
2022-12-20 16:54:40,625 - INFO: 4414 groups made.
2022-12-20 16:54:55,411 - INFO: Making read index finished.

2022-12-20 16:54:55,411 - INFO: Extending ...
2022-12-20 16:54:55,411 - INFO: Adding initial words ...
2022-12-20 16:54:55,496 - INFO: AW 64842
2022-12-20 17:00:54,409 - INFO: Round 1: 16771723/102894874 AI 3941420 AW 201191540
2022-12-20 17:00:54,410 - INFO: Hit the words limit and terminated ...
2022-12-20 17:00:54,410 - WARNING: Terminated at an insufficient number of rounds, see '--max-n-words'/'--max-extending-len' for more.
2022-12-20 17:04:30,919 - INFO: Extending finished.

2022-12-20 17:04:39,480 - INFO: Separating extended fastq file ...
2022-12-20 17:04:54,296 - WARNING: No paired reads found?!
2022-12-20 17:04:54,737 - INFO: Setting '-k 21,55,85,115'
2022-12-20 17:04:54,737 - INFO: Assembling using SPAdes ...
2022-12-20 17:04:54,889 - INFO: spades.py -t 1 --phred-offset 33 --s1 KC.mito/extended_1_unpaired.fq -k 21,55,85,115 -o KC.mito/extended_spades
2022-12-20 21:45:07,019 - INFO: Assembling finished.

2022-12-20 21:45:42,991 - ERROR: Slimming KC.mito/extended_spades/K115/assembly_graph.fastg failed. Please check KC.mito/extended_spades/K115/slim.log.txt for details.
2022-12-20 21:45:44,412 - ERROR: Slimming KC.mito/extended_spades/K85/assembly_graph.fastg failed. Please check KC.mito/extended_spades/K85/slim.log.txt for details.
2022-12-20 21:45:45,838 - ERROR: Slimming KC.mito/extended_spades/K55/assembly_graph.fastg failed. Please check KC.mito/extended_spades/K55/slim.log.txt for details.
2022-12-20 21:45:45,839 - ERROR: No valid assembly graph found!

Total cost 22027.83 s

@Kinggerm
Owner

@RPIASRI
You should always provide the complete log file! There is no running environment information!

@RPIASRI

RPIASRI commented Dec 22, 2022

get_org.log.txt
Sir, kindly find the attached log file.

@RPIASRI

RPIASRI commented Jan 4, 2023

Any update on this?

@Kinggerm
Owner

Kinggerm commented Jan 4, 2023

Thank you for the log file. The issue looks similar to the cases reported above. Please try the solutions I mentioned above. @RPIASRI

@wangzh-github

@Kinggerm I'm having trouble getting the slimming step to work. Could you please help me with this problem?
Here are the log files. I cannot find a slim.log.txt file like the ones above, so I have also attached screenshots of the server's terminal output (Figures 1-3).
spades.log
warnings.log
get_org.log.txt
[Screenshots: Figures 1-3]

@JianjunJin
Collaborator

JianjunJin commented Dec 4, 2023

@wangzh-github The slim.log.txt should be available at plastome_output/extended_spades/K121. Please check it out.
My guess is that Blastn is failing (probably similar to #208). Reinstalling it may fix the issue.
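
Before reinstalling, a quick check like the following Python sketch can confirm whether blastn is on the PATH and actually starts. The helper name `tool_version` is hypothetical; the only external call is `blastn -version`, and this is a generic sanity check, not the check GetOrganelle itself performs.

```python
import shutil
import subprocess
from typing import Optional


def tool_version(executable: str = "blastn") -> Optional[str]:
    """Return the first line of `<executable> -version`, or None if unusable."""
    path = shutil.which(executable)
    if path is None:
        return None  # not on PATH in this environment
    try:
        result = subprocess.run(
            [path, "-version"],
            capture_output=True,
            text=True,
            timeout=30,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        return None  # found, but it crashed, timed out, or could not be executed
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print(tool_version() or "blastn is missing or fails to run")
```

Run it inside the same conda environment or container that GetOrganelle uses, since a working blastn elsewhere on the system says nothing about the one the pipeline calls.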

@wangzh-github

Thanks! I updated BLAST and the problem is solved!
