Novoplasty Completed Successfully but Did Not Produce Contigs in Output File #231

Open
meeranhussain opened this issue Jul 29, 2024 · 6 comments

Comments

@meeranhussain

I recently used NOVOPlasty to assemble the mitogenome from short-read data of Microctonus aethiopoides ecotypes. Although the run completed successfully, it did not produce any contigs. I had first assembled the mitogenomes of eight samples with Flye from Oxford Nanopore Technologies (ONT) long-read data and obtained circularized genomes of 29-32 kb, which is unusually large for insect mitogenomes. To validate these results, I ran NOVOPlasty on the short-read data from the same samples, expecting contigs that I could compare with the Flye assemblies, but it generated none. I also asked on Biostars (https://www.biostars.org/p/9599074/) about the large mitogenomes but did not get useful suggestions for validating them. I would appreciate any suggestions!


@ndierckx
Owner

ndierckx commented Aug 5, 2024

Insects can have a long repetitive control region, so those lengths are possible.
Can you send me the extended log file? It seems the assembly had already reached 25 kbp, so I am not sure what went wrong.
But since you seem to have a long repetitive region, it is best to rely on the Nanopore reads for an accurate length of that region.
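
One way to see whether the extra length sits in a single repetitive region, as suggested above, is to profile repeat density along the assembled contig. The following is only a minimal sketch: the function name `repeat_profile` and the values of `k`, `window` and `step` are assumptions chosen for illustration, not part of NOVOPlasty or Flye, and reverse-complement repeats are ignored.

```python
from collections import Counter


def repeat_profile(seq, k=15, window=1000, step=500):
    """Return a list of (window_start, fraction) tuples.

    fraction is the share of k-mers inside the window that occur more
    than once anywhere in the whole sequence (forward strand only).
    """
    seq = seq.upper()
    # Count every k-mer once over the full contig.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    profile = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        win = seq[start:start + window]
        n = len(win) - k + 1
        if n <= 0:
            break
        repeated = sum(1 for i in range(n) if counts[win[i:i + k]] > 1)
        profile.append((start, repeated / n))
    return profile
```

Windows with a fraction close to 1 that cluster in one place would be consistent with a long tandem-repeat region; high values spread over the entire contig would suggest something else, such as a duplicated genome.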

@meeranhussain
Author

Hi, thanks for your reply. This genome does appear to have long repetitive regions, but I am also concerned about potential misassemblies. I say this because I tested the same long-read mitogenome assembly method on Calliphora sp. ONT data, where the mitogenome is typically 15-16 kb. With Flye, this method produced a 32 kb circular contig, which raises concerns about misassemblies. Any suggestions you have would be helpful. I also tried NOVOPlasty with a smaller k-mer value, but it still did not produce a circular contig. I’ve attached the log file for your reference.
log_extended_Maethio_13 (1).txt
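
A 32 kb circular contig where 15-16 kb is expected is roughly twice the expected size, so one quick check is whether the contig is simply the genome written twice. The sketch below compares the k-mer content of the two halves of a contig. The name `half_similarity` and the choice of `k` are illustrative assumptions, and only the forward strand is considered.

```python
def kmer_set(seq, k):
    """Set of all k-mers on the forward strand of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def half_similarity(seq, k=15):
    """Jaccard similarity between the k-mer sets of the two contig halves.

    A circular genome assembled twice over has identical halves no matter
    where the contig was cut, so the value approaches 1.0. A genome with
    one localized repeat region should score much lower.
    """
    mid = len(seq) // 2
    first, second = kmer_set(seq[:mid], k), kmer_set(seq[mid:], k)
    if not first or not second:
        return 0.0
    return len(first & second) / len(first | second)
```

With noisy ONT assemblies the value will not reach exactly 1.0, so any cutoff is a judgment call; a dot plot of the contig against itself would be a sensible follow-up either way.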

@ndierckx
Owner

ndierckx commented Aug 7, 2024

At least this assembly outputted the assembled sequence, but it is probably not possible to accurately assemble the complete genome with just short reads. Do you also have long reads for this sample?

@meeranhussain
Author

Yes, I did try assembling with the ONT reads, but that gave a long 32 kb contig with a lot of repeats in the control region, which I think is due to misassembly.

@ndierckx
Owner

ndierckx commented Aug 7, 2024

If you have short and long reads from the species (preferably from the same sample), it should be easy to assemble. I do have an unpublished hybrid assembler that I used for another user before: https://www.mdpi.com/1422-0067/24/4/3976
I can't share the code yet, but maybe I could run it for you.

I have a new long-read assembler that I just put online, which works much better than Flye. I will create a Docker image in the future because Perl modules can be annoying to install on a cluster, but it needs hardly any memory, so maybe you can run it on your desktop or laptop: https://github.com/ndierckx/NOVOLoci

@meeranhussain
Author

Thanks, that's so nice of you, but I would first like to try your long-read assembler (NOVOLoci). If that still doesn't work, I will come back to you about the hybrid method.


2 participants