Parallel module exits with error, if the data are not in the same directory as the script #141

Closed
vpuller opened this issue Dec 3, 2020 · 2 comments

vpuller commented Dec 3, 2020

The parallel module exits with an error if the data are not in the same directory as the script. Running the parallel.ngl test fails when the data is moved into a subfolder (sample.fq -> sample/sample.fq). NGLess seems to be unable to move an existing file:

Exiting after fatal error:
An unhandled erorr occurred (this should not happen)!

	If you can reproduce this issue, please run your script
	with the --trace flag and report a bug (including the script and the trace) at
		https://github.com/ngless-toolkit/ngless/issues

The error message was: `/home/puller/ngless_vadim/tmp/partial.compress.tsv18674-5.gz: renameFile:renamePath:rename: does not exist (No such file or directory)`
[Thu 03-12-2020 10:02:11]: # Configuration
[Thu 03-12-2020 10:02:11]: 	download base URL: https://ngless.embl.de/resources/
[Thu 03-12-2020 10:02:11]: 	global data directory: /home/puller/ngless_vadim/
[Thu 03-12-2020 10:02:11]: 	user directory: /home/puller/ngless_vadim/user-data/
[Thu 03-12-2020 10:02:11]: 	user data directory: /home/puller/.local/share/ngless/data
[Thu 03-12-2020 10:02:11]: 	temporary directory: /home/puller/ngless_vadim/tmp/
[Thu 03-12-2020 10:02:11]: 	keep temporary files: False
[Thu 03-12-2020 10:02:11]: 	create report: True
[Thu 03-12-2020 10:02:11]: 	report directory: parallel.ngl.output_ngless
[Thu 03-12-2020 10:02:11]: 	color setting: AutoColor
[Thu 03-12-2020 10:02:11]: 	print header: True
[Thu 03-12-2020 10:02:11]: 	subsample: False
[Thu 03-12-2020 10:02:11]: 	verbosity: Normal
[Thu 03-12-2020 10:02:11]: 	search path:
[Thu 03-12-2020 10:02:11]: 		References=/data/ref/
[Thu 03-12-2020 10:02:11]: Loading modules...
[Thu 03-12-2020 10:02:11]: Validating script...
[Thu 03-12-2020 10:02:11]: Looking for file 'input.txt' (search path is ["References=/data/ref/"])
[Thu 03-12-2020 10:02:11]: Looking for file (input.txt) in input.txt
[Thu 03-12-2020 10:02:11]: Looking for file 'ref.fna' (search path is ["References=/data/ref/"])
[Thu 03-12-2020 10:02:11]: Looking for file (ref.fna) in ref.fna
[Thu 03-12-2020 10:02:11]: Writing to file 'output.tsv' will overwrite existing file.
[Thu 03-12-2020 10:02:11]: Writing to file 'compressed.tsv.gz' will overwrite existing file.
[Thu 03-12-2020 10:02:11]: Transforming script...
NGLess v1.2.0 (C) NGLess authors
https://ngless.embl.de/

When publishing results from this script, please cite the following references:

	 - Coelho, L.P., Alves, R., Monteiro, P., Huerta-Cepas, J., Freitas, A.T., and Bork, P.,
	 NG-meta-profiler: fast processing of metagenomes using NGLess, a domain-specific language. in
	 Microbiome 7:84 (2019). DOI: http://doi.org/10.1186/s40168-019-0684-8

	 - Li, H., 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv
	 preprint arXiv:1303.3997.


[Thu 03-12-2020 10:02:11]: Script OK. Starting interpretation...
[Thu 03-12-2020 10:02:11] Line 8: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 8: Interpreting [interpretIO]: __check_count(__VOID; original_lno=8; features=["seqname"])
[Thu 03-12-2020 10:02:11] Line 17: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 17: Interpreting [interpretIO]: __check_count(__VOID; original_lno=17; features=["seqname"])
[Thu 03-12-2020 10:02:11] Line 4: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 4: Interpreting [interpretIO]: allsamples = readlines("input.txt")
[Thu 03-12-2020 10:02:11] Line 4: Interpreting [assignment]: readlines("input.txt")
[Thu 03-12-2020 10:02:11] Line 4: Interpreting [executing module function: 'readlines']: NGOString "input.txt"
[Thu 03-12-2020 10:02:11] Line 5: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 5: Interpreting [interpretIO]: sample = lock1(Lookup 'allsamples' as NGList NGLString; __hash="29a4de8241eb49022b39cf4f32f0ee8c")
[Thu 03-12-2020 10:02:11] Line 5: Interpreting [assignment]: lock1(Lookup 'allsamples' as NGList NGLString; __hash="29a4de8241eb49022b39cf4f32f0ee8c")
[Thu 03-12-2020 10:02:11] Line 5: Interpreting [executing module function: 'lock1']: NGOList [NGOString "sample/sample.fq"]
[Thu 03-12-2020 10:02:11] Line 5: Looking for a lock in ngless-locks/29a4de82. Total number of elements is 1 (not locked: 1; not finished: 1).
[Thu 03-12-2020 10:02:11] Line 5: Acquired lock file ngless-locks/29a4de82/sample_sample.fq.lock
[Thu 03-12-2020 10:02:11] Line 5: lock1: Obtained lock file: 'ngless-locks/29a4de82/sample_sample.fq.lock'
[Thu 03-12-2020 10:02:11] Line 5: Writing stats to 'ngless-stats/29a4de82/sample_sample.fq'
[Thu 03-12-2020 10:02:11] Line 5: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 5: Interpreting [interpretIO]: __check_ifile(Lookup 'sample' as NGLString; original_lno=6)
[Thu 03-12-2020 10:02:11] Line 5: Interpreting [executing module function: '__check_ifile']: NGOString "sample/sample.fq"
[Thu 03-12-2020 10:02:11] Line 6: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 6: Interpreting [interpretIO]: __check_ifile(Lookup 'sample' as NGLString; original_lno=6)
[Thu 03-12-2020 10:02:11] Line 6: Interpreting [executing module function: '__check_ifile']: NGOString "sample/sample.fq"
[Thu 03-12-2020 10:02:11] Line 6: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 6: Interpreting [interpretIO]: input = fastq(Lookup 'sample' as NGLString)
[Thu 03-12-2020 10:02:11] Line 6: Interpreting [assignment]: fastq(Lookup 'sample' as NGLString)
[Thu 03-12-2020 10:02:11] Line 6: Simple Statistics completed for: sample/sample.fq
[Thu 03-12-2020 10:02:11] Line 6: Number of base pairs: 584
[Thu 03-12-2020 10:02:11] Line 6: Encoding is: SangerEncoding
[Thu 03-12-2020 10:02:11] Line 6: Number of sequences: 2772
[Thu 03-12-2020 10:02:11] Line 7: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 7: Interpreting [interpretIO]: mapped = map(Lookup 'input' as NGLReadSet; fafile="ref.fna")
[Thu 03-12-2020 10:02:11] Line 7: Interpreting [assignment]: map(Lookup 'input' as NGLReadSet; fafile="ref.fna")
[Thu 03-12-2020 10:02:11] Line 7: Looking for file 'ref.fna' (search path is ["References=/data/ref/"])
[Thu 03-12-2020 10:02:11] Line 7: Looking for file (ref.fna) in ref.fna
[Thu 03-12-2020 10:02:11] Line 7: Index for ref.fna already exists.
[Thu 03-12-2020 10:02:11] Line 7: Created & opened temporary file /home/puller/ngless_vadim/tmp/mapped_ref.sam18674-2.zstd
[Thu 03-12-2020 10:02:11] Line 7: Starting mapping to ref.fna
[Thu 03-12-2020 10:02:11] Line 7: Will run process /opt/miniconda3/bin/../share/ngless/bin/ngless-1.2.0-bwa mem -t 1 -K 100000000 ref-bwa-0.7.17.fna -p -
[Thu 03-12-2020 10:02:12] Line 7: Stderr: [M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2772 sequences (643849 bp)...
[M::process] 2772 single-end sequences; 0 paired-end sequences
[M::mem_process_seqs] Processed 2772 reads in 0.301 CPU sec, 0.301 real sec
[main] Version: 0.7.17-r1188
[main] CMD: /opt/miniconda3/bin/../share/ngless/bin/ngless-1.2.0-bwa mem -t 1 -K 100000000 -p ref-bwa-0.7.17.fna -
[main] Real time: 0.343 sec; CPU: 0.321 sec

[Thu 03-12-2020 10:02:12] Line 7: Success
[Thu 03-12-2020 10:02:12] Line 7: Mapped readset stats (ref.fna):
[Thu 03-12-2020 10:02:12] Line 7: Total reads: 2772
[Thu 03-12-2020 10:02:12] Line 7: Total reads aligned: 551 [19.88%]
[Thu 03-12-2020 10:02:12] Line 7: Total reads Unique map: 545 [19.66%]
[Thu 03-12-2020 10:02:12] Line 7: Total reads Non-Unique map: 6 [0.22%]
[Thu 03-12-2020 10:02:12] Line 8: Running garbage collection.
[Thu 03-12-2020 10:02:12] Line 8: Interpreting [interpretIO]: counts = count(Lookup 'mapped' as NGLMappedReadSet; features=["seqname"])
[Thu 03-12-2020 10:02:12] Line 8: Interpreting [assignment]: count(Lookup 'mapped' as NGLMappedReadSet; features=["seqname"])
[Thu 03-12-2020 10:02:12] Line 8: Starting count...
[Thu 03-12-2020 10:02:12] Line 8: Loaded headers. Starting parsing/distribution.
[Thu 03-12-2020 10:02:12] Line 8: Counts (second pass)...
[Thu 03-12-2020 10:02:12] Line 8: Created & opened temporary file /home/puller/ngless_vadim/tmp/counts.mapped_ref18674-3.txt
[Thu 03-12-2020 10:02:12] Line 10: Running garbage collection.
[Thu 03-12-2020 10:02:12] Line 10: Interpreting [interpretIO]: collect(Lookup 'counts' as NGLCounts; __hash="8a1f8bb9f282748d0d48635fa5209289"; current=Lookup 'sample' as NGLString; allneeded=Lookup 'allsamples' as NGList NGLString; ofile="output.tsv"; auto_comments=[{script}])
[Thu 03-12-2020 10:02:12] Line 10: Interpreting [executing module function: 'collect']: NGOCounts File /home/puller/ngless_vadim/tmp/counts.mapped_ref18674-3.txt
[Thu 03-12-2020 10:02:12] Line 10: Created & opened temporary file /home/puller/ngless_vadim/tmp/partial.compress.tsv18674-5.gz
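One plausible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the rename of the temporary `partial...gz` file fails because its destination name is derived from the sample string `sample/sample.fq`, which points into a subdirectory that does not exist. The Python sketch below reproduces that failure mode in isolation. It is not NGLess code; the directory layout and the names `flatten_sample_name`, `move_partial`, and `demo` are hypothetical. The flattened form `sample_sample.fq` mirrors the lock file name shown in the log.

```python
import os
import tempfile


def flatten_sample_name(sample):
    """Replace path separators so the sample name is a single path component."""
    return sample.replace("/", "_")


def move_partial(tmp_file, dest_dir, sample, flatten):
    """Rename tmp_file into dest_dir under a name derived from the sample.

    Hypothetical helper: the real destination used by NGLess is not shown
    in the report.
    """
    name = flatten_sample_name(sample) if flatten else sample
    dest = os.path.join(dest_dir, name + ".gz")
    os.rename(tmp_file, dest)
    return dest


def demo():
    """Return (rename_failed_with_raw_name, basename_used_after_flattening)."""
    with tempfile.TemporaryDirectory() as root:
        dest_dir = os.path.join(root, "partials")
        os.mkdir(dest_dir)
        tmp_file = os.path.join(root, "partial.gz")
        with open(tmp_file, "wb") as fh:
            fh.write(b"data")

        # Raw name: the destination is partials/sample/sample.fq.gz, but the
        # directory partials/sample/ was never created, so the rename fails.
        try:
            move_partial(tmp_file, dest_dir, "sample/sample.fq", flatten=False)
            failed = False
        except FileNotFoundError:
            failed = True

        # Flattened name: the destination is partials/sample_sample.fq.gz,
        # which sits directly inside an existing directory.
        dest = move_partial(tmp_file, dest_dir, "sample/sample.fq", flatten=True)
        return failed, os.path.basename(dest)


if __name__ == "__main__":
    print(demo())
```

Creating the missing parent directory before the rename would be an equally valid way to avoid the error; the report does not say which approach NGLess takes.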
vpuller changed the title from "Parallel model exits with error, if the data are not in the same directory as the script" to "Parallel module exits with error, if the data are not in the same directory as the script" on Dec 3, 2020
luispedro (Member) commented

Thanks for the report. I can confirm that I see it, and I will add a test case capturing it to the test suite.

luispedro added a commit that referenced this issue Dec 4, 2020
This tests #141. From the original report:

> Parallel module exits with error, if the data are not in the same
> directory as the script. Running the parallel.ngl test exits with
> error, if data is moved into a subfolder.
luispedro (Member) commented

Thanks for the bug report. This is fixed in the development version and will be included in the next release. In the meantime, I hope you can find a quick workaround using symlinks or similar.
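A minimal sketch of the symlink workaround, assuming the sample paths are given relative to the script directory and that flat names in `input.txt` are enough to avoid the error. The helper name `link_samples_flat` is hypothetical and not part of NGLess.

```python
import os


def link_samples_flat(sample_paths, script_dir="."):
    """Create a symlink in script_dir for each sample stored in a subfolder.

    sample_paths are interpreted relative to script_dir (absolute paths also
    work). Returns the flat names, suitable for listing in input.txt.
    Note: samples in different subfolders that share a basename would
    collide; an existing link is left untouched.
    """
    flat_names = []
    for path in sample_paths:
        flat = os.path.basename(path)
        link = os.path.join(script_dir, flat)
        if not os.path.lexists(link):
            # A relative symlink target is resolved from the link's own
            # directory, which here is script_dir.
            os.symlink(path, link)
        flat_names.append(flat)
    return flat_names


if __name__ == "__main__":
    names = link_samples_flat(["sample/sample.fq"])
    with open("input.txt", "w") as fh:
        fh.write("\n".join(names) + "\n")
```

With the layout from the report, this leaves `sample.fq` next to the script as a link to `sample/sample.fq`, so `input.txt` can list `sample.fq` without a subpath.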

luispedro added a commit that referenced this issue Jan 25, 2021
An accumulation of improvements rather than a big new feature is what
triggers the new release.

Full `ChangeLog`:

- Better error message if the user attempts to use the non-existent <\> operator (suggest </>)
- Add min-ngless-version field for modules
- Add early check that block assignments are always to block variables
- Use ZStd compression for temporary files from preprocess()
- Correctly handle subpaths in samples for collect (fixes #141)
- Add to_string() to int and double types (partially fixes #78 & fixes #81)
- Add read_int() and read_double() functions (fixes #78)
luispedro added a commit that referenced this issue Jan 26, 2021
An accumulation of improvements rather than a big new feature is what
triggers the new release.

Full `ChangeLog`:

- Validate count() headers on --validate-only
- Better error message if the user attempts to use the non-existent <\> operator (suggest </>)
- Add min-ngless-version field for modules
- Add early check that block assignments are always to block variables
- Use ZStd compression for temporary files from preprocess()
- Correctly handle subpaths in samples for collect (fixes #141)
- Add to_string() to int and double types (partially fixes #78 & fixes #81)
- Add read_int() and read_double() functions (fixes #78)