You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
On a macOS test box (Mojave, 10.14.6, running Python 3.8.1), when parallel data processing is enabled by -parallel n option (in which n > 0), most of the log messages generated by child processes disappeared from the main console.
For example, when running the test case in examples/gwas/gwas.yml, here is the output generated by cimr processor -process -yaml-file gwas.yml -chunksize 500 -parallel 4:
[2020-02-24 17:00:03 root:INFO ] processing .../cimr/examples/gwas/gwas.yml
[2020-02-24 17:00:03 root:INFO ] file exists in submitted_data/gwas/
[2020-02-24 17:00:03 root:INFO ] md5sum verified.
[2020-02-24 17:00:03 root:INFO ] following columns will be renamed:
[2020-02-24 17:00:03 root:INFO ] dict_keys(['panel_variant_id', 'chromosome', 'position', 'variant_id', 'effect_allele', 'non_effect_allele', 'effect_size', 'standard_error', 'zscore', 'pvalue', 'imputation_status'])
[2020-02-24 17:00:03 root:INFO ] making dir: processed_data/gwas/
[2020-02-24 17:00:03 root:INFO ] found submitted_data/gwas/gwas.txt.gz.
[2020-02-24 17:00:03 root:INFO ] loading submitted_data/gwas/gwas.txt.gz.
[2020-02-24 17:00:03 root:INFO ] chunksize: 0.0005 million
[2020-02-24 17:00:03 root:INFO ] number of parallel processes: 4
WARNING:root: chunk #2: chromosome id needs to be checked.
WARNING:root: chunk #1: chromosome id needs to be checked.
WARNING:root: chunk #3: chromosome id needs to be checked.
WARNING:root: chunk #4: chromosome id needs to be checked.
WARNING:root: chunk #5: chromosome id needs to be checked.
[2020-02-24 17:00:04 root:INFO ] combine chunk output files ...
[2020-02-24 17:00:04 root:INFO ] finished processing submitted_data/gwas/gwas.txt.gz.
[2020-02-24 17:00:04 root:INFO ] output has been saved as processed_data/gwas/gwas.txt.gz.
[2020-02-24 17:00:04 root:INFO ] ------------------------------------------------------------
[2020-02-24 17:00:04 root:INFO ] reading a local catalog file catalog.txt.
[2020-02-24 17:00:04 root:INFO ] context_id is not provided.
[2020-02-24 17:00:04 root:INFO ] updating catalog.txt for submitted_data/gwas/gwas.txt.gz.
[2020-02-24 17:00:04 root:INFO ] ------------------------------------------------------------
Only five lines of log messages from child processes were shown above:
WARNING:root: chunk #2: chromosome id needs to be checked.
WARNING:root: chunk #1: chromosome id needs to be checked.
WARNING:root: chunk #3: chromosome id needs to be checked.
WARNING:root: chunk #4: chromosome id needs to be checked.
WARNING:root: chunk #5: chromosome id needs to be checked.
But if we change the command to non-parallel mode: cimr processor -process -yaml-file gwas.yml -chunksize 500 -parallel 0
then the output log messages became:
[2020-02-24 17:02:40 root:INFO ] processing .../cimr/examples/gwas/gwas.yml
[2020-02-24 17:02:40 root:INFO ] file exists in submitted_data/gwas/
[2020-02-24 17:02:40 root:INFO ] md5sum verified.
[2020-02-24 17:02:40 root:INFO ] following columns will be renamed:
[2020-02-24 17:02:40 root:INFO ] dict_keys(['panel_variant_id', 'chromosome', 'position', 'variant_id', 'effect_allele', 'non_effect_allele', 'effect_size', 'standard_error', 'zscore', 'pvalue', 'imputation_status'])
[2020-02-24 17:02:40 root:INFO ] making dir: processed_data/gwas/
[2020-02-24 17:02:40 root:INFO ] found submitted_data/gwas/gwas.txt.gz.
[2020-02-24 17:02:40 root:INFO ] loading submitted_data/gwas/gwas.txt.gz.
[2020-02-24 17:02:40 root:INFO ] chunksize: 0.0005 million
[2020-02-24 17:02:40 root:INFO ] number of parallel processes: 0
[2020-02-24 17:02:40 root:INFO ] chunk #1: start processing ...
[2020-02-24 17:02:40 root:INFO ] chunk #1: renaming columns based on column dictionary.
[2020-02-24 17:02:40 root:INFO ] chunk #1: chromosome(s) included: ['chr1' 'chr10' 'chr11' 'chr12' 'chr13']
[2020-02-24 17:02:40 root:WARNING ] chunk #1: chromosome id needs to be checked.
[2020-02-24 17:02:40 root:INFO ] chunk #1: finished checking chromosome ids.
[2020-02-24 17:02:40 root:INFO ] chunk #1: variant_id has been standardized.
[2020-02-24 17:02:40 root:INFO ] variant counts before filtering palindromic variants: 500
[2020-02-24 17:02:40 root:INFO ] variant counts after filtering palindromic variants: 426
[2020-02-24 17:02:40 root:INFO ] data contains z-score column.
[2020-02-24 17:02:40 root:INFO ] pvalue is numeric.
[2020-02-24 17:02:40 root:INFO ] zscore is numeric.
[2020-02-24 17:02:40 root:INFO ] effect_size is numeric.
[2020-02-24 17:02:40 root:INFO ] standard_error is numeric.
[2020-02-24 17:02:40 root:INFO ] pvalue only contains values between 0 and 1.
[2020-02-24 17:02:40 root:INFO ] chunk #1: dropping duplicate columns.
[2020-02-24 17:02:40 root:INFO ] chunk #1: reordering processed data.
[2020-02-24 17:02:40 root:INFO ] chunk #1: writing processed data.
[2020-02-24 17:02:40 root:INFO ] chunk #1: ---------------------- DONE ----------------------
[2020-02-24 17:02:40 root:INFO ] chunk #2: start processing ...
[2020-02-24 17:02:40 root:INFO ] chunk #2: renaming columns based on column dictionary.
[2020-02-24 17:02:40 root:INFO ] chunk #2: chromosome(s) included: ['chr14' 'chr15' 'chr16' 'chr17' 'chr18']
[2020-02-24 17:02:40 root:WARNING ] chunk #2: chromosome id needs to be checked.
[2020-02-24 17:02:40 root:INFO ] chunk #2: finished checking chromosome ids.
[2020-02-24 17:02:40 root:INFO ] chunk #2: variant_id has been standardized.
[2020-02-24 17:02:40 root:INFO ] variant counts before filtering palindromic variants: 500
[2020-02-24 17:02:40 root:INFO ] variant counts after filtering palindromic variants: 434
[2020-02-24 17:02:40 root:INFO ] data contains z-score column.
[2020-02-24 17:02:40 root:INFO ] pvalue is numeric.
[2020-02-24 17:02:40 root:INFO ] zscore is numeric.
[2020-02-24 17:02:40 root:INFO ] effect_size is numeric.
[2020-02-24 17:02:40 root:INFO ] standard_error is numeric.
[2020-02-24 17:02:40 root:INFO ] pvalue only contains values between 0 and 1.
[2020-02-24 17:02:40 root:INFO ] chunk #2: dropping duplicate columns.
[2020-02-24 17:02:40 root:INFO ] chunk #2: reordering processed data.
[2020-02-24 17:02:40 root:INFO ] chunk #2: writing processed data.
[2020-02-24 17:02:40 root:INFO ] chunk #2: ---------------------- DONE ----------------------
[2020-02-24 17:02:40 root:INFO ] chunk #3: start processing ...
[2020-02-24 17:02:40 root:INFO ] chunk #3: renaming columns based on column dictionary.
[2020-02-24 17:02:40 root:INFO ] chunk #3: chromosome(s) included: ['chr19' 'chr2' 'chr20' 'chr21' 'chr22']
[2020-02-24 17:02:40 root:WARNING ] chunk #3: chromosome id needs to be checked.
[2020-02-24 17:02:40 root:INFO ] chunk #3: finished checking chromosome ids.
[2020-02-24 17:02:40 root:INFO ] chunk #3: variant_id has been standardized.
[2020-02-24 17:02:40 root:INFO ] variant counts before filtering palindromic variants: 500
[2020-02-24 17:02:40 root:INFO ] variant counts after filtering palindromic variants: 441
[2020-02-24 17:02:40 root:INFO ] data contains z-score column.
[2020-02-24 17:02:40 root:INFO ] pvalue is numeric.
[2020-02-24 17:02:40 root:INFO ] zscore is numeric.
[2020-02-24 17:02:40 root:INFO ] effect_size is numeric.
[2020-02-24 17:02:40 root:INFO ] standard_error is numeric.
[2020-02-24 17:02:40 root:INFO ] pvalue only contains values between 0 and 1.
[2020-02-24 17:02:40 root:INFO ] chunk #3: dropping duplicate columns.
[2020-02-24 17:02:40 root:INFO ] chunk #3: reordering processed data.
[2020-02-24 17:02:40 root:INFO ] chunk #3: writing processed data.
[2020-02-24 17:02:40 root:INFO ] chunk #3: ---------------------- DONE ----------------------
[2020-02-24 17:02:40 root:INFO ] chunk #4: start processing ...
[2020-02-24 17:02:40 root:INFO ] chunk #4: renaming columns based on column dictionary.
[2020-02-24 17:02:40 root:INFO ] chunk #4: chromosome(s) included: ['chr3' 'chr4' 'chr5' 'chr6' 'chr7']
[2020-02-24 17:02:40 root:WARNING ] chunk #4: chromosome id needs to be checked.
[2020-02-24 17:02:40 root:INFO ] chunk #4: finished checking chromosome ids.
[2020-02-24 17:02:40 root:INFO ] chunk #4: variant_id has been standardized.
[2020-02-24 17:02:40 root:INFO ] variant counts before filtering palindromic variants: 500
[2020-02-24 17:02:40 root:INFO ] variant counts after filtering palindromic variants: 428
[2020-02-24 17:02:40 root:INFO ] data contains z-score column.
[2020-02-24 17:02:40 root:INFO ] pvalue is numeric.
[2020-02-24 17:02:40 root:INFO ] zscore is numeric.
[2020-02-24 17:02:40 root:INFO ] effect_size is numeric.
[2020-02-24 17:02:40 root:INFO ] standard_error is numeric.
[2020-02-24 17:02:40 root:INFO ] pvalue only contains values between 0 and 1.
[2020-02-24 17:02:40 root:INFO ] chunk #4: dropping duplicate columns.
[2020-02-24 17:02:40 root:INFO ] chunk #4: reordering processed data.
[2020-02-24 17:02:40 root:INFO ] chunk #4: writing processed data.
[2020-02-24 17:02:40 root:INFO ] chunk #4: ---------------------- DONE ----------------------
[2020-02-24 17:02:40 root:INFO ] chunk #5: start processing ...
[2020-02-24 17:02:40 root:INFO ] chunk #5: renaming columns based on column dictionary.
[2020-02-24 17:02:40 root:INFO ] chunk #5: chromosome(s) included: ['chr8' 'chr9']
[2020-02-24 17:02:40 root:WARNING ] chunk #5: chromosome id needs to be checked.
[2020-02-24 17:02:40 root:INFO ] chunk #5: finished checking chromosome ids.
[2020-02-24 17:02:40 root:INFO ] chunk #5: variant_id has been standardized.
[2020-02-24 17:02:40 root:INFO ] variant counts before filtering palindromic variants: 200
[2020-02-24 17:02:40 root:INFO ] variant counts after filtering palindromic variants: 167
[2020-02-24 17:02:40 root:INFO ] data contains z-score column.
[2020-02-24 17:02:40 root:INFO ] pvalue is numeric.
[2020-02-24 17:02:40 root:INFO ] zscore is numeric.
[2020-02-24 17:02:40 root:INFO ] effect_size is numeric.
[2020-02-24 17:02:40 root:INFO ] standard_error is numeric.
[2020-02-24 17:02:40 root:INFO ] pvalue only contains values between 0 and 1.
[2020-02-24 17:02:40 root:INFO ] chunk #5: dropping duplicate columns.
[2020-02-24 17:02:40 root:INFO ] chunk #5: reordering processed data.
[2020-02-24 17:02:40 root:INFO ] chunk #5: writing processed data.
[2020-02-24 17:02:40 root:INFO ] chunk #5: ---------------------- DONE ----------------------
[2020-02-24 17:02:40 root:INFO ] finished processing submitted_data/gwas/gwas.txt.gz.
[2020-02-24 17:02:40 root:INFO ] output has been saved as processed_data/gwas/gwas.txt.gz.
[2020-02-24 17:02:40 root:INFO ] ------------------------------------------------------------
[2020-02-24 17:02:40 root:INFO ] reading a local catalog file catalog.txt.
[2020-02-24 17:02:40 root:INFO ] context_id is not provided.
[2020-02-24 17:02:40 root:INFO ] updating catalog.txt for submitted_data/gwas/gwas.txt.gz.
[2020-02-24 17:02:40 root:INFO ] ------------------------------------------------------------
dongbohu
changed the title
Log messages garbled on macOS in parallel mode
macOS in parallel mode: log messages in child processes swallowed
Feb 25, 2020
On a macOS test box (Mojave, 10.14.6, running Python 3.8.1), when parallel data processing is enabled by
-parallel n
option (in whichn
> 0), most of the log messages generated by child processes disappeared from the main console.For example, when running the test case in
examples/gwas/gwas.yml
, here is the output generated bycimr processor -process -yaml-file gwas.yml -chunksize 500 -parallel 4
:Only five lines of log messages from child processes were shown above:
But if we change the command to non-parallel mode:
cimr processor -process -yaml-file gwas.yml -chunksize 500 -parallel 0
then the output log messages became:
All lines from:
until:
were generated by child processes.
This issue doesn't happen on Linux boxes (Ubuntu and Debian, Python 3.6.x, 3,7,x and 3.8.1).
The text was updated successfully, but these errors were encountered: