Adding the uber_monitor.py script #8268
Included a unit test.
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## ah_var_store #8268 +/- ##
================================================
Coverage ? 86.097%
Complexity ? 35609
================================================
Files ? 2197
Lines ? 167119
Branches ? 18006
================================================
Hits ? 143884
Misses ? 16800
Partials ? 6435
…rMonitor # Conflicts: # .dockstore.yml # scripts/variantstore/wdl/GvsCreateFilterSet.wdl
Run uber_monitor in GvsCreateFilterSet.wdl on all tasks.
Int disk_size = if (defined(split_intervals_disk_size_override)) then select_first([split_intervals_disk_size_override]) else 10
Int disk_memory = if (defined(split_intervals_mem_override)) then select_first([split_intervals_mem_override]) else 16
Int disk_size = select_first([split_intervals_disk_size_override, 10])
Int disk_memory = select_first([split_intervals_mem_override, 16])
Minor fix - noticed by miniwdl
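For readers less familiar with WDL: the shorter form works because `select_first` returns the first defined element of its array, so placing the default last makes the explicit `defined(...)` check unnecessary. A rough Python analogue follows; the `select_first` helper and the two wrapper functions are hypothetical stand-ins written for illustration, not the WDL engine's implementation.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def select_first(values: Sequence[Optional[T]]) -> T:
    """Return the first value that is not None, mimicking WDL's select_first."""
    for value in values:
        if value is not None:
            return value
    raise ValueError("select_first requires at least one defined value")


def disk_size_verbose(override: Optional[int]) -> int:
    # Old form: explicit defined() check, then select_first on a single element.
    return select_first([override]) if override is not None else 10


def disk_size_concise(override: Optional[int]) -> int:
    # New form: the default sits at the end of the candidate list.
    return select_first([override, 10])
```

Both functions should agree for every input, which is why this reads as a cleanup rather than a behavior change.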
@@ -137,6 +145,15 @@ workflow JointVcfFiltering {
Array[File] indels_variant_scored_vcf_index = ScoreVariantAnnotationsINDELs.output_vcf_index
Array[File] snps_variant_scored_vcf = ScoreVariantAnnotationsSNPs.output_vcf
Array[File] snps_variant_scored_vcf_index = ScoreVariantAnnotationsSNPs.output_vcf_index
Array[File?] monitoring_logs = flatten(
Pondering whether the pattern should be that any given workflow also calls uber_monitor and outputs its summary file?
I don't think I understand what this is asking
Should I have JointVcfFiltering itself call summarize_task_monitor_logs and generate its own report, in addition to providing the outputs for the parent workflow to summarize the logs?
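Either option relies on the same data shape: each scattered task contributes a list of optional log files, and the workflow flattens them into one `Array[File?]`. A minimal Python sketch of that shape, assuming made-up log paths; `flatten` and `select_all` here are illustrative stand-ins for the WDL functions of the same names.

```python
from itertools import chain
from typing import List, Optional


def flatten(nested: List[List[Optional[str]]]) -> List[Optional[str]]:
    """Concatenate per-task log lists into one list, like WDL's flatten()."""
    return list(chain.from_iterable(nested))


def select_all(values: List[Optional[str]]) -> List[str]:
    """Drop undefined entries, like WDL's select_all()."""
    return [value for value in values if value is not None]


# Hypothetical paths: one inner list per task; None marks a shard with no log.
per_task_logs = [
    ["extract/shard-0/monitoring.log", None],
    ["score_snps/shard-0/monitoring.log"],
    [],
]

monitoring_logs = flatten(per_task_logs)         # still contains the None
logs_to_summarize = select_all(monitoring_logs)  # only real log paths remain
```

Whether the summary runs in the sub-workflow, the parent, or both, the summarizer would only need the filtered list.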
@@ -357,10 +358,44 @@ workflow GvsCreateFilterSet {
}
}

call Utils.UberMonitor as UberMonitorItAll {
This will output a summary file for all tasks - used or not.
Minor changes to globals requested. Also, both the uber script and its test could use some PEP 8 love, as I expect PEP 8 to be among the validations we'll run automatically in the new repo. 🙂 IntelliJ / PyCharm should provide PEP 8 warnings by default.
global MaxCpu
global MaxMem
global MaxMemPct
global MaxDisk
global MaxDiskPct
I don't think these need to be declared global here
Got rid of those
global MaxCpu
MaxCpu = -100.0
global MaxMem
MaxMem = -100.0
global MaxMemPct
MaxMemPct = -100.0
global MaxDisk
MaxDisk = -100.0
global MaxDiskPct
MaxDiskPct = -100.0
Couldn't the initializations happen where the variables are defined, and this whole block be deleted?
They need to be initialized here since they are used per monitoring log file.
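One possible way to keep the per-log reset while dropping the `global` statements is to hold the running maxima in a small object that is created fresh for each log. This is only a sketch under assumptions: the `Maxima` class, its field names, and the pre-parsed sample tuples are hypothetical and are not taken from the script; only the `-100.0` starting value comes from the diff.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Maxima:
    """Running maxima for a single monitoring log (field names are illustrative)."""
    cpu: float = -100.0
    mem: float = -100.0
    disk: float = -100.0

    def update(self, cpu: float, mem: float, disk: float) -> None:
        self.cpu = max(self.cpu, cpu)
        self.mem = max(self.mem, mem)
        self.disk = max(self.disk, disk)


def summarize_log(samples: Iterable[Tuple[float, float, float]]) -> Maxima:
    # A fresh Maxima per call gives the per-log reset without module-level state.
    maxima = Maxima()
    for cpu, mem, disk in samples:
        maxima.update(cpu, mem, disk)
    return maxima


# Two hypothetical logs, already parsed into (cpu, mem, disk) samples.
log_a = [(10.0, 1.5, 3.0), (85.0, 2.0, 3.5)]
log_b = [(5.0, 0.5, 1.0)]
results: List[Maxima] = [summarize_log(log) for log in (log_a, log_b)]
```

The second log's maxima are not contaminated by the first, which is the property the explicit re-initialization block is protecting.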
A nit, but I would suggest renaming "uber_monitor.py" and "test_uber_monitor.py" to better reflect what the script does, e.g. "collate_task_monitor_logs.py".
Added PEP 8 fixes, renamed script.
A few minor issues, perhaps best reviewed in mobbing.
@@ -398,12 +432,14 @@ task ExtractFilterTask {
}

String intervals_name = basename(intervals)

File monitoring_script = "gs://gvs_quickstart_storage/cromwell_monitoring_script.sh"
A bucket with quickstart in its name might not be the best place for a script that's going to be used for non-quickstart runs. Maybe gs://gvs_internal?
def parse_monitoring_log_file(mlog_file, output):
    eprint(f"Parsing: {mlog_file}")

    if (os.stat(mlog_file).st_size == 0):
Yay for the PEP 8 fixups, but I'm still seeing a lot of non-PEP 8 warnings in IntelliJ, e.g. on this line: "Remove redundant parentheses". If/when we go to our own repo there will likely be Python linting that will error on issues like this. Happy to review in mobbing to make sure we're seeing the same thing!
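For reference, the warning goes away once the condition is written without the outer parentheses. A small self-contained sketch; the `is_empty_file` helper and the temporary file names are hypothetical and exist only to exercise the check.

```python
import os
import tempfile


def is_empty_file(path: str) -> bool:
    # Python conditions need no surrounding parentheses.
    return os.stat(path).st_size == 0


# Create a throwaway empty file and a non-empty one to exercise the check.
with tempfile.TemporaryDirectory() as tmp_dir:
    empty_path = os.path.join(tmp_dir, "empty.log")
    full_path = os.path.join(tmp_dir, "full.log")
    open(empty_path, "w").close()
    with open(full_path, "w") as handle:
        handle.write("sample line\n")
    empty_result = is_empty_file(empty_path)
    full_result = is_empty_file(full_path)
```

In the script itself the equivalent change would presumably be `if os.stat(mlog_file).st_size == 0:`.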
scripts/variantstore/wdl/extract/test_summarize_task_monitor_logs.py
Okay, I think I've got most of it. Still want to move the monitoring script somewhere better.
There are still 17 (!) references to the script in its previous location. Is it possible to bring that number down?
Passing integration test here
Adding the uber_monitor.py script to GvsUtils.wdl
Threaded it into GvsCreateFilterSet
Included a unit test.