Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add -m flag to gsutil step; add dockstore branch filters to facilitate development #7104

Merged
merged 12 commits into from
Feb 25, 2021

Conversation

mmorgantaylor
Copy link
Member

This PR addresses spec-ops issue #235 - Use -m flag in final gsutil mv of files in ImportGenomes.

Additionally, this PR adds branch filters to the dockstore.yml file that will help with development. The filter for each workflow indicates which branch(es) will show up for that workflow in dockstore. If we don't include these filters, dockstore will run checks of ALL workflows on ALL branches, which causes timeouts. We could remove these filters later (before merging to master) or not, but for now this could help us develop on ah_var_store. Note that we'll need to add feature branches to that file as we work on them.

This workflow was tested in Terra and the upload succeeded.

Also confirmed that if one file fails, the entire process throws an error code (i.e. -m flag will not cause failures to silently pass) - in example below, test_file_list.txt was a list of 6 files, including 1 file that did not exist.

➜  cat test_file_list.txt | gsutil cp -I gs://dsp-fieldeng-dev/test_cp/
Copying file://test1.txt [Content-Type=text/plain]...
Copying file://test2.txt [Content-Type=text/plain]...
Copying file://test3.txt [Content-Type=text/plain]...
CommandException: No URLs matched: test4.txt
➜  cat test_file_list.txt | gsutil -m cp -I gs://dsp-fieldeng-dev/test_cp/
If you experience problems with multiprocessing on MacOS, they might be related to https://bugs.python.org/issue33725. You can disable multiprocessing by editing your .boto config or by adding the following flag to your command: `-o "GSUtil:parallel_process_count=1"`. Note that multithreading is still available even if you disable multiprocessing.

CommandException: No URLs matched: test4.txt
Copying file://test1.txt [Content-Type=text/plain]...
Copying file://test5.txt [Content-Type=text/plain]...
Copying file://test2.txt [Content-Type=text/plain]...
Copying file://test3.txt [Content-Type=text/plain]...
Copying file://test6.txt [Content-Type=text/plain]...
- [5/5 files][   37.0 B/   37.0 B] 100% Done
Operation completed over 5 objects/37.0 B.
CommandException: 1 file/object could not be transferred.

Copy link
Contributor

@kcibul kcibul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one comment to address, but I need to know more about the dockstore thing before having a opinion on that. maybe at standup today?

gsutil cp pet_*.tsv ~{output_directory}/pet_tsvs/
gsutil cp vet_*.tsv ~{output_directory}/vet_tsvs/

gsutil -m cp metadata_*.tsv ~{output_directory}/metadata_tsvs/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you ever determine the change in behavior when adding the -m w.r.t exit codes?

For example, trying gsutil cp file1.txt filethatdoesntexist.txt gs://destination vs the same thing with -m. We can likely handle either case, but we should know (and maybe put a comment here about it)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I addressed this in the PR comment! but I can add a comment to the code, sure

@kcibul
Copy link
Contributor

kcibul commented Feb 25, 2021 via email

@mmorgantaylor mmorgantaylor merged commit ac2f482 into ah_var_store Feb 25, 2021
@mmorgantaylor mmorgantaylor deleted the mmt_add_gsutil_m_IG branch February 25, 2021 18:41
kcibul pushed a commit that referenced this pull request Mar 9, 2021
…e development (#7104)

* add -m flag to gsutil cp, fix typo

* add to dockstore.yml

* add empty.json to dockstore.yml

* lowercase workflow name

* remove empty.json

* re-add and fix indents

* remove new wdl entirely from yml

* re-add to dockstore.yml file

* add real test perameter file

* add dockstore branch filters

* add comment to clarify error behavior of gsutil -m

* remove feature branch from dockstore.yml, remove gsutil comment
mmorgantaylor added a commit that referenced this pull request Apr 6, 2021
…e development (#7104)

* add -m flag to gsutil cp, fix typo

* add to dockstore.yml

* add empty.json to dockstore.yml

* lowercase workflow name

* remove empty.json

* re-add and fix indents

* remove new wdl entirely from yml

* re-add to dockstore.yml file

* add real test perameter file

* add dockstore branch filters

* add comment to clarify error behavior of gsutil -m

* remove feature branch from dockstore.yml, remove gsutil comment
This was referenced Mar 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants