WDLize GvsPrepareCallset (briefly known as CreateCohortTable) #7200

Merged
merged 30 commits into from
Apr 14, 2021

Conversation

mmorgantaylor
Member

@mmorgantaylor mmorgantaylor commented Apr 13, 2021

specops issue #273: https://github.com/broadinstitute/dsp-spec-ops/issues/273

  • renamed ngs_cohort_extract.py -> create_cohort_extract_data_table.py
  • run the script in a WDL (GvsPrepareCallset.wdl)
  • use a custom docker - include script for creating and pushing this docker to gcr.io
  • enable running as a service account (SA). This has been tested in Terra and works as expected. If the dataset requires SA access and the user does not provide a working SA key, they get this error: User does not have bigquery.jobs.create permission in project specops-variantstore-sa-tests.

.dockstore.yml Outdated
branches:
- master
- ah_var_store
- mmt_ngs_cohort_extract_wdl
Member Author

remove before merging

@@ -68,7 +72,8 @@ task CreateCohortTableTask {
  --destination_table ~{destination_cohort_table_name_final} \
  --fq_cohort_sample_names ~{fq_cohort_sample_table_final} \
  --query_project ~{query_project_final} \
- --fq_sample_mapping_table ~{fq_sample_mapping_table_final}
+ --fq_sample_mapping_table ~{fq_sample_mapping_table_final} \
+ $SA_ARGS
Contributor

nice

String? fq_cohort_sample_table
String? fq_sample_mapping_table

File? service_account_json
Contributor

is this still defined as a File even if it's just a path to a json in GCP?

Member Author

as Kris pointed out, since it's a file, it'll localize, so we can just pass it straight into python. tested!
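
A minimal sketch of what "pass it straight into python" could look like on the script side, assuming the localized key reaches the script as an optional local path. The flag name `--sa_key_path` and the helper names are hypothetical; the flag actually carried by `$SA_ARGS` is not shown in this thread. Only `--query_project` is taken from the diff above.

```python
import argparse
import json
import os
import tempfile


def parse_args(argv):
    """Parse a reduced set of flags; --sa_key_path is a hypothetical name."""
    parser = argparse.ArgumentParser(
        description="Sketch of optional service-account key handling")
    parser.add_argument("--query_project", required=True)
    parser.add_argument("--sa_key_path", default=None,
                        help="Local path to a service account key JSON (optional)")
    return parser.parse_args(argv)


def load_sa_key(path):
    """Return the parsed key JSON, or None so the caller uses default credentials."""
    if path is None:
        return None
    with open(path) as handle:
        return json.load(handle)


def demo():
    # Without a key, the script would run with the caller's default credentials.
    args = parse_args(["--query_project", "my-project"])
    no_key = load_sa_key(args.sa_key_path)

    # With a key: a WDL File? input is already localized, so the task can pass
    # an ordinary local path. A temp file stands in for the localized key here.
    with tempfile.TemporaryDirectory() as tmp:
        key_path = os.path.join(tmp, "sa_key.json")
        with open(key_path, "w") as handle:
            json.dump({"type": "service_account",
                       "client_email": "sa@example.com"}, handle)
        args = parse_args(["--query_project", "my-project",
                           "--sa_key_path", key_path])
        with_key = load_sa_key(args.sa_key_path)
    return no_key, with_key


if __name__ == "__main__":
    print(demo())
```

Because the key argument is optional, the same script can serve both users with direct access and users who need a service account. In a real script, the path would more likely be handed to a credentials loader than parsed by hand; that part is left out here since it depends on the installed Google client libraries.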


if [ ~{has_service_account_file} = 'true' ]; then
SA_FILENAME="sa_key.json"
gsutil cp "~{service_account_json}" $SA_FILENAME
Contributor

If this is declared as a File above, won't it already be copied down to the VM, so the path is already a local path?

@@ -0,0 +1,14 @@
if [ $# -lt 1 ]; then
echo "USAGE: ./build_docker.sh [DOCKER_TAG_STRING]"
Contributor

❤️

@mmorgantaylor mmorgantaylor changed the title WDLize CreateCohortTable WDLize GvsPrepareCallset (briefly known as CreateCohortTable) Apr 14, 2021
@mmorgantaylor mmorgantaylor merged commit 18d8477 into ah_var_store Apr 14, 2021
@mmorgantaylor mmorgantaylor deleted the mmt_ngs_cohort_extract_wdl branch April 14, 2021 18:46
This was referenced Mar 17, 2023