VS_1327 ensure sample ids are unique #8818

Merged
gbggrant merged 9 commits into ah_var_store from gg_VS-1327_EnsureSampleIdsAreUnique
May 8, 2024

@gbggrant gbggrant (Collaborator) commented May 7, 2024

This PR adds a task to GvsAssignIds to verify that there are no duplicate sample names in the file provided.

Here is an example run of BulkIngest that reproduces the originally reported problem: no sample set is provided, the sample id column is not named sample_id, and there's a duplicate in THAT column.
Here is an example run where the updated code reports the problem early-ish, before creating any database tables that would need to be cleaned up.
Here is a normal run that passes (same basic setup as the initial problem, except that I removed the duplicate row from the samples table).

Here is a passing integration test.
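
For readers who want to picture the check described above, here is a minimal sketch in Python. It is not the code added to GvsAssignIds.wdl in this PR; the function name `find_duplicate_sample_names`, the tab-separated layout, and the `research_id` column are all assumptions made for illustration. It only shows the general idea of scanning one named column for repeated values before any tables are created.

```python
import csv
import io
from collections import Counter


def find_duplicate_sample_names(tsv_text, column):
    """Return the values that occur more than once in `column`, sorted.

    Hypothetical helper for illustration only; not part of the GVS code.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"column {column!r} not found in header")
    # Count every value in the chosen column, then keep the repeated ones.
    counts = Counter(row[column] for row in reader)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical sample table whose name column is not called sample_id.
table = "research_id\tvcf\nNA1\ta.vcf\nNA2\tb.vcf\nNA1\tc.vcf\n"

duplicates = find_duplicate_sample_names(table, "research_id")
if duplicates:
    # A real workflow task would likely exit non-zero here to fail fast.
    print("Duplicate sample names found:", ", ".join(duplicates))
```

Failing at this point, before any ingest work starts, is what avoids the cleanup described in the second example run.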

@gbggrant gbggrant marked this pull request as ready for review May 8, 2024 01:46
@gbggrant gbggrant requested review from mcovarr and rsasch May 8, 2024 01:50
Comment on lines 6 to 7
# A comment is here.

Suggested change
# A comment is here.

@koncheto-broad koncheto-broad left a comment

LGTM. Glad we're catching it early before we do any BQ stuff

@gbggrant gbggrant merged commit df03bd7 into ah_var_store May 8, 2024
13 checks passed
@gbggrant gbggrant deleted the gg_VS-1327_EnsureSampleIdsAreUnique branch May 8, 2024 14:07
gbggrant added a commit that referenced this pull request May 8, 2024
* Have GvsAssignIds.wdl validate that input sample names (in the provided input file) are unique.
gbggrant added a commit that referenced this pull request May 8, 2024
* VS_1327 ensure sample ids are unique (#8818)
* Have GvsAssignIds.wdl validate that input sample names (in the provided input file) are unique.
RoriCremer pushed a commit that referenced this pull request Jun 10, 2024
* Have GvsAssignIds.wdl validate that input sample names (in the provided input file) are unique.
3 participants