Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: custom query row validation failing when SQL contains upper cased columns #994

Merged
merged 4 commits into from
Sep 20, 2023

Conversation

nj1973
Copy link
Contributor

@nj1973 nj1973 commented Sep 19, 2023

We found this issue on Oracle where column default to upper case but it could easily be reproduced on other engines by writing SQL with upper cased columns, the columns in the actual SQL engine were irrelevant, it was just the SQL we pass in.

The problem was only seen for custom query validation using --comparison-fields. When using --hash the calculated fields resulted in lower case end column names.

I added tests to protect from regressions for Oracle, SQL Server, Teradata vs BigQuery. The fix was global so should wortk on all engines but I was reluctant to add too many tests seeing as integration tests are already pretty slow.

The fix was very minor, in data_validation/clients.py I convert the Ibis schema to have lower case columns:

-    return client.sql(query)
+    iq = client.sql(query)
+    # Normalise all columns in the query to lower case.
+    # https://github.com/GoogleCloudPlatform/professional-services-data-validator/issues/992
+    iq = iq.relabel(dict(zip(iq.columns, [_.lower() for _ in iq.columns])))
+    return iq

…atch table equivalent which come back from Ibis in lower case
…ion to match table equivalent which come back from Ibis in lower case"

This reverts commit 963b370.
…quivalent which comes back from Ibis in lower case
@nj1973 nj1973 requested a review from a team as a code owner September 19, 2023 17:15
@nj1973
Copy link
Contributor Author

nj1973 commented Sep 19, 2023

/gcbrun

Copy link
Collaborator

@nehanene15 nehanene15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@nj1973 nj1973 merged commit a9fed41 into develop Sep 20, 2023
5 checks passed
@nj1973 nj1973 deleted the fix/custom-query-row-column-case branch September 20, 2023 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

custom query row validation has inconsistent column case to table validation
2 participants