fix: generate-table-partitions- fixes Issue 945 and Issue 950 #962

Conversation

sundar-mudupalli-work
Contributor

fix generate-table-partitions: Fixes Issue 945 and Issue 950
When generating partition filters, this change correctly escapes string literals that may contain a single quote (') and applies the filters provided on the data-validation command line.

Piyush Sarraf and others added 16 commits August 17, 2023 10:05
…rlier

Added functionality to support Kubernetes Indexed jobs - which when provided with a directory will only run the job corresponding to the index.
Tested in a non Kubernetes setup
…bis to turn table expressions into SQL statements.

This addresses bugs #945 and #950. Unfortunately, we depend on the version of sqlalchemy being 2.0 or later which has fixed
a problem with datetime being rendered by compile - see
https://docs.sqlalchemy.org/en/20/changelog/changelog_20.html#change-206ec1f2af3a0c93785758c723ba356f
…D and test cases. Need to check that everything works.
@sundar-mudupalli-work sundar-mudupalli-work requested a review from a team as a code owner August 30, 2023 04:50
@sundar-mudupalli-work
Contributor Author

/gcbrun

@sundar-mudupalli-work sundar-mudupalli-work changed the title Update generate-table-partition logic to work with filters and strings with quotes fix generate-table-partitions: Fixes Issue 945 and Issue 950 Aug 30, 2023
@sundar-mudupalli-work sundar-mudupalli-work changed the title fix generate-table-partitions: Fixes Issue 945 and Issue 950 fix: generate-table-partitions- fixes Issue 945 and Issue 950 Aug 30, 2023
@helensilva14 helensilva14 added the priority: p0 Highest priority. Critical issue. Will be fixed prior to next release. label Aug 31, 2023
```
The internal select statement adds the partition number to each row in the table and the external select statement gets the value of the primary keys for the first row.
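The two-level query described above can be tried out against an in-memory SQLite table. This is only a rough sketch under stated assumptions: the table `demo`, its single key column `id`, and the partition count are hypothetical, and `NTILE` is used here as a stand-in for however the project's own query assigns partition numbers. It needs a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

NUM_PARTITIONS = 3  # hypothetical partition count

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO demo (id, name) VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 11)],
)

# Inner select: tag every row with a partition number.
# Outer select: keep the primary key of the first row in each partition.
query = f"""
SELECT part_num, MIN(id) AS first_id
FROM (
    SELECT id, NTILE({int(NUM_PARTITIONS)}) OVER (ORDER BY id) AS part_num
    FROM demo
) AS numbered
GROUP BY part_num
ORDER BY part_num
"""
first_rows = conn.execute(query).fetchall()
print(first_rows)
```

`MIN(id)` only works because this sketch has a single key column; with a composite primary key the outer query would instead have to pick the first row per partition by position.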
### How to generate the where clauses
Once we have the first row of each partition, we have to generate the where clauses for each partition in the source and target tables. The best way may be to build the _ibis_ table expression from the provided filter clause plus the additional filter clause derived from the first rows we have calculated. We can then have _ibis_ `to_sql` convert the table expression into plain text, extract the where clause and use that. _ibis_ depends on _sqlalchemy_, and versions of _sqlalchemy_ prior to 2.0 have a bug: dates and timestamps cannot be rendered by `to_sql`. Until we migrate to _sqlalchemy_ 2.0, we may not be able to support date and timestamp columns as primary key columns.
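To make the target of this step concrete, here is a minimal sketch of what the per-partition where clauses could look like. It does not use _ibis_ or `to_sql`; it builds the predicates by hand for a single key column, purely to show how a user-supplied filter and the partition boundaries combine and why a `'` inside a string literal has to be doubled. The function names and the example column and filter are hypothetical.

```python
def quote_literal(value):
    """Render a Python value as a SQL literal (strings and numbers only)."""
    if isinstance(value, str):
        # A single quote inside a string literal is escaped by doubling it.
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def partition_filters(key, first_values, user_filter=None):
    """Build one where clause per partition.

    first_values holds the key value of the first row of each partition,
    in ascending order. Partition i covers [first_values[i], first_values[i+1]).
    """
    clauses = []
    for i, lower in enumerate(first_values):
        parts = []
        if user_filter:
            parts.append(f"({user_filter})")
        if i > 0:
            parts.append(f"{key} >= {quote_literal(lower)}")
        if i + 1 < len(first_values):
            parts.append(f"{key} < {quote_literal(first_values[i + 1])}")
        # With one partition and no filter there is nothing to restrict.
        clauses.append(" AND ".join(parts) if parts else "1 = 1")
    return clauses


filters = partition_filters("name", ["Adams", "O'Brien"], "active = 1")
for clause in filters:
    print(clause)
```

The first partition has no lower bound and the last has no upper bound, so rows outside the sampled first-row values still fall into some partition.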

nit: typo here "Until. we migrate to using sqlalchemy 2.0"

@sundar-mudupalli-work sundar-mudupalli-work merged commit c53f2fc into develop Aug 31, 2023
5 checks passed
@sundar-mudupalli-work sundar-mudupalli-work deleted the 950-filters-are-not-applied-before-generating-partitions branch August 31, 2023 14:31