Determine the method for generating batches efficiently #619
Labels
priority: p1
High priority. Fix may be included in the next release.
Comments
TODO:
Later:
mohdt786 added a commit that referenced this issue on Jan 25, 2023
…653) (Issue #619, #662)

Features:
1. New command 'generate-table-partitions' added to generate partitions for `row` type validation.
2. --partition-num: Number of partitions/config files to create. Range=[1,1000]. If the specified value is greater than count(*), the value is coalesced to count(*).
3. --config-dir: Directory path to store YAML config files. Either a local or a GCS path can be supplied.
5. Added a required arguments group to distinguish required from optional arguments.
6. Added a mutually exclusive arguments group for --hash and --concat.
7. --partition-key: Column on which the partitions are generated. Column type must be integer. Defaults to the primary key.

Tests:
Added unit tests for partition_builder.py, which provide coverage for partition_row_builder.py.

README.md & examples.md:
1. Added a description of the usage of the 'generate-table-partitions' command.
2. Added examples.
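The --partition-num rule described in the commit message (range [1,1000], coalesced to count(*) when larger) can be sketched roughly as follows. This is only an illustration of the stated behavior, not the code from the commit; the function name `effective_partition_count` and the choice to raise `ValueError` are assumptions.

```python
def effective_partition_count(requested: int, row_count: int) -> int:
    """Return the number of partitions to create.

    Hypothetical helper: mirrors the rule stated in the commit message,
    where the requested value must lie in [1, 1000] and is reduced to the
    table's row count when it exceeds it.
    """
    if not 1 <= requested <= 1000:
        raise ValueError("partition count must be in the range [1, 1000]")
    if row_count < 1:
        raise ValueError("table must contain at least one row")
    # Never create more partitions than there are rows.
    return min(requested, row_count)
```

For example, requesting 50 partitions on a 20-row table would yield 20 partitions under this sketch.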
Sub-issue of issue #598
The fastest solution will be to use a MIN/MAX split of a numeric primary key and compute the batches external to the database in order to generate config files for efficient partitioned execution.
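A minimal sketch of the MIN/MAX split, assuming the minimum and maximum key values have already been fetched from the database (for example with a single `SELECT MIN(id), MAX(id)` query). The function names, the column name `id`, and the filter string format are illustrative assumptions, not the project's actual implementation.

```python
from typing import List, Tuple


def min_max_partitions(min_id: int, max_id: int, num_partitions: int) -> List[Tuple[int, int]]:
    """Split the closed key range [min_id, max_id] into contiguous closed sub-ranges."""
    if max_id < min_id:
        raise ValueError("max_id must be >= min_id")
    if num_partitions < 1:
        raise ValueError("num_partitions must be >= 1")
    total = max_id - min_id + 1
    # Never create more ranges than there are distinct key values.
    n = min(num_partitions, total)
    base, extra = divmod(total, n)
    bounds = []
    lo = min_id
    for i in range(n):
        # The first `extra` ranges take one additional key value each.
        size = base + (1 if i < extra else 0)
        hi = lo + size - 1
        bounds.append((lo, hi))
        lo = hi + 1
    return bounds


def to_filters(column: str, bounds: List[Tuple[int, int]]) -> List[str]:
    """Render each range as a SQL-style filter expression (hypothetical format)."""
    return [f"{column} >= {lo} AND {column} <= {hi}" for lo, hi in bounds]


print(to_filters("id", min_max_partitions(1, 10, 3)))
# ['id >= 1 AND id <= 4', 'id >= 5 AND id <= 7', 'id >= 8 AND id <= 10']
```

Note that this divides the key space evenly, not the rows: if the key values are sparse or skewed, the resulting batches can differ widely in row count.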
Longer term, we may need to look into more complex solutions, such as RANK_ORDER and MOD division, in order to generate batches across combination (multi-column) analytical keys.
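The rank-and-MOD idea can be illustrated outside the database as follows. This is a rough sketch under assumptions: composite keys are modeled as Python tuples, the rank is simply the position in sorted order (standing in for whatever ranking function the database would provide), and `mod_buckets` is a hypothetical name.

```python
from typing import Hashable, Iterable, List


def mod_buckets(keys: Iterable[Hashable], num_partitions: int) -> List[list]:
    """Assign each distinct composite key to a bucket by rank modulo num_partitions."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be >= 1")
    buckets: List[list] = [[] for _ in range(num_partitions)]
    # Rank distinct keys by sort order, then deal them round-robin into buckets.
    for rank, key in enumerate(sorted(set(keys))):
        buckets[rank % num_partitions].append(key)
    return buckets


keys = [("us", 2), ("eu", 1), ("us", 1), ("eu", 2)]
print(mod_buckets(keys, 2))
# [[('eu', 1), ('us', 1)], [('eu', 2), ('us', 2)]]
```

Unlike a MIN/MAX split, this does not need a single numeric key, and bucket sizes differ by at most one distinct key. The trade-off is that the ranking has to be computed over all keys first.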
Exit criteria for this ticket should be the generation of multiple YAML files that contain partitioned filters.
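As a sketch of that exit criterion, the snippet below writes one YAML file per partition filter to a local directory. The YAML layout (a single `filters` list), the file naming scheme, and the function name `write_partition_configs` are all assumptions made for illustration; the project's real config schema is not shown in this issue, and GCS paths are not handled here.

```python
import json
from pathlib import Path
from typing import List


def write_partition_configs(filters: List[str], config_dir: str) -> List[Path]:
    """Write one YAML file per filter into config_dir and return the file paths."""
    out_dir = Path(config_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, filter_expr in enumerate(filters):
        path = out_dir / f"{index:04d}.yaml"
        # json.dumps yields a double-quoted string, which is also a valid
        # YAML double-quoted scalar, so no YAML library is needed here.
        body = "# Hypothetical layout, not the tool's real schema\n"
        body += "filters:\n"
        body += f"  - {json.dumps(filter_expr)}\n"
        path.write_text(body, encoding="utf-8")
        paths.append(path)
    return paths
```

Each generated file can then be run as an independent validation, which is what makes partitioned execution possible.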