DVT output is not proper and giving ambiguous output with --filters, --use-random-row and --random-row-batch-size options #552

Closed
kanhaPrayas opened this issue Aug 5, 2022 · 2 comments · Fixed by #582
Labels:
priority: p0 (Highest priority. Critical issue. Will be fixed prior to next release.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)

Comments

@kanhaPrayas
Contributor

We are running DVT with the --filters, --use-random-row (for sampling), and --random-row-batch-size options. Both the source and target tables are in Teradata. The DVT run completes without errors, but the generated output is incorrect and ambiguous.
We run DVT with a batch size of 10,000, but output is generated for only a handful of rows (3, 5, or 7).

@nehanene15
Collaborator

Can you provide an example command and an example of the output generated? Does this occur with both the text and BQ result handlers? Is the generated query correct?

@nehanene15 added the "priority: p0" label on Sep 1, 2022
@nehanene15 self-assigned this on Sep 1, 2022
@nehanene15 added the "type: bug" label on Sep 1, 2022
@nehanene15
Collaborator

The issue here is that DVT currently treats filters and random row sampling as mutually exclusive, when in fact the filter needs to be applied inside the random row query. That is, when randomly selecting the X IDs, we need to apply the filter 'timestamp > Y' to the rows being sampled.
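
To illustrate the difference, here is a minimal sketch, not DVT's actual code. It uses an in-memory SQLite table as a stand-in for Teradata, and the function names, the `events` table, and the `ts > 900` filter are all hypothetical. The first query shows one way the reported symptom could arise (the random keys are drawn from the whole table and the filter is applied afterwards); the second applies the filter inside the sampling subquery.

```python
import sqlite3


def sample_then_filter_sql(table, key, batch_size, row_filter):
    # Hypothetical problematic ordering: random keys are drawn from the
    # whole table, and the filter only trims the sample afterwards.
    return (
        f"SELECT * FROM {table} WHERE {key} IN "
        f"(SELECT {key} FROM {table} ORDER BY RANDOM() LIMIT {batch_size}) "
        f"AND {row_filter}"
    )


def filter_inside_sample_sql(table, key, batch_size, row_filter):
    # Filter applied inside the sampling subquery: random keys are drawn
    # only from rows that already satisfy the filter.
    return (
        f"SELECT * FROM {table} WHERE {key} IN "
        f"(SELECT {key} FROM {table} WHERE {row_filter} "
        f"ORDER BY RANDOM() LIMIT {batch_size})"
    )


def make_demo_connection(n_rows=1000):
    # Build a toy table where ts equals id, so "ts > 900" matches 100 rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, ts INTEGER)")
    conn.executemany(
        "INSERT INTO events (id, ts) VALUES (?, ?)",
        [(i, i) for i in range(1, n_rows + 1)],
    )
    return conn


conn = make_demo_connection()
row_filter = "ts > 900"
batch_size = 50

after = conn.execute(
    sample_then_filter_sql("events", "id", batch_size, row_filter)
).fetchall()
inside = conn.execute(
    filter_inside_sample_sql("events", "id", batch_size, row_filter)
).fetchall()

# `inside` always holds the full batch of 50 matching rows; `after` varies
# per run and is usually much smaller, since only about 10% of the randomly
# drawn keys pass the filter.
print(len(after), len(inside))
```

With the filter inside the subquery, the sample size stays at the requested batch size whenever enough rows match the filter, which is the behavior the reporter expected from a 10,000-row batch.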

2 participants