Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Row validation failing for TD DECIMAL(15,2) and BQ NUMERIC(15,2) #1134

Closed
franklinWhaite opened this issue May 15, 2024 · 3 comments · Fixed by #1138
Closed

Row validation failing for TD DECIMAL(15,2) and BQ NUMERIC(15,2) #1134

franklinWhaite opened this issue May 15, 2024 · 3 comments · Fixed by #1138
Assignees
Labels
priority: p0 Highest priority. Critical issue. Will be fixed prior to next release. type: feature request 'Nice-to-have' improvement, new feature or different behavior or design.

Comments

@franklinWhaite
Copy link
Member

We have noticed that TD and BQ follow different approaches to store DECIMAL types and also yield different results when casting decimal types to string.

Example 1

  • TD stores 44.00 as 44.00
  • BQ stores 44.00 as 44

Example 2

  • In TD CAST(0.07 AS STRING) -> .07
  • In BQ CAST(0.07 AS STRING) -> 0.07

Workaround

We have been using a custom SQL with the following manipulations

Source custom SQL (TD)

The following will remove trailing ceros, e.g. 44.00 -> 44
TRIM(TRAILING '.' FROM TRIM(TRAILING '0' FROM CAST(QUANTITY AS VARCHAR(15)))) AS QUANTITY

Target custom SQL (BQ)

The following will remove leading ceros before the decimal separator, e.g. 0.07 -> .07
REGEXP_REPLACE(CAST(QUANTITY AS STRING),r'^0+','') as QUANTITY

Asks

1. New feature

Can DVT account for decimal to string casting inconsistencies during row validation? It's not ideal to create custom SQL for each table with decimals as this requires significant manual work. Resolving this issue would benefit future users by preventing them from encountering these inconsistencies and needing to create custom SQL solutions.

2. Workaround

Is there any other workaround less cumbersome than having to create custom sql for each table with decimals?

@nehanene15 nehanene15 added priority: p0 Highest priority. Critical issue. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels May 15, 2024
@nj1973
Copy link
Contributor

nj1973 commented May 16, 2024

Research

Example 1 was perplexing me because we have a test for this already. After a bit of experimentation I figured out that NUMBER and DECIMAL in Teradata behave differently, notice the outputs below:

SELECT CAST(CAST(100 AS DECIMAL(5,2)) AS VARCHAR(10));

100
----------
100.00

SELECT CAST(CAST(100 AS NUMBER(5,2)) AS VARCHAR(10));

100
----------
100

We have the same DECIMAL issue on a couple of other engines and resolve it using TO_CHAR:

SELECT TO_CHAR(CAST(100 AS DECIMAL(5,2)),'TM9');

TO_CHAR(100,'TM9')
-----------------------------------------------------------------
100

We should be able to fix this one fairly easily.

@nj1973
Copy link
Contributor

nj1973 commented May 16, 2024

Research

Whatever we decide to do with example 2 we should not forget negative numbers:

SELECT TO_CHAR(CAST(-0.07 AS DECIMAL(5,2)),'TM9');

TO_CHAR(-0.07,'TM9')
-----------------------------------------------------------------
-.07

@nj1973
Copy link
Contributor

nj1973 commented May 16, 2024

Research

For example 2, if we know the precision and scale for a column we could inject a FORMAT clause as described in this SO note: https://stackoverflow.com/a/30327287

If we do not know both the precision and scale then we probably need to resort to string manipulation or regexp.

@helensilva14 helensilva14 added type: feature request 'Nice-to-have' improvement, new feature or different behavior or design. and removed type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels May 16, 2024
@nj1973 nj1973 self-assigned this May 17, 2024
@nj1973 nj1973 linked a pull request May 20, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: p0 Highest priority. Critical issue. Will be fixed prior to next release. type: feature request 'Nice-to-have' improvement, new feature or different behavior or design.
Projects
Status: Done 🚀
4 participants