generate-table-partitions between Teradata and BQ does not work with uppercase primary key names #1136

sundar-mudupalli-work · 2024-05-16T00:48:28Z

Hi,

It appears that generate-table-partitions does not work with primary key names in Teradata that are in upper case. Teradata's column names are not case sensitive.

The DDL that was used was

CREATE SET TABLE udfs.issue1136, FALLBACK (
    N_NATIONKEY INTEGER NOT NULL,
    N_NAME CHAR(25) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
    N_REGIONKEY INTEGER NOT NULL,
    N_COMMENT CHAR(152) CHARACTER SET LATIN CASESPECIFIC NOT NULL )
UNIQUE PRIMARY INDEX ( N_NATIONKEY);

When we try the following command, it works fine data-validation generate-table-partitions -sc teradata -tc teradata -tbls=udfs.issue1136 -pk=N_NATIONKEY -hash '*' -pn 3 -cdir .

The real purpose is to compare two tables from different databases - so we try to compare teradata with BigQuery table defined as follows:

CREATE TABLE pso_data_validator.issue1136 (
  n_nationkey INT64,
  n_name STRING,
  n_regionkey INT64,
  n_comment STRING
);

If we use the command data-validation generate-table-partitions -sc teradata -tc bq -tbls=udfs.issue1136=pso_data_validator.issue1136 -pk=N_NATIONKEY -hash '*' -pn 3 -cdir . you get the error:

    target_count = target_partition_row_builder.get_count()
  File "/home/user/professional-services-data-validator/env/lib/python3.10/site-packages/data_validation/query_builder/partition_row_builder.py", line 65, in get_count
    return self.query[self.primary_keys].count().execute()
...
ibis.common.exceptions.IbisTypeError: Column 'N_NATIONKEY' is not found in table. Existing columns: 'n_nationkey', 'n_name', 'n_regionkey', 'n_comment'.

However if you invert the case and try data-validation generate-table-partitions -sc teradata -tc bq -tbls=udfs.issue1136=pso_data_validator.issue1136 -pk=n_nationkey -hash '*' -pn 3 -cdir . you get the error:

    source_count = source_partition_row_builder.get_count()
  File "/home/user/professional-services-data-validator/env/lib/python3.10/site-packages/data_validation/query_builder/partition_row_builder.py", line 65, in get_count
    return self.query[self.primary_keys].count().execute()
....
ibis.common.exceptions.IbisTypeError: Column 'n_nationkey' is not found in table. Existing columns: 'N_NATIONKEY', 'N_NAME', 'N_REGIONKEY', 'N_COMMENT'.

In the first case, ibis is not able to execute the count operation since the BQ column names are all in lower case. In the second case, ibis is not able to execute the count operation since the Teradata column names are all in upper case. So either case you use - it does not work. We try to get the count to make sure the table sizes are within 10% of each other - if not the partition split is not going to be very even.

I checked with a table defined with lower case column names in Oracle and it return ibis column names in lower case. A bit more investigation is needed.

Suggestions on how to fix ?

The text was updated successfully, but these errors were encountered:

sundar-mudupalli-work changed the title ~~generate-table-partitions does not work with uppercase primary key names~~ generate-table-partitions between Teradata and BQ does not work with uppercase primary key names May 16, 2024

helensilva14 added priority: p0 Highest priority. Critical issue. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels May 16, 2024

nehanene15 self-assigned this May 22, 2024

nehanene15 mentioned this issue May 23, 2024

fix: Support PKs with different casing for generate-partitions #1142

Merged

nehanene15 closed this as completed in #1142 May 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

generate-table-partitions between Teradata and BQ does not work with uppercase primary key names #1136

generate-table-partitions between Teradata and BQ does not work with uppercase primary key names #1136

sundar-mudupalli-work commented May 16, 2024 •

edited

Loading

generate-table-partitions between Teradata and BQ does not work with uppercase primary key names #1136

generate-table-partitions between Teradata and BQ does not work with uppercase primary key names #1136

Comments

sundar-mudupalli-work commented May 16, 2024 • edited Loading

sundar-mudupalli-work commented May 16, 2024 •

edited

Loading