Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: Ibis Upgrade to 5.1.0 #894

Merged
merged 52 commits into from
Jul 18, 2023
Merged

feat!: Ibis Upgrade to 5.1.0 #894

merged 52 commits into from
Jul 18, 2023

Conversation

nehanene15
Copy link
Collaborator

@nehanene15 nehanene15 commented Jul 5, 2023

Closes #317 (Ibis upgrade)
Closes #839 (BQ BIGNUMERIC is now expressed as decimal(76, 38))
Closes #767 (Spanner timestamp exception)
Closes #833 (MSSQL support for sum on datetime columns)
Closes #832 (MSSQL support for aggregating varchar/char columns)
Closes #719 (MSSQL/BQ column validation support for float data types)
Closes #821 (DB2 custom query index out of range bug fix)
Closes #814 (Python 3.11 support)

Breaking Change: Python 3.7 is not supported anymore. This PR also updates most of the required Python packages.

Summary: Updates Ibis to the newest 5.1.0 release. It touches a lot of files, but mainly the backend third_party connection implementations.

Notes:

  • Changed type of labels from array of structs to array of array since the Ibis upgrade interprets tuples as arrays. I.e Previously, a label would be [(“name”, “Neha”)] and now it will be [[“name”, “Neha”]] (This doesn’t affect BQ schema or outputs to the BQ result table)
  • Due to above, I had to update tests/unit/test_combiner.py to remove labels from test cases since pandas assert frame equal can’t determine equality of an array
  • Snowflake support is in Alpha (has not been tested)
  • The Ibis upgrade will continue to support MySQL versions < 8.0 as I overrode the new functionality that required CTEs in third_party/ibis/ibis_mysql

User-focused changes:

  • For schema validation, previously we expressed non-nullable fields as string[non-nullable]. Post upgrade it will be expressed as !string

…ird party in favor of Ibis snowflake connector
…ata-validation -v configs run -c config.yaml for upgrade, clean up comments
@nj1973
Copy link
Contributor

nj1973 commented Jul 12, 2023

/gcbrun

…upport, support binary_float datatype for Oracle custom query
Copy link
Contributor

@nj1973 nj1973 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, awesome job.

@piyushsarraf
Copy link
Contributor

piyushsarraf commented Jul 13, 2023

Hi
I was testing the ibis-upgrade branch and saw some errors wrt to BQ and RedShift.
I was doing a BQ - BQ row validation and got a error of 2 packages found for backend 'bigquery'
image
And for RedShift I was doing a row validation and the error seems to be with version of SQL Alchemy.
image

@nj1973
Copy link
Contributor

nj1973 commented Jul 13, 2023

@piyushsarraf , I also saw the "2 packages found for backend 'bigquery'" message. It was because my virtual environment was not clean. I fixed it by removing the venv, creating it fresh and then pip installing DVT.

@nehanene15
Copy link
Collaborator Author

@piyushsarraf
As of now the only blocker to merge the PR is the Redshift unrecognized configuration parameter. It could be because I upgraded the psycopg2-binary package from 2.9.3 to 2.9.6, or due to SQLAlchemy conflicts similar to this - https://stackoverflow.com/questions/76532906/unrecognized-configuration-parameter-standard-conforming-strings-while-queryin

@Raniksingh
Copy link
Contributor

@nehanene15 I just verified the below version which works for Redshift.
psycopg2-binary ==2.9.6 and SQLAlchemy ==1.4.39

@sundar-mudupalli-work
Copy link
Contributor

Hey Neha,
I was excited that Teradata query worked with DVT, and looked into getting generate-partitions to work on Teradata. Good news - I have it working and tested on oracle, sqlserver, bigquery, mysql, postgres, hive and teradata. Test functions for these have been written as well. Not so Good news - All this work is on the ibis-upgrade branch - it is all in my local repository. It is only a few files (4-5), so I can hack it into any branch. Or I need someone to tell me how to merge it into the ibis-upgrade branch without messing other things. Now realize that I should have created another branch off ibis-upgrade.

Thanks.

Sundar Mudupalli

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment