Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fix decimal separator to "." (dot) on Oracle #1042

Merged
merged 1 commit into from
Nov 11, 2023

Conversation

nj1973
Copy link
Contributor

@nj1973 nj1973 commented Nov 10, 2023

Setting NLS_NUMERIC_CHARACTERS to '.,' when connecting to Oracle to ensure we get . as the decimal separator when converting decimals to string. This matches PostgreSQL and BigQuery.

I've been trying to manually test if the same applies on Teradata but cannot figure out how to setup a failure in order to fix it. I suspect we'll have similar issues on other engines in the future but this PR is restricted to Oracle.

I've also been unable to write a test for this. I added a test to test_oracle.py that failed without my fix and passed with it but, unfortunately, when I run all tests the new test passed even without the fix. I think this is because os.environ is captured when cx-Oracle is imported, when my test ran alone then my manipulation of os.environ worked. When my test ran amongst other tests my setup was not effective.

I've captured the test below for posterity but am submitting this PR based on manual testing alone.

@mock.patch(
    "data_validation.state_manager.StateManager.get_connection_config",
    new=mock_get_connection_config,
)
def test_row_validation_core_types_to_bigquery_fr_FR():
    """Oracle to BigQuery row comparison-fields where Oracle locale does not use . for decimal separator."""
    parser = cli_tools.configure_arg_parser()
    nls_lang = os.environ.get('NLS_LANG')
    os.environ['NLS_LANG'] = 'FRENCH_FRANCE.AL32UTF8'
    args = parser.parse_args(
        [
            "validate",
            "row",
            "-sc=ora-conn",
            "-tc=bq-conn",
            "-tbls=pso_data_validator.dvt_core_types",
            "--primary-keys=id",
            "--filter-status=fail",
            "--hash=id,col_dec_10_2",
        ]
    )
    config_managers = main.build_config_managers_from_args(args)
    assert len(config_managers) == 1
    config_manager = config_managers[0]
    validator = data_validation.DataValidation(config_manager.config, verbose=False)
    df = validator.execute()
    # Return NLS_LANG to its previous setting.
    if nls_lang:
        os.environ['NLS_LANG'] = nls_lang
    else:
        del os.environ['NLS_LANG']
    # With filter on failures the data frame should be empty.
    assert len(df) == 0

@nj1973 nj1973 requested a review from a team as a code owner November 10, 2023 17:13
@nj1973
Copy link
Contributor Author

nj1973 commented Nov 10, 2023

/gcbrun

Copy link
Contributor

@helensilva14 helensilva14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for testing this out Neil!

@nj1973 nj1973 merged commit 14cc7ef into develop Nov 11, 2023
5 checks passed
@nj1973 nj1973 deleted the fix/1033-decimal-separator branch November 11, 2023 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

validate custom-query row: cannot validate a decimal column from Oracle source
2 participants