feat: GCS support for validation configs #340

Merged: 8 commits, Feb 17, 2022

Member

@dmedora dmedora commented Nov 2, 2021

Resolves #288. Allows validation configs to be saved to GCS when the PSO_DV_CONFIG_HOME env var is set. Also adds 'list' functionality and corresponding docs updates.
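
As a rough illustration of the behaviour described above, here is a minimal sketch of how a config location could be resolved from the environment. Only the `PSO_DV_CONFIG_HOME` name comes from this PR. The `resolve_config_path` helper, the `gs://` prefix check, and the local fallback directory are assumptions made for the example and may not match the actual implementation.

```python
import os

# The env var name is taken from the PR description; the rest is illustrative.
CONFIG_HOME_ENV = "PSO_DV_CONFIG_HOME"
DEFAULT_LOCAL_DIR = "~/.config/validations"  # hypothetical fallback directory


def resolve_config_path(name, env=None):
    """Return (is_gcs, full_path) for a validation config file name."""
    env = os.environ if env is None else env
    home = env.get(CONFIG_HOME_ENV)
    if not home:
        # Env var unset or empty: fall back to a local directory (assumption).
        home = os.path.expanduser(DEFAULT_LOCAL_DIR)
    is_gcs = home.startswith("gs://")
    if is_gcs:
        # GCS object names always use forward slashes.
        return True, home.rstrip("/") + "/" + name
    return False, os.path.join(home, name)
```

A caller would then branch on `is_gcs` to choose between a GCS client and the local filesystem when saving, reading, or listing configs.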

@@ -204,6 +201,9 @@ def _configure_run_config_parser(subparsers):
    run_config_parser = subparsers.add_parser(
        "run-config", help="Run validations stored in a YAML config file"
    )
    run_config_subparsers = run_config_parser.add_subparsers(dest="run_config_cmd")
    _ = run_config_subparsers.add_parser("list", help="List your validation configs")
Collaborator

Without having tested this, using an optional command parser is a slightly strange feature (and I'm not sure it's doable?).
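
On the "is it doable" question, a small standalone sketch suggests that `argparse` does treat nested subcommands as optional by default in Python 3. The parser name `tool` and the `--config-file`/`-c` flag are placeholders assumed for the example, not necessarily what this project defines.

```python
import argparse

parser = argparse.ArgumentParser(prog="tool")  # "tool" is a placeholder name
subparsers = parser.add_subparsers(dest="command")

run_config = subparsers.add_parser(
    "run-config", help="Run validations stored in a YAML config file"
)
run_config.add_argument("--config-file", "-c")  # assumed flag, for illustration
run_config_sub = run_config.add_subparsers(dest="run_config_cmd")
run_config_sub.add_parser("list", help="List your validation configs")

# No nested subcommand given: run_config_cmd falls back to None.
with_file = parser.parse_args(["run-config", "-c", "a.yaml"])
# Nested subcommand given: run_config_cmd holds its name.
with_list = parser.parse_args(["run-config", "list"])
```

So the handler can check whether `run_config_cmd` is `None` to decide between running a config and listing configs, although that does not address the discoverability concern.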

The naming structure itself here is slightly strange as well (e.g. run-config list is not easy to find or understand).

It seems to me that the goal here is to create a configs CLI section like the one we have for connections. If that's the case, perhaps do so more explicitly:
configs list, configs run, and maybe also configs get, though that last one is only a nice-to-have.

For backwards compatibility you'll need to keep run-config, though perhaps it can be a hidden command going forward?
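
One way the suggested layout could look with `argparse` is sketched below. The subcommand names (`configs`, `list`, `run`, `get`, `run-config`) come from this thread. The `build_parser` function, the help strings for `run` and `get`, and the `--config-file`/`-c` flag are assumptions for illustration. Hiding the legacy command relies on two `argparse` behaviours: a subparser added without `help=` is not listed among the subcommands, and `metavar` controls what the usage line shows.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="tool")  # placeholder program name
    # metavar overrides the "{a,b,...}" text, so "run-config" stays out of usage.
    subparsers = parser.add_subparsers(dest="command", metavar="{configs}")

    configs = subparsers.add_parser("configs", help="Run and manage validation configs")
    configs_sub = configs.add_subparsers(dest="configs_cmd")
    configs_sub.add_parser("list", help="List your validation configs")
    for name, text in (("run", "Run a stored config"), ("get", "Print a stored config")):
        sub = configs_sub.add_parser(name, help=text)
        sub.add_argument("--config-file", "-c", required=True)  # assumed flag

    # Legacy command kept for backwards compatibility; omitting help= keeps it
    # out of the listed subcommands while it still parses normally.
    legacy = subparsers.add_parser("run-config")
    legacy.add_argument("--config-file", "-c", required=True)
    return parser
```

With this shape, `tool configs run -c x.yaml` and `tool run-config -c x.yaml` both parse, but only `configs` appears in the top-level help output.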

Member Author

@dmedora dmedora Dec 13, 2021

I agree, run-config list is a little strange. I was trying to stick with run-config, but adding a configs command makes sense. Will amend.

Member Author

Added the configs command (run, list, get), and left run-config in place. PTAL when you have a chance!

Collaborator

@nehanene15 nehanene15 left a comment

There are some edits to the state manager that may affect GCS-based connections. Besides that, LGTM.

@dmedora dmedora merged commit b09cd29 into develop Feb 17, 2022
@dmedora dmedora deleted the issue288-gcs-for-configs branch February 17, 2022 19:04
ngdav pushed a commit that referenced this pull request Mar 16, 2022
* gcs support for validation configs, incl. get and list functionality, and new 'configs' cmd
ngdav added a commit that referenced this pull request May 4, 2022
* feat: add db2 connection

* feat: add connection

* feat: DB2 connection fix

* fix: do not require db2 client unless needed

* fix: Db2 count validation/agg functions, DB2Client

fixes sum, min, avg, max functions for mysql, ps, db2, and more
streamline DB2Client imports

* style: linting

* Fix: Multiple updates (#359)

* fix: update spelling

* fix:Adding double quote to prevent globbing and word splitting.

Adding double quote to prevent globbing and word splitting.

* fix:updating comment

* fix: Updating inline comments

* fix:Spelling

* fix:Updating spelling

* test: Support local integration tests for Teradata, Postgres and SQL Server (#364)

* test: get Teradata user name from TERADATA_USER env var

* test: add --no-cloud-sql flag to pytest options

* test: instantiate CloudSQLResourceManager in a fixture when --no-cloud-sql is not passed

* test: optionally get Postgres host from POSTGRES_HOST env var

* test: optionally get SQL Server host from SQL_SERVER_HOST env var

* test: optionally get SQL server user from SQL_SERVER_USER env var

Co-authored-by: A.J. Welch

* fix: supporting non default schemas for mssql (#365)

* fix: supporting non default schemas for mssql

* fix:updated MSSQL client instantiation

* fix: typo

* feat: GCS support for validation configs (#340)

* gcs support for validation configs, incl. get and list functionality, and new 'configs' cmd

* fix: test for nan when calculating fail/success in combiner (#341) (#366)

* fix: ensure all statuses are success or fail, particularly after _join_pivots (#329) (#370)

* feat: first class support for row level hashing (#345)

* adding scaffolding for calc field builder in config manager

* exposing cast via calculated fields. Don't know if we necessarily need this just adding for consistency

* diff check

* config file generating as expected

* expanding cli for row level validations

* splitting out comparison fields from aggregates

* row comparisons operational (sort of)

* re-enabling aggregate validations

* cohabitation of validation types!

* figuring out why unit tests are borked

* continuing field split

* stash before merge

* testing diff

* tests passing

* removing extra print statements

* tests and lint

* adding fail tests

* first round of requested changes

* change requests round two.

* refactor CLI and lint

* swapping out farm fingerprint for sha256 as default

* changes per CR

* fixing text result tests

* adding docs

* hash example

* linting

* think I found the broken test

* fixed tests

* setting default for depth length

* relaxing system test

* feat: Hive partitioned tables support (#375)

* feat: add support for partitioned tables

* feat: import schema class

* fix: update docs

* fix: use an appropriate column filter list for schema validation (#350) (#371)

* fix: make status values consistent across validation types (#377) (#378)

* fix: make status values consistent across validation types (#377)

* fix: make validation status values consts (#377)

* fix: revert change from #345 that causes filters, threshold and labels to be ignored for column validations (#376) (#379)

* feat: Hive hash function support (#392)

* adding addons for impala hive hashing functions

* fix: import fixed_arity

* move logic to ibis_addon

* replacing isnull with nvl

* adding nvl function

* test FillNa

* missing import

* updating t0 prefix to column names



Co-authored-by: Mike Hilton

* docs: add Db2 link to README

Co-authored-by: Elaina Yao
Co-authored-by: David Ng
Co-authored-by: Alejandro Leal
Co-authored-by: AJ
Co-authored-by: A.J. Welch
Co-authored-by: Neha Nene
Co-authored-by: dmedora
Co-authored-by: Mike Hilton
Co-authored-by: ngdav
Co-authored-by: Dylan Hercher
Development

Successfully merging this pull request may close these issues.

Add GCS support for configurations/connections
3 participants