docs: updates on nested column limitations, contributing guide examples, and incorrect example #1082

Merged 2 commits on Jan 22, 2024
18 changes: 17 additions & 1 deletion CONTRIBUTING.md
@@ -48,7 +48,23 @@ To run our local testing suite, use:

`python3 -m nox --envdir ~/dvt/envs/ -s unit_small blacken lint`

You can also use [our script](tests/local_check.sh) with all checks step by step.
See [our script](tests/local_check.sh) for using nox to run tests step by step.

You can also run pytest directly:
```shell
pip install pyfakefs==4.6.2
pytest tests/unit
```
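
If you only need a subset of the unit tests, pytest's standard `-k` keyword filter can narrow the run; the keyword below is purely illustrative and not a test name taken from this repository:
```shell
# Run only unit tests whose names match a keyword expression.
# "config" is an illustrative placeholder, not a guaranteed test name here.
pytest tests/unit -k "config"
```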

To lint your code, run:
```shell
pip install black==22.3.0
pip install flake8
black $BLACK_PATHS # Find this variable in our noxfile
flake8 data_validation
flake8 tests
```
The commands above mirror the lint checks in our [noxfile](https://github.com/GoogleCloudPlatform/professional-services-data-validator/blob/develop/noxfile.py).
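
If you prefer to let nox manage the environments for these checks, you can select just the formatting and lint sessions from the command shown earlier:
```shell
# Run only the formatting and lint sessions; --envdir reuses the example
# environment directory from the unit test command above.
python3 -m nox --envdir ~/dvt/envs/ -s blacken lint
```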

## Conventional Commits

9 changes: 4 additions & 5 deletions README.md
@@ -73,6 +73,8 @@ Alternatives to running DVT in the CLI include deploying DVT to Cloud Run, Cloud
([Examples Here](https://github.com/GoogleCloudPlatform/professional-services-data-validator/tree/develop/samples)). See the [Validation Logic](https://github.com/GoogleCloudPlatform/professional-services-data-validator#validation-logic) section
to learn more about how DVT uses the CLI to generate SQL queries.

Note that we do not support nested or complex columns for column or row validations.

#### Column Validations

Below is the command syntax for column validations. To run a grouped column
@@ -98,9 +98,6 @@ data-validation (--verbose or -v) (--log-level or -ll) validate column
i.e 'bigquery-public-data.new_york_citibike.citibike_trips'
[--grouped-columns or -gc GROUPED_COLUMNS]
Comma separated list of columns for Group By i.e col_a,col_b
[--primary-keys or -pk PRIMARY_KEYS]
Comma separated list of columns to use as primary keys
(Note) Only use with grouped column validation. See *Primary Keys* section.
[--count COLUMNS] Comma separated list of columns for count or * for all columns
[--sum COLUMNS] Comma separated list of columns for sum or * for all numeric
[--min COLUMNS] Comma separated list of columns for min or * for all numeric
@@ -135,8 +134,8 @@ data-validation (--verbose or -v) (--log-level or -ll) validate column
Comma separated list of statuses to filter the validation results. Supported statuses are (success, fail). If no list is provided, all statuses are returned.
```

The default aggregation type is a 'COUNT *'. If no aggregation flag (i.e count,
sum , min, etc.) is provided, the default aggregation will run.
The default aggregation type is a 'COUNT *', which will run in addition to the validations you specify. To remove this default,
use [YAML configs](https://github.com/GoogleCloudPlatform/professional-services-data-validator/tree/develop#running-dvt-with-yaml-configuration-files).
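
For example, a grouped column validation over the Citibike table referenced above might look like the sketch below; the connection name `my_bq_conn` and the column names are placeholders to substitute with your own, and the implicit `COUNT *` still runs alongside the explicit `--sum` unless removed via a YAML config:
```shell
# Hypothetical grouped column validation. "my_bq_conn" and the column names
# are assumed placeholders; the default COUNT * runs in addition to the sum.
data-validation validate column \
  -sc my_bq_conn \
  -tc my_bq_conn \
  -tbls bigquery-public-data.new_york_citibike.citibike_trips \
  --grouped-columns start_station_name \
  --sum tripduration
```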

The [Examples](https://github.com/GoogleCloudPlatform/professional-services-data-validator/blob/develop/docs/examples.md) page provides many examples of how the tool can be used to run powerful validations without writing any queries.

2 changes: 1 addition & 1 deletion docs/examples.md
@@ -45,13 +45,13 @@ Above command executes validations stored in a config file named citibike.yaml.
#### Generate partitions and save as multiple configuration files
````shell script
data-validation generate-table-partitions \
-sc my_bq_conn \
-tc my_bq_conn \
-tbls bigquery-public-data.new_york_trees.tree_census_2015 \
--primary-keys tree_id \
--hash '*' \
--filters 'tree_id>3000' \
-cdir partitions_dir \
--partition-key tree_id \
--partition-num 200
````
The above command creates multiple partitions based on `--partition-key`. The number of generated configuration files is determined by `--partition-num`.
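
A natural follow-up is to execute the generated configuration files. The sketch below assumes the CLI exposes a `configs run` subcommand that accepts the same `-cdir` directory flag used above; confirm the exact flags with `data-validation configs run --help`.
````shell script
# Hypothetical follow-up, assuming `configs run` accepts a config directory
# flag (-cdir); verify with `data-validation configs run --help`.
data-validation configs run -cdir partitions_dir
````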