docs: Updates on nested column limitations, contributing guide examples and incorrect example (#1082)

* docs: cleanup

* docs: typo
nehanene15 committed Jan 22, 2024
1 parent 15bfc4c commit cc0f60a
Showing 3 changed files with 22 additions and 7 deletions.
18 changes: 17 additions & 1 deletion CONTRIBUTING.md
@@ -48,7 +48,23 @@ To run our local testing suite, use:

`python3 -m nox --envdir ~/dvt/envs/ -s unit_small blacken lint`

See [our script](tests/local_check.sh) for using nox to run tests step by step.
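That script can also be run directly from the repository root; a minimal sketch, assuming a POSIX shell with nox already installed:

```
bash tests/local_check.sh
```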

You can also run pytest directly:
```shell
pip install pyfakefs==4.6.2
pytest tests/unit
```

To lint your code, run:
```shell
pip install black==22.3.0
pip install flake8
black $BLACK_PATHS # Find this variable in our noxfile
flake8 data_validation
flake8 tests
```
The commands above mirror our [noxfile lint session](https://github.com/GoogleCloudPlatform/professional-services-data-validator/blob/develop/noxfile.py).

## Conventional Commits

9 changes: 4 additions & 5 deletions README.md
@@ -73,6 +73,8 @@ Alternatives to running DVT in the CLI include deploying DVT to Cloud Run, Cloud
([Examples Here](https://github.com/GoogleCloudPlatform/professional-services-data-validator/tree/develop/samples)). See the [Validation Logic](https://github.com/GoogleCloudPlatform/professional-services-data-validator#validation-logic) section
to learn more about how DVT uses the CLI to generate SQL queries.

Note that we do not support nested or complex columns for column or row validations.

#### Column Validations

Below is the command syntax for column validations. To run a grouped column
@@ -98,9 +100,6 @@ data-validation (--verbose or -v) (--log-level or -ll) validate column
i.e. 'bigquery-public-data.new_york_citibike.citibike_trips'
[--grouped-columns or -gc GROUPED_COLUMNS]
Comma separated list of columns for Group By i.e. col_a,col_b
[--primary-keys or -pk PRIMARY_KEYS]
Comma separated list of columns to use as primary keys
(Note) Only use with grouped column validation. See *Primary Keys* section.
[--count COLUMNS] Comma separated list of columns for count or * for all columns
[--sum COLUMNS] Comma separated list of columns for sum or * for all numeric
[--min COLUMNS] Comma separated list of columns for min or * for all numeric
@@ -135,8 +134,8 @@ data-validation (--verbose or -v) (--log-level or -ll) validate column
Comma separated list of statuses to filter the validation results. Supported statuses are (success, fail). If no list is provided, all statuses are returned.
```
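Putting these flags together, a hypothetical end-to-end run might look like the following (this assumes a connection named `my_bq_conn` has already been created with `data-validation connections add`, and uses `tripduration` as an illustrative numeric column):

```
data-validation validate column \
  -sc my_bq_conn \
  -tc my_bq_conn \
  -tbls bigquery-public-data.new_york_citibike.citibike_trips \
  --count '*' \
  --sum tripduration
```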

The default aggregation type is a 'COUNT *', which will run in addition to the validations you specify. To remove this default,
use [YAML configs](https://github.com/GoogleCloudPlatform/professional-services-data-validator/tree/develop#running-dvt-with-yaml-configuration-files).

The [Examples](https://github.com/GoogleCloudPlatform/professional-services-data-validator/blob/develop/docs/examples.md) page provides many examples of how the tool can be used to run powerful validations without writing any queries.

2 changes: 1 addition & 1 deletion docs/examples.md
@@ -45,13 +45,13 @@ The above command executes validations stored in a config file named citibike.yaml.
#### Generate partitions and save as multiple configuration files
````shell script
data-validation generate-table-partitions \
-sc my_bq_conn \
-tc my_bq_conn \
-tbls bigquery-public-data.new_york_trees.tree_census_2015 \
--primary-keys tree_id \
--hash '*' \
--filters 'tree_id>3000' \
-cdir partitions_dir \
--partition-key tree_id \
--partition-num 200
````
The above command creates multiple partitions based on `--partition-key`. The number of generated configuration files is set by `--partition-num`.
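The generated configuration files can then be executed individually; a hedged sketch, assuming the files land in `partitions_dir` as above and that your installed version supports `data-validation configs run`:

````shell script
for cfg in partitions_dir/*.yaml; do
  data-validation configs run --config-file "$cfg"
done
````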