feat: Addition of log level as an argument for DVT logging and replac… #577

Merged 5 commits on Sep 2, 2022
12 changes: 6 additions & 6 deletions README.md
@@ -83,7 +83,7 @@ over all columns ('*') will only run over numeric columns, unless the
`--wildcard-include-string-len` flag is present.

```
-data-validation (--verbose or -v) validate column
+data-validation (--verbose or -v) (--log-level or -ll) validate column
--source-conn or -sc SOURCE_CONN
Source connection details
See: *Data Source Configurations* section for each data source
@@ -155,7 +155,7 @@ Under the hood, row validation uses
apply functions such as IFNULL() or RTRIM(). These can be edited in the YAML config to customize your row validation.

```
-data-validation (--verbose or -v) validate row
+data-validation (--verbose or -v) (--log-level or -ll) validate row
--source-conn or -sc SOURCE_CONN
Source connection details
See: *Data Source Configurations* section for each data source
@@ -199,7 +199,7 @@ Below is the syntax for schema validations. These can be used to compare case in
types between source and target.

```
-data-validation (--verbose or -v) validate schema
+data-validation (--verbose or -v) (--log-level or -ll) validate schema
--source-conn or -sc SOURCE_CONN
Source connection details
See: *Data Source Configurations* section for each data source
@@ -228,7 +228,7 @@ data-validation (--verbose or -v) validate schema
Below is the command syntax for custom query column validations.

```
-data-validation (--verbose or -v) validate custom-query
+data-validation (--verbose or -v) (--log-level or -ll) validate custom-query
--source-conn or -sc SOURCE_CONN
Source connection details
See: *Data Source Configurations* section for each data source
@@ -282,7 +282,7 @@ in the SELECT statement of both source_query.sql and target_query.sql
Below is the command syntax for custom query row validations.

```
-data-validation (--verbose or -v) validate custom-query
+data-validation (--verbose or -v) (--log-level or -ll) validate custom-query
--source-conn or -sc SOURCE_CONN
Source connection details
See: *Data Source Configurations* section for each data source
@@ -336,7 +336,7 @@ Once the file is updated and saved, the following command runs the
validation:

```
-data-validation configs run -c citibike.yaml
+data-validation (--verbose or -v) (--log-level or -ll) configs run -c citibike.yaml
```

View the complete YAML file for a Grouped Column validation on the
17 changes: 14 additions & 3 deletions data_validation/__main__.py
@@ -31,6 +31,15 @@
# by default yaml dumps lists as pointers. This disables that feature
Dumper.ignore_aliases = lambda *args: True

+# Log level mappings for the input argument of log level string
+LOG_LEVEL_MAP = {
+    "DEBUG": logging.DEBUG,
+    "INFO": logging.INFO,
+    "WARNING": logging.WARNING,
+    "ERROR": logging.ERROR,
+    "CRITICAL": logging.CRITICAL,
+}


def _get_arg_config_file(args):
"""Return String yaml config file path."""
@@ -499,13 +508,15 @@ def validate(args):


 def main():
+
+    # Create Parser and Get Deployment Info
+    args = cli_tools.get_parsed_args()
     logging.basicConfig(
-        level=logging.INFO,
+        level=LOG_LEVEL_MAP[args.log_level],
         format="%(asctime)s-%(levelname)s: %(message)s",
         datefmt="%m/%d/%Y %I:%M:%S %p",
     )
-    # Create Parser and Get Deployment Info
-    args = cli_tools.get_parsed_args()
+
     if args.command == "connections":
         run_connections(args)
     elif args.command == "run-config":
7 changes: 7 additions & 0 deletions data_validation/cli_tools.py
@@ -151,6 +151,13 @@ def configure_arg_parser():
     )

     parser.add_argument("--verbose", "-v", action="store_true", help="Verbose logging")
+    parser.add_argument(
+        "--log-level",
+        "-ll",
+        default="INFO",
+        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
+        help="Log Level to be assigned. This will print logs with level same or above",
+    )

     subparsers = parser.add_subparsers(dest="command")
     _configure_validate_parser(subparsers)
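The `__main__.py` and `cli_tools.py` changes work together: the parser restricts `--log-level` to five names, and `main()` looks the chosen name up in `LOG_LEVEL_MAP` before calling `logging.basicConfig`. A minimal standalone sketch of that flow follows. `build_parser` and `resolve_log_level` are hypothetical helper names used only for illustration (the project itself uses `cli_tools.configure_arg_parser()` and `cli_tools.get_parsed_args()`), and the real parser defines many more arguments.

```python
import argparse
import logging

# Same name-to-constant mapping as LOG_LEVEL_MAP in the diff.
LOG_LEVEL_MAP = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def build_parser():
    """Hypothetical, reduced stand-in for the project's argument parser."""
    parser = argparse.ArgumentParser(prog="data-validation")
    parser.add_argument("--verbose", "-v", action="store_true", help="Verbose logging")
    parser.add_argument(
        "--log-level",
        "-ll",
        default="INFO",
        choices=list(LOG_LEVEL_MAP),  # argparse rejects any other value
        help="Log level; messages at this level or above are printed",
    )
    return parser


def resolve_log_level(argv):
    """Parse argv and return the numeric logging level it selects."""
    args = build_parser().parse_args(argv)
    return LOG_LEVEL_MAP[args.log_level]


def main(argv=None):
    # The arguments are parsed first because the level depends on them.
    level = resolve_log_level(argv)
    logging.basicConfig(
        level=level,
        format="%(asctime)s-%(levelname)s: %(message)s",
        datefmt="%m/%d/%Y %I:%M:%S %p",
    )
    logging.debug("emitted only when the level is DEBUG")
    logging.info("emitted when the level is INFO or DEBUG")
```

Calling `main(["-ll", "DEBUG"])` in this sketch would emit both messages, while `main([])` falls back to the `INFO` default. Because `choices` limits the accepted values, the dictionary lookup cannot raise `KeyError` for input that passed parsing.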
4 changes: 2 additions & 2 deletions data_validation/combiner.py
@@ -87,8 +87,8 @@ def generate_report(
     documented = _add_metadata(joined, run_metadata)

     if verbose:
-        logging.info("-- ** Combiner Query ** --")
-        logging.info(documented.compile())
+        logging.debug("-- ** Combiner Query ** --")
+        logging.debug(documented.compile())

     result_df = client.execute(documented)
     result_df.validation_status.fillna(consts.VALIDATION_STATUS_FAIL, inplace=True)
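One consequence of moving the combiner query from `logging.info` to `logging.debug`, as far as this diff shows: with the default `INFO` level, `--verbose` alone no longer prints the query, and `--log-level DEBUG` is needed as well. The sketch below illustrates the threshold behaviour with a private logger writing to an in-memory stream. The logger name, the `capture` helper, and the "validation finished" message are assumptions made for the example, not project code.

```python
import io
import logging


def capture(level):
    """Emit one DEBUG and one INFO record at the given threshold; return the output."""
    stream = io.StringIO()
    logger = logging.getLogger(f"dvt_sketch_{level}")  # hypothetical logger name
    logger.setLevel(level)
    logger.propagate = False  # keep the example isolated from the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.handlers = [handler]

    verbose = True  # stands in for the --verbose flag
    if verbose:
        # After this PR the query is logged at DEBUG rather than INFO.
        logger.debug("-- ** Combiner Query ** --")
    logger.info("validation finished")
    return stream.getvalue()
```

Here `capture(logging.INFO)` returns only the INFO line, while `capture(logging.DEBUG)` returns both lines.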
10 changes: 7 additions & 3 deletions data_validation/result_handlers/bigquery.py
@@ -17,7 +17,8 @@
 from google.cloud import bigquery

 from data_validation import client_info
-from data_validation.result_handlers.text import TextResultHandler
+import logging
+from data_validation import consts


 class BigQueryResultHandler(object):
@@ -55,8 +56,11 @@ def get_handler_for_project(
         return BigQueryResultHandler(client, table_id=table_id)

     def execute(self, config, result_df):
-        text_handler = TextResultHandler("table")
-        text_handler.print_formatted_(result_df)
+        logging.info(
+            result_df.drop(consts.COLUMN_FILTER_LIST, axis=1).to_markdown(
+                tablefmt="fancy_grid", index=False
+            )
+        )

         table = self._bigquery_client.get_table(self._table_id)
         chunk_errors = self._bigquery_client.insert_rows_from_dataframe(