column validation: --sum of time zoned timestamp on BigQuery has side effect on other engines #938
Labels
priority: p2
Medium priority. Fix may not be included in next release (e.g. minor documentation, cleanup)
type: bug
Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
This problem affects at least SQL Server and Teradata.
You can use the standard dvt_core_types integration test table to reproduce.
When comparing
--sum=col_tstz
on SQL Server and BigQuery then we get an epoch seconds mismatch:However when comparing SQL Server to itself then the value is 280806, which matches BigQuery.
The problem stems from these lines in
config_manager.py
:They are saying that if the source or target is BigQuery then cast both source and target to timestamp before doing the epoch seconds expression. The cast to timestamp is cropping the time zone in SQL Server and Teradata.
First thought was to change the SQL Server cast to first convert to UTC but I also think it is wrong that a BigQuery requirement changes what we do on the other engine. If BigQuery needs a pre-cast to TIMESTAMP then it probably shouldn't be catered for in both source and target queries. Or maybe both thoughts are true.
When resolved we should add col_tstz to sum validation in the
test_column_validation_core_types_to_bigquery
integration test for SQL Server and Teradata.The text was updated successfully, but these errors were encountered: