Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to create first/last aggregates: cannot execute CREATE FUNCTION in a read-only transaction #1035

Closed
aliashfaq946 opened this issue Nov 8, 2023 · 8 comments · Fixed by #1067
Assignees
Labels
type: question Request for information or clarification. Not an issue.

Comments

@aliashfaq946
Copy link

aliashfaq946 commented Nov 8, 2023

Hi,

I have a scenario in which data is being synced from postgress database into Google Big Query using data streams. I am trying to use data validation tool (DVT) to perform schema, row and column validation. I am getting below error while running schema validation

Schame Validation Command:
data-validation validate schema --source-conn grafana-pg --target-conn bq-conn --tables-list pg_table_1=project_name.pg_datastream.bq_table_1,pg_table_2=project_name.pg_datastream.bq_table_2,pg_table_3=project_name.pg_datastream.bq_table_3 --allow-list !string:string,!date:date,!int32:int64,decimal(16,2):float64,!timestamp('UTC'):timestamp('UTC'),!int64:int64,!decimal(16,2):float64,int32:int64,!boolean:boolean,!binary:binary,!decimal(16,2):decimal(38,9),decimal(16,2):decimal(38,9),int16:int64,!uuid:string

Error:
C:\Users\Lenovo\AppData\Local\Programs\Python\Python311\Lib\site-packages\ibis\backends\postgres_init_.py:162: UserWarning: Unable to create first/last aggregates: cannot execute CREATE FUNCTION in a read-only transaction
warnings.warn(f"Unable to create first/last aggregates: {e}")
Traceback (most recent call last):
File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1910, in _execute_context
self.dialect.do_execute(
File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\default.py", line 736, in do_execute
cursor.execute(statement, parameters)
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

sqlalchemy.exc.InternalError: (psycopg2.errors.InFailedSqlTransaction) current transaction is aborted, commands ignored until end of transaction block

[SQL:
SELECT c.oid
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE (n.nspname = %(schema)s)
AND c.relname = %(table_name)s AND c.relkind in
('r', 'v', 'm', 'f', 'p')
]

I am not abe to understand what function it is trying to run and does it do create function step in all types of validations?

Thanks

@helensilva14 helensilva14 added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. type: question Request for information or clarification. Not an issue. and removed type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Nov 8, 2023
@helensilva14 helensilva14 self-assigned this Nov 8, 2023
@helensilva14
Copy link
Contributor

Hi @aliashfaq946, thanks for reaching out! DVT doesn't create SQL functions so it doesn't seem to be a issue related to our tool. Are you able to execute this SQL query you posted normally directly to your Postgres database?

Here I could find some situations that you can check it out, regardless if it's on-prem or hosted with a cloud provider: https://stackoverflow.com/questions/31186414/error-cannot-execute-create-table-in-a-read-only-transaction.

Could you please take a look? Let us know how it goes!

@aliashfaq946
Copy link
Author

Hi @helensilva14 ,I am wondering why it does print the message in error,
C:\Users\Lenovo\AppData\Local\Programs\Python\Python311\Lib\site-packages\ibis\backends\postgres_init_.py:162: UserWarning: Unable to create first/last aggregates: cannot execute CREATE FUNCTION in a read-only transaction
warnings.warn(f"Unable to create first/last aggregates: {e}")

Also, Yes that query is executable on postgress but it returns an integer value which makes no sense to me because in case of schema validation, i would expect if data types are matching or not.

Thanks

@helensilva14 helensilva14 assigned nj1973 and unassigned helensilva14 Nov 10, 2023
@helensilva14
Copy link
Contributor

Hi @aliashfaq946! I've been investigating your scenario and couldn't get conclusions so far, but @nj1973 will jump in now and also take a look on it. We'll provide updates once we have it, thanks!

@nj1973
Copy link
Contributor

nj1973 commented Nov 17, 2023

Hi @aliashfaq946,

I hadn't noticed these CREATE FUNCTION commands before. They appear to be used internally by Ibis and are not something we explicitly create. I'll look into why they are there.

I created a very basic user in PostgreSQL with only SELECT TABLE privileges and that account could still run validations so I'm guessing the PostgreSQL database is in some kind of read only state.

Is the PostgreSQL system a read only replica?

@aliashfaq946
Copy link
Author

aliashfaq946 commented Nov 22, 2023 via email

@nj1973
Copy link
Contributor

nj1973 commented Dec 1, 2023

Hi @aliashfaq946, my question about the read replica was to make sure I understood why the error was being triggered.

Our intention would be that using DVT against a read replica is a good thing to do. I will look into a workaround for this next week.

@nehanene15 nehanene15 self-assigned this Dec 5, 2023
@nj1973
Copy link
Contributor

nj1973 commented Dec 5, 2023

Note to self:

This is interesting. As we already understand, the CREATE FUNCTION commands fail on a read replica. However Ibis catches this and logs a warning:

            with conn.connection.cursor() as cur:
                try:
                    cur.execute(_CREATE_FIRST_LAST_AGGS_SQL)
                except Exception as e:  # noqa: BLE001
                    # a user may not have permissions to create funtions and/or aggregates
                    warnings.warn(f"Unable to create first/last aggregates: {e}")

Therefore there should be no problem, the problem seems to be with PostgreSQL transaction management: https://stackoverflow.com/a/11366092

I think our connections should be started in autocommit mode, if I set isolation_level="AUTOCOMMIT" in ibis/backends/postgres/__init__.py then commands work on a read replica:

        engine = sa.create_engine(
            alchemy_url, connect_args=connect_args, poolclass=sa.pool.StaticPool, isolation_level="AUTOCOMMIT"
        )

So we need to figure out the best way to override the Ibis connection code. To be continued...

@nehanene15
Copy link
Collaborator

A similar error was reported:
Unable to create first/last aggregates: DDL statements are only allowed outside explicit transactions.

@nj1973 That's what I found as well! I'm working on a branch to override the do_connect() method for Postgres to avoid issuing the CREATE FUNCTION queries in the first place. Should have a PR ready soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: question Request for information or clarification. Not an issue.
Projects
None yet
4 participants