Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent large objects from being stored in the RTIF #38094

Merged
merged 16 commits into from
Mar 25, 2024

Conversation

ephraimbuddy
Copy link
Contributor

There's no control over the size of objects stored in the rendered taskinstance field. This PR adds control and enable users to be able to customize the size of data that can be stored in this field

closes: #28199

@ephraimbuddy ephraimbuddy force-pushed the limit-templated-field-size branch 2 times, most recently from bdd600e to 6a10c94 Compare March 14, 2024 17:23
@ephraimbuddy ephraimbuddy force-pushed the limit-templated-field-size branch 5 times, most recently from 678cb87 to ffc67d1 Compare March 21, 2024 07:23
@ephraimbuddy
Copy link
Contributor Author

ephraimbuddy commented Mar 21, 2024

There's something wrong with MySQL around templated fields. cc @uranusjr .

Oh, seen another test failure that's not mysql

@ephraimbuddy
Copy link
Contributor Author

I will like to include this in the next beta/rc, please review @uranusjr @jedcunningham @potiuk

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. with one nit - but up to you to address it.

airflow/config_templates/config.yml Outdated Show resolved Hide resolved
airflow/serialization/helpers.py Outdated Show resolved Hide resolved
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. The prefix is cool.

There is one worry with that U just wanted to mention it.

I understand that we are masking secreates before serialization ? Just wanted to amke sure sure of it - it looks like from the code , but I wanted to make sure we are not showing half-unmasked secrets accidentally when we truncate the secrets.

@ephraimbuddy
Copy link
Contributor Author

Nice. The prefix is cool.

There is one worry with that U just wanted to mention it.

I understand that we are masking secreates before serialization ? Just wanted to amke sure sure of it - it looks like from the code , but I wanted to make sure we are not showing half-unmasked secrets accidentally when we truncate the secrets.

The secrets are masked after serialization see:

. I have added a test and will also do manual testing

@ephraimbuddy
Copy link
Contributor Author

ephraimbuddy commented Mar 22, 2024

Nice. The prefix is cool.
There is one worry with that U just wanted to mention it.
I understand that we are masking secreates before serialization ? Just wanted to amke sure sure of it - it looks like from the code , but I wanted to make sure we are not showing half-unmasked secrets accidentally when we truncate the secrets.

The secrets are masked after serialization see:

. I have added a test and will also do manual testing

It doesn't work so
Resolved

@ephraimbuddy ephraimbuddy force-pushed the limit-templated-field-size branch 3 times, most recently from 11843f5 to 987d980 Compare March 24, 2024 00:08
@ephraimbuddy ephraimbuddy merged commit 4207099 into apache:main Mar 25, 2024
46 checks passed
@ephraimbuddy ephraimbuddy deleted the limit-templated-field-size branch March 25, 2024 08:35
@ephraimbuddy ephraimbuddy added the type:new-feature Changelog: New Features label Mar 27, 2024
utkarsharma2 pushed a commit to astronomer/airflow that referenced this pull request Apr 22, 2024
* Prevent large objects from being stored in the RTIF

There's no control over the size of objects stored in the rendered taskinstance
field. This PR adds control and enable users to be able to customize the size of
data that can be stored in this field

closes: apache#28199

* fixup! Prevent large objects from being stored in the RTIF

* Use len and check the size of the serialized

* Add test to already existing test

* Remove xcom db clearing

* Update airflow/config_templates/config.yml

Co-authored-by: Jed Cunningham <[email protected]>

* Apply review suggestions

* Add test for redacting values and add significant item

* Prefix with Truncated line

* Ensure secrets are masked

* fixup! Ensure secrets are masked

* Check the template field length in separate branches for jsonable and nonjsonable

* add tests for rendered dataframe

* Apply suggestions from code review

* update code and tests

* fixup! update code and tests

---------

Co-authored-by: Jed Cunningham <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prevent large objects from being stored in RenderedTaskInstanceFields
5 participants