fix: dataset safe URL for explore_url #24686

Merged
merged 10 commits into from
Aug 23, 2023

dpgaspar
Member

SUMMARY

Use a more robust approach to handling custom default URLs on datasets. Previously we validated the URL itself and checked that it belonged to the same host as the request, but this approach is not foolproof.

This PR makes every explore_url on a dataset be handled as a relative URL. Relative URLs were already supported and used in this field; now they are enforced when PREVENT_UNSAFE_DEFAULT_URLS_ON_DATASET is enabled.
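
To illustrate why host validation alone can be fragile, here is a minimal sketch. It is not Superset's actual implementation: `is_same_host` is a hypothetical helper, and the hosts are placeholders. It accepts a URL when Python's parser finds no host or a host equal to the request host, and shows an input where the parser sees no host even though a lenient client may still resolve the string to an external site.

```python
from urllib.parse import urlsplit


def is_same_host(url: str, request_host: str) -> bool:
    """Hypothetical check: accept relative URLs and URLs on the request host."""
    netloc = urlsplit(url).netloc
    # An empty netloc is interpreted here as "relative, therefore safe".
    return netloc == "" or netloc == request_host


# A plain relative path is accepted.
print(is_same_host("/explore/", "localhost:8088"))  # True

# An absolute URL on another host is rejected.
print(is_same_host("http://www.google.com", "localhost:8088"))  # False

# urlsplit reports an empty netloc for this input, so the check passes,
# although a lenient client may still treat it as an external URL.
print(is_same_host("https:///www.google.com", "localhost:8088"))  # True
```

Treating the stored value purely as a relative path removes the need to decide which host a string points to.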

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

codecov bot commented Jul 13, 2023

Codecov Report

Merging #24686 (ece4a9a) into master (a50c43e) will decrease coverage by 0.02%.
The diff coverage is 71.42%.

❗ Current head ece4a9a differs from pull request most recent head 154ab3d. Consider uploading reports for the commit 154ab3d to get more accurate results

@@            Coverage Diff             @@
##           master   #24686      +/-   ##
==========================================
- Coverage   68.99%   68.98%   -0.02%     
==========================================
  Files        1903     1903              
  Lines       74077    74054      -23     
  Branches     8194     8196       +2     
==========================================
- Hits        51112    51085      -27     
- Misses      20844    20847       +3     
- Partials     2121     2122       +1     
| Flag | Coverage Δ | |
|---|---|---|
| hive | 54.15% <100.00%> (+0.01%) | ⬆️ |
| javascript | 55.83% <60.00%> (-0.01%) | ⬇️ |
| mysql | 79.20% <100.00%> (-0.01%) | ⬇️ |
| postgres | 79.30% <100.00%> (-0.01%) | ⬇️ |
| presto | 54.05% <100.00%> (+0.01%) | ⬆️ |
| python | 83.35% <100.00%> (-0.01%) | ⬇️ |
| sqlite | 77.89% <100.00%> (+<0.01%) | ⬆️ |
| unit | 54.98% <100.00%> (-0.02%) | ⬇️ |

Flags with carried forward coverage won't be shown.

| Files Changed | Coverage Δ | |
|---|---|---|
| superset/config.py | 92.17% <ø> (ø) | |
| superset/datasets/commands/exceptions.py | 93.93% <ø> (-0.27%) | ⬇️ |
| superset/datasets/commands/update.py | 93.75% <ø> (-0.37%) | ⬇️ |
| superset/views/base.py | 73.24% <ø> (ø) | |
| superset-frontend/src/pages/DatasetList/index.tsx | 57.57% <60.00%> (-0.19%) | ⬇️ |
| superset/utils/urls.py | 100.00% <100.00%> (+2.77%) | ⬆️ |
| superset/views/datasource/views.py | 93.38% <100.00%> (-0.22%) | ⬇️ |

... and 1 file with indirect coverage changes


@dpgaspar dpgaspar requested a review from kgabryje July 13, 2023 13:36
@dpgaspar dpgaspar requested a review from kgabryje July 14, 2023 13:28
@john-bodley
Member

@dpgaspar I was wondering whether there would be merit in adding some frontend tests given that it seems the explore URL logic is handled completely by the frontend.

@@ -41,6 +41,7 @@ assists people when migrating to a new version.

### Breaking Changes

- [24686](https://github.com/apache/superset/pull/24686): All custom explore_url values on datasets are handled as relative URLs on the frontend; the behaviour is controlled by PREVENT_UNSAFE_DEFAULT_URLS_ON_DATASET.
Member

@dpgaspar since we were already validating that all explore_urls were of the same domain, what would possibly break with this change?

Member Author

Every explore_url is now handled as relative, so an existing URL such as http://www.google.com will result in the following link: http://localhost:8088/tablemodelview/list/http://www.google.com
Previously we supported both absolute and relative URLs. Internally, when no value was provided, we handled the default as relative (the dataset link to explore).
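
The effect described above can be sketched roughly as follows. This is only an illustration in Python, not the frontend code from this PR: `resolve_as_relative` is a hypothetical helper that always treats the stored value as a path under the current origin and never as an absolute URL.

```python
from urllib.parse import urlsplit


def resolve_as_relative(current_page: str, explore_url: str) -> str:
    """Hypothetical helper: resolve explore_url as a path, never as an absolute URL."""
    page = urlsplit(current_page)
    origin = f"{page.scheme}://{page.netloc}"
    if explore_url.startswith("/"):
        # Root-relative path: resolve against the origin of the current page.
        return origin + explore_url
    # Anything else, including a string that looks like an absolute URL,
    # is appended to the directory of the current page.
    if page.path.endswith("/"):
        base_path = page.path
    else:
        base_path = page.path.rsplit("/", 1)[0] + "/"
    return origin + base_path + explore_url


print(resolve_as_relative(
    "http://localhost:8088/tablemodelview/list/", "http://www.google.com"
))
# http://localhost:8088/tablemodelview/list/http://www.google.com
```

A root-relative value such as `/explore/` (a placeholder path) would resolve to `http://localhost:8088/explore/` with the same helper, which is the case that keeps working after this change.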

Member

Is there any way that we can make this somewhat backwards compatible? Maybe a migration that trims off the scheme/domain/host and attempts to make the url into a relative one if it can? Like if ./explore. is in the string at least?
Otherwise, we may want to consider this PR for a major version bump. cc @michael-s-molina for thoughts.

Member Author

Yes, I do think this breaks backward compatibility, but users can still opt in to the old behaviour, which I think is safer and more explicit than changing metadata with a DB migration.
Ultimately, I do question the utility of allowing users to set their own custom default URL for datasets.

@michael-s-molina is it possible to include this one in 3.0.0?
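
For reference, opting back in to the old behaviour would presumably look like the override below. This is a sketch under two assumptions: that local overrides live in a `superset_config.py` file, and that setting the flag named in this PR to `False` is what restores the previous handling of absolute URLs.

```python
# superset_config.py (local override file; the location is an assumption)

# Assumption: False disables the enforcement introduced by this PR, so
# absolute explore_url values are accepted again. Leave the flag enabled
# unless existing datasets depend on absolute URLs.
PREVENT_UNSAFE_DEFAULT_URLS_ON_DATASET = False
```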

Member

Ok, perfect, then, I tagged it for 3.0.

@dpgaspar dpgaspar requested a review from eschutho July 28, 2023 11:16
@dpgaspar dpgaspar added v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch and removed v2.1 2.1.1 labels Jul 28, 2023
@dpgaspar
Member Author

> @dpgaspar I was wondering whether there would be merit in adding some frontend tests given that it seems the explore URL logic is handled completely by the frontend.

yes, added frontend tests

@michael-s-molina michael-s-molina removed the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Aug 16, 2023
@dpgaspar dpgaspar added the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Aug 17, 2023
@john-bodley john-bodley removed the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Aug 22, 2023
@eschutho eschutho added the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Aug 22, 2023
@michael-s-molina michael-s-molina merged commit a9efd4b into apache:master Aug 23, 2023
29 checks passed
michael-s-molina pushed a commit that referenced this pull request Aug 23, 2023
@eschutho eschutho added the risk:breaking-change Issues or PRs that will introduce breaking changes label Oct 2, 2023
@mistercrunch mistercrunch added 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.1.0 labels Mar 8, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels risk:breaking-change Issues or PRs that will introduce breaking changes size/L v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🚢 3.1.0
6 participants