Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(sqllab): Cleanup /tables/... endpoint #21284

Merged

Conversation

john-bodley
Copy link
Member

@john-bodley john-bodley commented Sep 1, 2022

SUMMARY

The /tables/... endpoint was augmented (#1466) when SQL Lab supported:

  1. Fetching tables/views across either a single or all schemas, i.e., the SEE TABLE SCHEMA functioned (if configured) without having a schema specified.
  2. Backend filtering.

however the endpoint is now always called with a specified schema and filtering (substring table name matching) is never invoked.

Given these observations this PR refactors the endpoint—by removing unnecessary logic and providing a more performant ORM query given that all table/views are from a specific schema—and reduces the size of the response payload to improve performance (especially for schemas with a large number of tables/views).

Additionally as a byproduct of the refactor this PR removes:

  1. The (now unused) get_all_table_names_in_database and get_all_view_names_in_database utility methods as well as the get_all_datasource_names DB engine spec method.
  2. The (now unused) dbs.allow_multi_schema_metadata_fetch column
  3. The non-functional MAX_TABLE_NAMES config key, i.e., the logic (which potentially was wrong) only capped the number of tables/views if and only if a filter was defined, however this was never invoked.
  4. The superset update-datasources-cache CLI command.
  5. Unnecessary keys in the response payload (to reduce the payload size).

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

Updated unit tests and manually tested.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

cc: @justinpark

@john-bodley john-bodley force-pushed the john-bodley--refactor-tables-views branch 2 times, most recently from 880dcb9 to 0baba51 Compare September 1, 2022 04:18
@john-bodley john-bodley marked this pull request as ready for review September 1, 2022 04:26
@john-bodley john-bodley requested a review from a team as a code owner September 1, 2022 04:26
@john-bodley john-bodley force-pushed the john-bodley--refactor-tables-views branch 2 times, most recently from b7be637 to 88367ff Compare September 1, 2022 16:27
Copy link
Member

@ktmud ktmud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a couple of questions

"allow_multi_schema_metadata_fetch",
sa.Boolean(),
nullable=True,
default=True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the default be false? I think enabling multi schema fetch for all data sources is a bigger risk than disabling it for all.

@expose("/tables/<int:db_id>/<schema>/<substr>/<force_refresh>/<exact_match>")
def tables( # pylint: disable=too-many-locals,no-self-use,too-many-arguments
@expose("/tables/<int:db_id>/<schema>/")
@expose("/tables/<int:db_id>/<schema>/<force_refresh>/")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are here, can we change force_refresh to a search params, too?

@john-bodley john-bodley force-pushed the john-bodley--refactor-tables-views branch 3 times, most recently from f13fd45 to 15f0179 Compare September 5, 2022 04:47
@codecov
Copy link

codecov bot commented Sep 5, 2022

Codecov Report

Merging #21284 (f44908c) into master (094400c) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master   #21284   +/-   ##
=======================================
  Coverage   66.55%   66.56%           
=======================================
  Files        1791     1791           
  Lines       68591    68497   -94     
  Branches     7319     7319           
=======================================
- Hits        45651    45595   -56     
+ Misses      21049    21011   -38     
  Partials     1891     1891           
Flag Coverage Δ
hive 53.03% <35.71%> (+0.09%) ⬆️
javascript 52.73% <100.00%> (ø)
mysql 80.84% <100.00%> (+0.06%) ⬆️
postgres 80.89% <100.00%> (+0.06%) ⬆️
presto 52.93% <35.71%> (+0.09%) ⬆️
python 81.36% <100.00%> (+0.06%) ⬆️
sqlite 79.43% <100.00%> (+0.05%) ⬆️
unit 51.02% <35.71%> (+0.06%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...frontend/src/components/DatabaseSelector/index.tsx 95.23% <ø> (ø)
.../CRUD/data/database/DatabaseModal/ExtraOptions.tsx 83.33% <ø> (ø)
superset/cli/update.py 0.00% <ø> (ø)
superset/config.py 91.58% <ø> (-0.03%) ⬇️
superset/dashboards/schemas.py 99.46% <ø> (-0.01%) ⬇️
superset/databases/api.py 95.49% <ø> (ø)
superset/db_engine_specs/base.py 89.56% <ø> (+1.23%) ⬆️
superset/db_engine_specs/hive.py 87.35% <ø> (+0.24%) ⬆️
superset/db_engine_specs/presto.py 87.78% <ø> (-0.18%) ⬇️
superset/db_engine_specs/sqlite.py 96.42% <ø> (-0.87%) ⬇️
... and 6 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@john-bodley john-bodley force-pushed the john-bodley--refactor-tables-views branch from 15f0179 to 3dd8fc8 Compare September 5, 2022 05:54
@justinpark
Copy link
Member

@john-bodley please mark Removes existing feature or API since this commit removes allow_multi_schema_metadata_fetch (Allow Multi Schema Metadata Fetch) from Database option.

Copy link
Member

@justinpark justinpark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@john-bodley john-bodley merged commit eac6fdc into apache:master Sep 13, 2022
@john-bodley john-bodley deleted the john-bodley--refactor-tables-views branch September 13, 2022 15:22
@EugeneTorap
Copy link
Contributor

EugeneTorap commented Sep 14, 2022

Hi @john-bodley, the PR reproduces this tables loading issue in master branch. Can you fix it in next PR?
Can we also cover the case with a unit test?

Screenshot 2022-09-14 at 14 33 03

john-bodley added a commit to john-bodley/superset that referenced this pull request Sep 14, 2022
sadpandajoe added a commit to preset-io/superset that referenced this pull request Sep 15, 2022
@betodealmeida
Copy link
Member

betodealmeida commented Sep 16, 2022

@john-bodley It looks like this broke loading tables in SQL Lab... I'm getting 404 now when it tries to load the list.

Just saw your fix.

eschutho added a commit that referenced this pull request Sep 16, 2022
jinghua-qa pushed a commit to preset-io/superset that referenced this pull request Sep 19, 2022
Co-authored-by: Elizabeth Thompson <[email protected]>
(cherry picked from commit 8c16806)
mayurnewase added a commit to mayurnewase/incubator-superset that referenced this pull request Sep 22, 2022
@john-bodley
Copy link
Member Author

Apologies this PR did not adhere to SIP-15, i.e., the database migration should have been added in a future PR and included in the subsequent major release.

@rusackas
Copy link
Member

@john-bodley someone was just pointing out that this PR broke a sync job. I think technically since this changes the endpoints/URIs of the API, this would be considered a breaking change, namely as things move from

@expose("/tables/<int:db_id>/<schema>/<substr>/")
@expose("/tables/<int:db_id>/<schema>/<substr>/<force_refresh>/")
@expose("/tables/<int:db_id>/<schema>/<substr>/<force_refresh>/<exact_match>")

to

@expose("/tables/<int:db_id>/<schema>/")
@expose("/tables/<int:db_id>/<schema>/<force_refresh>/")
@deprecated(new_target="api/v1/database/<int:pk>/tables/")

I'm not sure if there's a way to patch this to be non-breaking at this point, but maybe adding a breaking change notice to UPDATING.md would at least raise awareness.

@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.1.0 and removed 🚢 2.1.3 labels Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XL 🚢 2.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants