Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow user to force refresh metadata #5933

Merged

Conversation

youngyjd
Copy link
Contributor

@youngyjd youngyjd commented Sep 19, 2018

Several changes in this PR:

  1. Introduced "metadata_cache_timeout": {} setting in extra of a database configuration. If schema_cache_timeout is unset in the field, cache will not be enabled for the metadata fetch of this database. If it is set to 0, cache will never time out.
  2. Added a button to sqllab UI which allows user to refetch schemas metadata and force refresh cache if cache is enabled.
  3. Added a new component which is called RefreshLabel. It is similar to CacheLabel but simpler. Not sure if we need to integrate the two labels into one component and not sure how to implement if so. Any suggestion is welcome.

Next step:

Implement similar things on fetching table list.

screen shot 2018-09-19 at 12 50 26 pm

@youngyjd youngyjd changed the title Allow user to force refresh metadata Allow user to force refresh metadata and Cache Hive schema metadata Sep 19, 2018
@youngyjd youngyjd force-pushed the introduce-cache-in-metadata-fetch branch 5 times, most recently from 5c4ea1b to 8ef4c71 Compare September 20, 2018 03:29
@codecov-io
Copy link

codecov-io commented Sep 20, 2018

Codecov Report

Merging #5933 into master will increase coverage by 14.31%.
The diff coverage is 63.63%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #5933       +/-   ##
===========================================
+ Coverage   63.52%   77.83%   +14.31%     
===========================================
  Files         393       46      -347     
  Lines       23667     9434    -14233     
  Branches     2638        0     -2638     
===========================================
- Hits        15034     7343     -7691     
+ Misses       8620     2091     -6529     
+ Partials       13        0       -13
Impacted Files Coverage Δ
superset/models/core.py 85.1% <100%> (+0.09%) ⬆️
superset/db_engine_specs.py 55.87% <50%> (+1.04%) ⬆️
superset/cache_util.py 54.28% <56.25%> (+0.11%) ⬆️
superset/views/core.py 73.9% <60%> (+0.16%) ⬆️
superset/utils.py 88.39% <0%> (-0.9%) ⬇️
superset/connectors/sqla/views.py 64.34% <0%> (ø) ⬆️
superset/views/base.py 68% <0%> (ø) ⬆️
superset/viz.py 81.04% <0%> (ø) ⬆️
superset/assets/src/components/Checkbox.jsx
... and 350 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5811a26...7e97ac8. Read the comment docs.

@youngyjd
Copy link
Contributor Author

@youngyjd youngyjd changed the title Allow user to force refresh metadata and Cache Hive schema metadata Allow user to force refresh metadata Sep 20, 2018
@youngyjd youngyjd force-pushed the introduce-cache-in-metadata-fetch branch from e589031 to 6191dbd Compare September 20, 2018 21:59
@youngyjd youngyjd force-pushed the introduce-cache-in-metadata-fetch branch 7 times, most recently from 14b7eac to adb42e6 Compare September 24, 2018 18:32
@youngyjd youngyjd force-pushed the introduce-cache-in-metadata-fetch branch from adb42e6 to 8416ca8 Compare September 24, 2018 20:51
Copy link
Member

@betodealmeida betodealmeida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, @youngyjd! I have just a couple comments, and it would be good to preserve the old endpoint.

Also — and this can be done on a separate PR, just curious — how hard would it be to show in the tooltip when the schemas were cached?

@@ -332,7 +332,7 @@ commands are invoked.
We use [Mocha](https://mochajs.org/), [Chai](http://chaijs.com/) and [Enzyme](http://airbnb.io/enzyme/) to test Javascript. Tests can be run with:

```bash
cd superset/assets/javascripts
cd superset/assets/spec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing this!

db_id = int(db_id)
force_refresh = force_refresh.lower() == 'true'
print(force_refresh)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dd

@@ -211,7 +211,12 @@ class DatabaseView(SupersetModelView, DeleteMixin, YamlExportMixin): # noqa
'gets unpacked into the [sqlalchemy.MetaData]'
'(http://docs.sqlalchemy.org/en/rel_1_0/core/metadata.html'
'#sqlalchemy.schema.MetaData) call.<br/>'
'2. The ``schemas_allowed_for_csv_upload`` is a comma separated list '
'2. The ``metadata_cache_timeout`` is a cache timeout setting '
'in second for metadata fetch of this database. Specify it as '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/second/seconds/

enable_cache=enable_cache,
cache_timeout=(self.get_extra().
get('metadata_cache_timeout', {}).
get('schema_cache_timeout')),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit hard to read — at first I thought these were 3 arguments being passed, and get was a function.

Personally I would rewrite this as:

extra = self.get_extra()
try:
    cache_timeout = extra['metadata_cache_timeout']['schema_cache_timeout']
except KeyError:
    cache_timeout = None

There's a proposal that would allow this to be written as:

cache_timeout = self.get_extra()?['metadata_cache_timeout']?['schema_cache_timeout']



def view_cache_key(*unused_args, **unused_kwargs):
args_hash = hash(frozenset(request.args.items()))
return 'view/{}/{}'.format(request.path, args_hash)


def memoized_func(timeout=5 * 60, key=view_cache_key):
def default_timetout(*unused_args, **unused_kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default_timeout

return True


def memoized_func(timeout=default_timetout,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default_timeout

@expose('/schemas/<db_id>/')
def schemas(self, db_id):
@expose('/schemas/<db_id>/<force_refresh>/')
def schemas(self, db_id, force_refresh):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to preserve the old behavior, just in case something is using the API. I think you can do here:

@expose('/schemas/<db_id>/<force_refresh>/')
@expose('/schemas/<db_id>/')
def schemas(self, db_id, force_refresh=True):

And it would handle both cases.

@youngyjd
Copy link
Contributor Author

youngyjd commented Oct 8, 2018

Comments addressed. PTAL @betodealmeida

@betodealmeida
Copy link
Member

Awesome, thanks! :)

@betodealmeida betodealmeida merged commit 712c1aa into apache:master Oct 9, 2018
betodealmeida pushed a commit to lyft/incubator-superset that referenced this pull request Oct 12, 2018
* Allow user to force refresh metadata

* fix javascript test error

* nit

* fix styling

* allow custom cache timeout configuration on any database

* minor improvement

* nit

* fix test

* nit

* preserve the old endpoint
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Oct 17, 2018
* Allow user to force refresh metadata

* fix javascript test error

* nit

* fix styling

* allow custom cache timeout configuration on any database

* minor improvement

* nit

* fix test

* nit

* preserve the old endpoint

(cherry picked from commit 712c1aa)
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Oct 17, 2018
* Allow user to force refresh metadata

* fix javascript test error

* nit

* fix styling

* allow custom cache timeout configuration on any database

* minor improvement

* nit

* fix test

* nit

* preserve the old endpoint

(cherry picked from commit 712c1aa)
(cherry picked from commit a3441c1)
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Oct 17, 2018
* Allow user to force refresh metadata

* fix javascript test error

* nit

* fix styling

* allow custom cache timeout configuration on any database

* minor improvement

* nit

* fix test

* nit

* preserve the old endpoint

(cherry picked from commit 712c1aa)
(cherry picked from commit a3441c1)
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Oct 17, 2018
* Allow user to force refresh metadata

* fix javascript test error

* nit

* fix styling

* allow custom cache timeout configuration on any database

* minor improvement

* nit

* fix test

* nit

* preserve the old endpoint

(cherry picked from commit 712c1aa)
(cherry picked from commit a3441c1)
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Nov 2, 2018
* Allow user to force refresh metadata

* fix javascript test error

* nit

* fix styling

* allow custom cache timeout configuration on any database

* minor improvement

* nit

* fix test

* nit

* preserve the old endpoint

(cherry picked from commit 712c1aa)
(cherry picked from commit a3441c1)
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Nov 2, 2018
* Allow user to force refresh metadata

* fix javascript test error

* nit

* fix styling

* allow custom cache timeout configuration on any database

* minor improvement

* nit

* fix test

* nit

* preserve the old endpoint

(cherry picked from commit 712c1aa)
(cherry picked from commit a3441c1)
bipinsoniguavus pushed a commit to ThalesGroup/incubator-superset that referenced this pull request Dec 26, 2018
* Allow user to force refresh metadata

* fix javascript test error

* nit

* fix styling

* allow custom cache timeout configuration on any database

* minor improvement

* nit

* fix test

* nit

* preserve the old endpoint
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.28.0 labels Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.28.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants