Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement filtering options for the URI type #413

Closed
Tracked by #246
kgodey opened this issue Jul 16, 2021 · 12 comments · Fixed by #1079
Closed
Tracked by #246

Implement filtering options for the URI type #413

kgodey opened this issue Jul 16, 2021 · 12 comments · Fixed by #1079
Assignees
Labels
ready Ready for implementation type: enhancement New feature or request work: backend Related to Python, Django, and simple SQL

Comments

@kgodey
Copy link
Contributor

kgodey commented Jul 16, 2021

Problem

We need to ensure that records that include columns of the custom URI type support the following filters via API:

  • contains {x}
  • does not contain {x}
  • is {x}
  • is not {x}
  • is empty
  • is not empty
  • host contains {x}
  • host does not contain {x}
  • scheme is [select from list of choices of valid schemes]
  • scheme is not [select from list of choices of valid schemes]

The ones in bold are unique to the URI type.

This involves:

  • Implementing the filters in the backend
  • Updating the /api/v0/databases/<id>/types/ endpoint to store available filters on this type
    • Filter information should include the number of parameters needing to be passed in (e.g. is needs 1 parameter, is empty needs 0)
    • Filter information should also include the list of valid choices for the filter (e.g. scheme filters)

Additional context

@kgodey kgodey added this to the 07. Initial Data Types milestone Jul 16, 2021
@kgodey kgodey added ready Ready for implementation type: enhancement New feature or request work: backend Related to Python, Django, and simple SQL work: database needs: unblocking Blocked by other work and removed ready Ready for implementation labels Jul 16, 2021
@kgodey kgodey added ready Ready for implementation and removed needs: unblocking Blocked by other work labels Nov 11, 2021
@dmos62
Copy link
Contributor

dmos62 commented Dec 14, 2021

... [select from list of choices of valid schemes]

The set of valid schemes is unknown. There's a set of "official" URI schemes, but a scheme can be unofficial too. I think we'll have to ask the user to input a scheme to filter against himself.

@dmos62 dmos62 self-assigned this Dec 14, 2021
@kgodey
Copy link
Contributor Author

kgodey commented Dec 14, 2021

@dmos62 For the first version, let's use the official schemes. We want the UI to be user friendly, a non-technical user might not know what a scheme is so we should provide them some options.

@dmos62
Copy link
Contributor

dmos62 commented Dec 15, 2021

@kgodey how should we expose the supported schemes to the client-side? Maybe hardcode a set on the clientside? Alternatively, we could introduce something like a /known-uri-schemes endpoint.

Before I said "the set of valid schemes is unknown". I realised that's not true. Any scheme used in the filtered column is valid. So, a more generalizable feature would be to expose information on things like unique schemes used in this URI column, or unique domains used in that email column. Whatever we might use for equality filtering. Then use that for autocomplete. That would probably be most useful to the user.

@dmos62
Copy link
Contributor

dmos62 commented Dec 15, 2021

Quick update: sqlalchemy-filters don't support a more complex matching scenario like host contains {x}, since only like (and ilike) expressions are supported, which don't have matching expressive enough to differrentiate which of the two string.com occurances in scheme://string.com/scheme://string.com/xyz is the domain. That's presuming that we support non-escaped URIs.

scheme is {x} filtering is supported, since we can use a starts_with type of query on the URI for that.

A solution is to extend sqlalchemy-filters. Or, drop it altogether and reimplement filtering inside mathesar, but that would not be less effort in the short term.

@kgodey
Copy link
Contributor Author

kgodey commented Dec 16, 2021

Before I said "the set of valid schemes is unknown". I realised that's not true. Any scheme used in the filtered column is valid. So, a more generalizable feature would be to expose information on things like unique schemes used in this URI column, or unique domains used in that email column. Whatever we might use for equality filtering. Then use that for autocomplete. That would probably be most useful to the user.

I agree with this, perhaps it could be metadata on the column object in the column endpoint. I don't think we need anything else like a hardcoded list or a separate endpoint.

which don't have matching expressive enough to differrentiate which of the two string.com occurances in scheme://string.com/scheme://string.com/xyz is the domain. That's presuming that we support non-escaped URIs.

Can't you get which part is the domain from the type itself?

@dmos62
Copy link
Contributor

dmos62 commented Dec 16, 2021

@kgodey that's a good point. I'm looking into how to use sqlalchemy-filters with SQL functions.

@dmos62
Copy link
Contributor

dmos62 commented Dec 16, 2021

I did not find a way to use sqlalchemy-filters with SQL functions (namely, our custom SQL function that returns a URI's authority). Unless I've missed something, I'll be reimplementing sqlalchemy-filters. I could also extend it, but I don't think there's a good reason to continue using it.

@kgodey
Copy link
Contributor Author

kgodey commented Dec 16, 2021

@dmos62 Hopefully you can reuse some code from sqlalchemy-filters to make the implementation go faster.

@mathemancer
Copy link
Contributor

I did not find a way to use sqlalchemy-filters with SQL functions (namely, our custom SQL function that returns a URI's authority). Unless I've missed something, I'll be reimplementing sqlalchemy-filters. I could also extend it, but I don't think there's a good reason to continue using it.

The way to do this with sqlalchemy-filters would be to create a CTE with a function result in a column, then filter the CTE using the normal sqlalchemy-filters method. It's debatable whether that's a good idea, but it's possible.

@mathemancer
Copy link
Contributor

@dmos62 I assume you noticed the DB function that is installed with the URI type that returns the URI pieces (according to the RFC). The regex used is directly from https://datatracker.ietf.org/doc/html/rfc3986/#appendix-B . Combined with my previous comment, you should be able to achieve pretty expressive filtering.

@dmos62
Copy link
Contributor

dmos62 commented Dec 17, 2021

@mathemancer yeah, I was looking for a way to fully specify the filter via sqlalchemy-filters specification format, which doesn't seem possible if you need to use SQL functions. sqlalchemy-filters doesn't support regex either. If we're gonna preprocess the query pipeline and then use sqlalchemy-filters on top of that, might as well drop sqlalchemy-filters altogether and filter directly.

@mathemancer
Copy link
Contributor

Fair enough. I'm also not convinced sqlalchemy-filters is saving much work at this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready Ready for implementation type: enhancement New feature or request work: backend Related to Python, Django, and simple SQL
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

3 participants