Add URI and text filters #1079

Merged
mathemancer merged 27 commits into master from add-uri-filters
Feb 28, 2022

Conversation

dmos62
Contributor

@dmos62 dmos62 commented Feb 17, 2022

Fixes #413, Fixes #406

Implements the following filters:

  • x contains y,
  • x contains (case insensitive) y,
  • x starts with (case insensitive) y,
  • x URI authority contains y,
  • x URI scheme equals y.

Other filters required in #413 and #406 have been implemented in previous PRs.
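For readers unfamiliar with the URI terms in the list above, here is a minimal Python sketch of what "scheme" and "authority" refer to. It uses the standard library's `urllib.parse` purely as an illustration of the semantics; it is not how this PR evaluates the filters, and the helper names `uri_scheme_equals` and `uri_authority_contains` are hypothetical stand-ins chosen to mirror the filter ids.

```python
from urllib.parse import urlsplit


def uri_scheme_equals(uri: str, scheme: str) -> bool:
    """Hypothetical helper: True when the URI's scheme equals `scheme`."""
    return urlsplit(uri).scheme == scheme


def uri_authority_contains(uri: str, fragment: str) -> bool:
    """Hypothetical helper: True when the authority part contains `fragment`."""
    # urlsplit exposes the authority component under the name `netloc`.
    return fragment in urlsplit(uri).netloc


example = "https://user@example.com:8080/docs?page=2"
print(urlsplit(example).scheme)  # https
print(urlsplit(example).netloc)  # user@example.com:8080
```

Note that the path (`/docs`) is not part of the authority, so an authority filter for `docs` would not match this URI.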

Technical details

The new filters look like this on the filters endpoint:

    {
        "id": "contains",
        "name": "contains",
        "parameters": [
            {
                "ui_types": [
                    "text",
                    "uri"
                ]
            },
            {
                "ui_types": [
                    "text",
                    "uri"
                ]
            }
        ]
    },
    {
        "id": "starts_with_case_insensitive",
        "name": "starts with (case insensitive)",
        "parameters": [
            {
                "ui_types": [
                    "text",
                    "uri"
                ]
            },
            {
                "ui_types": [
                    "text",
                    "uri"
                ]
            }
        ]
    },
    {
        "id": "uri_authority_contains",
        "name": "URI authority contains",
        "parameters": [
            {
                "ui_types": [
                    "uri"
                ]
            },
            {
                "ui_types": [
                    "text",
                    "uri"
                ]
            }
        ]
    },
    {
        "id": "uri_scheme_equals",
        "name": "URI scheme equals",
        "parameters": [
            {
                "ui_types": [
                    "uri"
                ]
            },
            {
                "ui_types": [
                    "text",
                    "uri"
                ]
            }
        ]
    }

Note that the Mathesar type uri is string-like, just like text, so it is allowed wherever text is allowed (see the parameter types of contains for an example). In contrast, the first parameter of uri_authority_contains can only be a uri. The same holds for uri_scheme_equals.
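As a rough illustration of how a client might consume this response, the sketch below lists the filters that can be offered for a column of a given UI type. It assumes (this is not stated in the PR) that the first parameter is the column being filtered; `FILTERS` is an abridged copy of the entries above and `filters_for_column` is a hypothetical helper.

```python
# Abridged copy of the endpoint entries shown above (names omitted).
FILTERS = [
    {"id": "contains",
     "parameters": [{"ui_types": ["text", "uri"]}, {"ui_types": ["text", "uri"]}]},
    {"id": "starts_with_case_insensitive",
     "parameters": [{"ui_types": ["text", "uri"]}, {"ui_types": ["text", "uri"]}]},
    {"id": "uri_authority_contains",
     "parameters": [{"ui_types": ["uri"]}, {"ui_types": ["text", "uri"]}]},
    {"id": "uri_scheme_equals",
     "parameters": [{"ui_types": ["uri"]}, {"ui_types": ["text", "uri"]}]},
]


def filters_for_column(ui_type, filters=FILTERS):
    """Return ids of filters whose first parameter accepts `ui_type`.

    Assumption: the first parameter represents the filtered column.
    """
    return [
        f["id"]
        for f in filters
        if ui_type in f["parameters"][0]["ui_types"]
    ]


print(filters_for_column("text"))
print(filters_for_column("uri"))
```

Under that assumption, a text column gets only the two string filters, while a uri column gets all four.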

Tests for new filters have been added.

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the master branch of the repository
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no
    visible errors.

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@dmos62 dmos62 added the affects: architecture, work: backend, and status: draft labels Feb 17, 2022
@dmos62 dmos62 added this to the [07] Initial Data Types milestone Feb 17, 2022
@dmos62 dmos62 self-assigned this Feb 17, 2022
@kgodey kgodey changed the base branch from master to replace-filtering-api February 17, 2022 17:37
Base automatically changed from replace-filtering-api to master February 17, 2022 18:01
@dmos62 dmos62 marked this pull request as ready for review February 18, 2022 16:29
@dmos62 dmos62 requested a review from a team February 18, 2022 16:29
@dmos62
Contributor Author

dmos62 commented Feb 21, 2022

Note that in this PR the filters endpoint response includes some filters multiple times. This is fixed in #1090.

@codecov-commenter

codecov-commenter commented Feb 21, 2022

Codecov Report

Merging #1079 (cb9a118) into master (02e22cb) will increase coverage by 0.13%.
The diff coverage is 97.29%.

@@            Coverage Diff             @@
##           master    #1079      +/-   ##
==========================================
+ Coverage   93.24%   93.37%   +0.13%     
==========================================
  Files         112      112              
  Lines        4084     4168      +84     
==========================================
+ Hits         3808     3892      +84     
  Misses        276      276              
Flag Coverage Δ
pytest-backend 93.37% <97.29%> (+0.13%) ⬆️

Impacted Files                              Coverage Δ
db/functions/operations/check_support.py    100.00% <ø> (ø)
db/functions/redundant.py                   92.30% <89.47%> (+4.80%) ⬆️
db/types/base.py                            97.22% <91.66%> (-2.78%) ⬇️
db/functions/base.py                        93.63% <100.00%> (+2.86%) ⬆️
db/functions/known_db_functions.py          100.00% <100.00%> (ø)
db/functions/operations/apply.py            97.05% <100.00%> (+0.76%) ⬆️
db/types/uri.py                             100.00% <100.00%> (+1.78%) ⬆️
mathesar/database/types.py                  97.61% <100.00%> (ø)
mathesar/models.py                          96.35% <100.00%> (+0.04%) ⬆️
... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@kgodey kgodey left a comment

A couple of changes requested in individual comments below. I'd also like @mathemancer to take a look at this before merge since it touches the URI type.

> Notice that in the issue #413, the x URI scheme equals y filter is specified to allow the user to choose from a list of URI schemes. This PR does not implement that.
>
> At the same time, I think that scheme suggestions is not an important feature right now, so I propose we create an issue for it and put it off.

I think we should do this, it is important for making the product friendly to non-technical users, who may not know what a URI scheme is.

> It could be done somewhat easily. Just add to the relevant parameter a hint like hint.suggested_values(("http","https","ftp","ftps",...)).
>
> This would require putting some and-or logic into the hint system though, which will be necessary down the road, but maybe not so much now.

I'm not sure why this would need "and-or logic", can you elaborate?

    # would involve creating an alternative to to_sa_expression: something like to_db_function
    # execution engine would see that to_sa_expression is not implemented, and it would look for
    # to_db_function.
    class RedundantDBFunction(DBFunction):
Contributor

I find this name confusing. Perhaps something like DBFunctionCombination would be clearer?

Contributor Author

@dmos62 dmos62 Feb 23, 2022

I think DBFunctionCombination can be even more confusing, since a regular DBFunction instance can contain other DBFunction instances as parameters, and thus could be called a combination or supporting combination.

I like "redundant", because it highlights that it's made of DBFunctions that could already be combined together without this particular DBFunction.

I'm open to other names. Maybe SecondaryDBFunction? That's a bit abstract though. DBFunctionPackage has the same problem as DBFunctionCombination, but it goes hand in hand with the abstract unpack method.

Contributor

I like DBFunctionPackage.


    class StartsWithCaseInsensitive(RedundantDBFunction):
        id = 'starts_with_case_insensitive'
        name = 'starts with (case insensitive)'
Contributor

If this name is meant to be used in the frontend, please remove (case insensitive) from the name parameter.

Contributor Author

@dmos62 dmos62 Feb 23, 2022

It is. If the suffix is removed, it will have the same display name as a regular (case-sensitive) starts with filter: a user won't be able to distinguish the two. Why do you want to make this change?

Contributor

There shouldn't be a case-sensitive "starts with" in the frontend, only the case insensitive version.

We don't want to overwhelm users with a lot of similar filtering options, it will make finding the filter they want harder. And non-technical users may not know what "case sensitive" refers to.

@kgodey kgodey assigned mathemancer and unassigned kgodey Feb 23, 2022

    class ContainsCaseInsensitive(RedundantDBFunction):
        id = 'contains_case_insensitive'
        name = 'contains (case insensitive)'
Contributor

Case insensitive comment from above applies here as well.

@dmos62
Contributor Author

dmos62 commented Feb 23, 2022

> > It could be done somewhat easily. Just add to the relevant parameter a hint like hint.suggested_values(("http","https","ftp","ftps",...)).
> >
> > This would require putting some and-or logic into the hint system though, which will be necessary down the road, but maybe not so much now.
>
> I'm not sure why this would need "and-or logic", can you elaborate?

I wasn't clear enough. These are the options I mentioned:

  1. It could be done somewhat easily. Just add to the relevant parameter a hint like hint.suggested_values(("http","https","ftp","ftps",...)).

  2. Or, give the parameter a hint specifying that it must be either just a string-like or a uri-scheme: then the frontend would know what kinds of suggestions would be appropriate there. This would require putting some and-or logic into the hint system though, which will be necessary down the road, but maybe not so much now.

If this is wanted for alpha, I'll hack something together into a dedicated PR.
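To make option 1 concrete, here is a minimal, self-contained sketch of what a suggested-values hint could look like. Everything in it is hypothetical: the dict-based hint representation, the `suggested_values` and `parameter` helpers, and the `suggestions_for` lookup are illustrations only, not the hint system in this repository. The scheme list is limited to the four schemes named above, and the second parameter is addressed with the 0-based index 1.

```python
def suggested_values(values):
    """Hypothetical hint: values a client may offer as suggestions."""
    return {"id": "suggested_values", "values": tuple(values)}


def parameter(index, *hints):
    """Hypothetical hint: attaches other hints to the parameter at `index`."""
    return {"id": "parameter", "index": index, "hints": tuple(hints)}


# Option 1: attach scheme suggestions to the second parameter (index 1).
URI_SCHEME_EQUALS_HINTS = (
    parameter(1, suggested_values(("http", "https", "ftp", "ftps"))),
)


def suggestions_for(hints, index):
    """Collect the suggested values declared for the parameter at `index`."""
    found = []
    for hint in hints:
        if hint["id"] == "parameter" and hint["index"] == index:
            for nested in hint["hints"]:
                if nested["id"] == "suggested_values":
                    found.extend(nested["values"])
    return found


print(suggestions_for(URI_SCHEME_EQUALS_HINTS, 1))
```

A frontend could treat such values as suggestions rather than constraints, so a user could still type a scheme that is not in the list.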

@dmos62
Contributor Author

dmos62 commented Feb 23, 2022

@kgodey @mathemancer note that the changes in the db/types/uri.py are just me adding URI-specific DBFunctions to that namespace. I don't make changes to the existing URI type code.

@kgodey
Contributor

kgodey commented Feb 23, 2022

> If this is wanted for alpha, I'll hack something together into a dedicated PR.

Okay, thanks, please do. Option 1 seems easiest.

@dmos62
Contributor Author

dmos62 commented Feb 24, 2022

Ready for review.

I've removed case-sensitive string filters from the filters endpoint.

Other requested changes will be submitted as dedicated PR/s. Those changes are:

@dmos62 dmos62 requested a review from kgodey February 24, 2022 12:37
Contributor

@kgodey kgodey left a comment

Looks good to me, I'll leave it to @mathemancer to review/merge.

Contributor

@mathemancer mathemancer left a comment

I mostly just had questions; the only real request is to try to unify the definitions of things like "string-like", "number-like", etc. between the definitions in the db.types.operations.cast module and the hint building logic.

Comment on lines +23 to +24
    def sa_call_sql_function(function_name, *parameters):
        return getattr(func, function_name)(*parameters)
Contributor

Awesome idea.

    @@ -225,6 +230,36 @@ def to_sa_expression(*values):
    class StartsWith(DBFunction):
        id = 'starts_with'
        name = 'starts with'
        hints = tuple([
Contributor

I'm curious why you went with this syntax rather than just

    hints = (
        hints.foo(bar),
        hints.baz,
    )

Contributor Author

I find the shorthand tuple syntax ((x,)) to be error-prone. Just yesterday I ran into an annoying bug where I had forgotten the trailing comma, and what was supposed to be a single-element tuple ended up being just the element. So I try to use the more awkward, but safer, tuple([x]). This is not a very strong preference. After getting bitten by this a few times you learn to append a comma to everything, I guess.
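The pitfall described above is easy to reproduce in plain Python. The variable names below are illustrative only:

```python
x = "hint"

not_a_tuple = (x)      # parentheses only group; this is still a str
one_tuple = (x,)       # the trailing comma is what makes the tuple
explicit = tuple([x])  # the more verbose form preferred above

print(type(not_a_tuple).__name__)        # str
print(one_tuple == explicit)             # True
print(len(not_a_tuple), len(one_tuple))  # 4 1
```

Because a string is itself iterable, code that later loops over `not_a_tuple` would silently iterate over characters instead of raising an error, which is what makes this kind of bug hard to spot.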

    @@ -73,7 +77,7 @@ class Literal(DBFunction):
         name = 'as literal'
         hints = tuple([
             hints.parameter_count(1),
    -        hints.parameter(1, hints.literal),
    +        hints.parameter(0, hints.literal),
Contributor

Why this change? (and others like it)

Contributor Author

I wanted to use 1-based indexing, but changed my mind. Everything in Python is 0-based, so I decided to go with the flow.

            raise Exception("UnpackabelDBFunction.to_sa_expression should never be used.")

        @abstractmethod
        def unpack(self):
Contributor

For my own education: Should implementations of this unpack method recurse "in place"? I.e., if I happen to have a redundant (or whatever we decide for a term) function that's composed of other redundant functions, should the unpack method call the unpack methods of those functions as well, or should that be left to the caller? I'm not sure since you'd want to avoid having the recursion in multiple places, and _db_function_to_sa_expression is already recursive.

Contributor Author

I think it's desirable not to have to do manual recursion inside unpack. I don't immediately recall if I implemented full recursion yet. I think I did, since I was concerned about infinite loops.
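The division of labour discussed here can be sketched as follows: `unpack` performs a single rewrite step with no recursion of its own, and the caller walks the tree. This is a simplified, self-contained model, not the code in this PR. The classes only mimic the names used in the diff, `Lower` and `Not` are hypothetical primitives, and expressing case insensitivity by lowercasing both operands is an assumption made for the example.

```python
from abc import ABC, abstractmethod


class DBFunction:
    """Stand-in for the PR's DBFunction: an id plus a list of parameters."""
    id = None

    def __init__(self, *parameters):
        self.parameters = list(parameters)


class RedundantDBFunction(DBFunction, ABC):
    @abstractmethod
    def unpack(self):
        """Return an equivalent expression built from other DBFunctions."""


class Lower(DBFunction):       # hypothetical primitive
    id = "lower"


class Not(DBFunction):         # hypothetical primitive
    id = "not"


class StartsWith(DBFunction):
    id = "starts_with"


class StartsWithCaseInsensitive(RedundantDBFunction):
    id = "starts_with_case_insensitive"

    def unpack(self):
        # One rewrite step only; parameters are passed through untouched.
        x, y = self.parameters
        return StartsWith(Lower(x), Lower(y))


def expand(node):
    """Caller-side recursion: unpack redundant functions wherever they occur."""
    while isinstance(node, RedundantDBFunction):
        node = node.unpack()
    if isinstance(node, DBFunction):
        node.parameters = [expand(p) for p in node.parameters]
    return node


def render(node):
    """Render an expanded tree as a readable string, for illustration."""
    if isinstance(node, DBFunction):
        return f"{node.id}({', '.join(render(p) for p in node.parameters)})"
    return repr(node)


tree = Not(StartsWithCaseInsensitive("name", "Al"))
print(render(expand(tree)))  # not(starts_with(lower('name'), lower('Al')))
```

Keeping the recursion in one place (`expand` here) means an `unpack` implementation can return other redundant functions without having to unpack them itself. The `while` loop would not terminate if a function unpacked into itself, so a real implementation would likely need a guard against that.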

Comment on lines +155 to +159
    string_like_db_types = (
        PostgresType.CHARACTER_VARYING,
        PostgresType.CHARACTER,
        PostgresType.TEXT,
        MathesarCustomType.URI,
Contributor

This sort of info is also in the cast.py file. Did you consider trying to unify these with those definitions? I ask because it makes me nervous to have this defined in more than one place.

Contributor Author

I didn't notice the duplication. I'll investigate.

@dmos62
Contributor Author

dmos62 commented Feb 25, 2022

@mathemancer could we get this merged? It's the base for a few other PRs. I'll open an issue for the duplicated logic across db.types.operations.cast and db.types.base modules and see about fixing that in a new PR.

@mathemancer mathemancer self-requested a review February 28, 2022 17:43
Contributor

@mathemancer mathemancer left a comment

Approving, since I agree we should probably get this merged to unblock downstream PRs. Please don't forget to create the issue about the duplicated definitions of type tags (string-like, number-like, etc.)

@mathemancer mathemancer merged commit 67c6439 into master Feb 28, 2022
@mathemancer mathemancer deleted the add-uri-filters branch February 28, 2022 17:46
@dmos62
Contributor Author

dmos62 commented Feb 28, 2022

@mathemancer the issue for that is here: #1100

Labels
affects: architecture, pr-status: review, work: backend
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Implement filtering options for the URI type
Implement filtering options for Text types.
4 participants