Make certain search (only) fields case insensitive #322

Open
nickdos opened this issue Feb 5, 2019 · 4 comments

@nickdos
Contributor

nickdos commented Feb 5, 2019

See also #76.

I think there is a case to at least create a copyField version of the taxon_name field. The problem is that this field is currently case sensitive, so the user has to know the case of the indexed value in order to search for it. E.g. Acacia dealbata works but acacia dealbata returns nothing. Even worse is the search for ANIMALIA: you have to search for it in all caps to find records for that name.
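
To illustrate the difference, here is a minimal sketch in plain Python. It is not the actual Solr schema or biocache-service code; it only mimics an exact-match String field versus a copy of that field that is lowercased at both index and query time. The function names and sample values are hypothetical.

```python
# Toy model of a String-type field vs. a lowercased copy of it.
# This is NOT Solr; it only mimics exact matching vs. normalised matching.

indexed_names = ["Acacia dealbata", "ANIMALIA"]


def match_string_field(query, values):
    """Exact, case-sensitive match, like a String-type field."""
    return [v for v in values if v == query]


def match_lowercased_copy(query, values):
    """Match against a copy that is normalised at index and query time."""
    index = {}
    for v in values:
        # casefold() is a stronger lower() intended for caseless matching
        index.setdefault(v.casefold(), []).append(v)
    return index.get(query.casefold(), [])


print(match_string_field("acacia dealbata", indexed_names))
print(match_lowercased_copy("acacia dealbata", indexed_names))
print(match_lowercased_copy("animalia", indexed_names))
```

In this toy model the first lookup finds nothing, while the two lookups against the normalised copy return the originally indexed values whatever case the user typed.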

This is a problem in the batch taxon search form, where we allow users to search with multiple names. It works fine for raw_name (a case-insensitive version of raw_taxon_name, I think) but not for taxon_name (an enhancement I'm working on).

@djtfmartin suggested in #76 that we use the plain q=acacia dealbata query, but this fails for terms like acacia and animalia. A bare query is replaced by text:foo, so it also searches other fields such as the various remarks fields. That blows out the record count, because it brings back records that only mention those terms in those other fields.
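
A rough sketch of why the catch-all search inflates counts, using an in-memory list instead of the real index. The records and the occurrence_remarks field name are made up for illustration; only taxon_name comes from the discussion above.

```python
# Toy records; values and the "occurrence_remarks" field name are hypothetical.
records = [
    {"taxon_name": "Acacia dealbata", "occurrence_remarks": ""},
    {"taxon_name": "Eucalyptus regnans",
     "occurrence_remarks": "growing near acacia scrub"},
    {"taxon_name": "Acacia", "occurrence_remarks": ""},
]


def search_field(term, field):
    """Case-insensitive term match restricted to a single field."""
    t = term.casefold()
    return [r for r in records if t in r[field].casefold().split()]


def search_catch_all(term):
    """Mimics a bare q=term query that falls back to a catch-all text field."""
    t = term.casefold()
    return [
        r for r in records
        if any(t in str(v).casefold().split() for v in r.values())
    ]


print(len(search_field("acacia", "taxon_name")))  # name matches only
print(len(search_catch_all("acacia")))            # also counts remarks hits
```

Here the field-restricted search returns the two Acacia records, while the catch-all search also returns the Eucalyptus record because its remarks mention the term.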

There are probably a few other fields where we want to do this as well - will update if I think of them.

@ansell
Contributor

ansell commented Feb 5, 2019

If taxon_name:ANIMALIA is only matching against raw_taxon_name, the result is expected, but still not what we want to happen ideally. Almost no one ever submits records that include ANIMALIA.

@nickdos
Contributor Author

nickdos commented Feb 5, 2019

@ansell taxon_name matches the processed/matched name. You're thinking of raw_name, which is a textgen version of raw_taxon_name (String type). This issue is about creating a similar text type field for taxon_name that humans can use to search for accepted names.

@nickdos
Contributor Author

nickdos commented Feb 5, 2019

The alternative to creating a case-insensitive version of taxon_name is to have a pseudo-field that takes the input name/s and does a lookup against the name_matching_index, which then matches to a GUID and then searches the index with the GUID. I thought we had something like this already but I couldn't work out what it was. @adam-collins or @djtfmartin does this ring a bell?
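
Roughly what I have in mind for the lookup-then-search approach, as a sketch only. The dictionary stands in for the name matching index, and the GUID values, the taxon_guid field name and the function name are all hypothetical, not existing biocache-service code.

```python
# Hypothetical stand-in for the name matching index: lowercased name -> GUID.
NAME_INDEX = {
    "acacia dealbata": "guid-0001",
    "animalia": "guid-0002",
}


def names_to_guid_query(names, guid_field="taxon_guid"):
    """Resolve names to GUIDs, then build an OR query on the GUID field.

    Returns the query string and the list of names that did not match.
    """
    matched, unmatched = [], []
    for name in names:
        # Normalise whitespace and case before the lookup
        key = " ".join(name.split()).casefold()
        guid = NAME_INDEX.get(key)
        if guid is None:
            unmatched.append(name)
        elif guid not in matched:
            matched.append(guid)
    query = " OR ".join(f'{guid_field}:"{g}"' for g in matched)
    return query, unmatched


query, unmatched = names_to_guid_query(["Acacia  Dealbata", "ANIMALIA", "Unknownus"])
print(query)
print(unmatched)
```

Returning the unmatched names separately would let a batch search form tell the user which inputs could not be resolved, rather than silently dropping them.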

Edit: biocache-hubs/ala-hub does this but I'm talking about on the biocache-service side only.

Edit: Adam advises that the taxa field does work on the service side as well. It's not listed in /index/fields, which is why I was not aware of it. I think this should solve the immediate issue at hand.


3 participants