Option to prioritize exact matches over fuzzy ones #183

Open
yangm97 opened this issue Aug 9, 2021 · 13 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@yangm97

yangm97 commented Aug 9, 2021

No description provided.

@cantino
Owner

cantino commented Aug 19, 2021

I assume you mean when MCFLY_FUZZY is enabled?

@yangm97
Author

yangm97 commented Aug 21, 2021

yes

@cantino cantino added enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed labels Aug 23, 2021
@cantino
Owner

cantino commented Aug 23, 2021

Seems reasonable.

@nedbat

nedbat commented Sep 6, 2021

I like fuzzy matching so that I don't have to remember the exact punctuation in a file name. But I'm finding that using it completely removes the claimed benefits of McFly's intelligence:
[image]
The command I want is the one I ran 10 minutes ago, at position 6. The top choice isn't even the best fuzzy match for the words I've typed. Am I doing something wrong? I have these settings:

$ env | grep MCFLY
MCFLY_FUZZY=true
MCFLY_RESULTS=30
MCFLY_HISTORY_LIMIT=10000
MCFLY_SESSION_ID=UzHARz6EfjOqhxy8VIvT9Bcg
MCFLY_HISTORY=/var/folders/10/4sn2sk3j2mg5m116f08_367m0000gq/T/mcfly.XXXXXXXX.EbZr6lxT

@cantino
Owner

cantino commented Sep 6, 2021

I don't personally use fuzzy matching because I agree that it's of lower quality.

@nedbat

nedbat commented Sep 6, 2021

Is there some way to improve it? It seems a shame to offer a setting which seems to negate the primary claim of the tool (intelligent history).

@cantino
Owner

cantino commented Sep 11, 2021

I'd be open to contributions that improve it. It was contributed by a user and isn't a feature I use myself. I prefer the non-fuzzy matching for how I tend to use mcfly.

@dmfay
Contributor

dmfay commented Oct 27, 2021

@nedbat matches are weighted by length, per #103 (comment)

Having used it for a while myself, I agree the balance could stand to shift further towards shorter matches. The easiest tweak is to add a FUZZY_FACTOR to that weighting algorithm; even better if it's configurable, so that many people can try out different factors and speed up the process of converging on a generally useful default.

@dmfay
Contributor

dmfay commented Nov 6, 2021

With 0.5.10 just out the fuzzy experience should be dramatically improved. If you have MCFLY_FUZZY=true you'll start with a "fuzzy factor" of 2, or you can set the environment variable to another integer value. Higher values of MCFLY_FUZZY favor shorter and earlier matches; 0 turns it off.

In my testing a fuzzy factor of 1 didn't do quite enough to prioritize what I was searching for, and 10+ weighed brevity and start position too heavily over the built-in rank. As I mentioned in the readme I expect the best results to be in the 2-5 range, but I also only have my own history to test with. If you have the time to try a few different settings please report how it works out for you and what MCFLY_FUZZY value you settle on!
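
To make the idea of a "fuzzy factor" a bit more concrete, here is a minimal Python sketch. It is not McFly's actual ranking code: the function names, the penalty formula, and the sample numbers are all assumptions, chosen only to illustrate how a larger factor can let a short, early match overtake a candidate with a better built-in rank.

```python
def adjusted_score(base_rank, match_start, match_width, line_length, fuzzy_factor):
    """Hypothetical score (higher is better); not McFly's real formula.

    base_rank stands in for the tool's built-in rank. The penalty grows with
    how late the match starts and how wide it is, scaled by fuzzy_factor.
    """
    if fuzzy_factor <= 0:
        # In this sketch, 0 simply disables the length/position weighting.
        return base_rank
    penalty = (match_start + match_width) / line_length
    return base_rank - fuzzy_factor * penalty


# Two made-up candidates:
# (label, base_rank, match_start, match_width, line_length)
CANDIDATES = [
    ("long, late match with a strong built-in rank", 2.0, 20, 30, 60),
    ("short, early match with a weaker built-in rank", 0.5, 0, 6, 60),
]


def best_candidate(fuzzy_factor):
    """Return the label of the highest-scoring candidate for a given factor."""
    return max(
        CANDIDATES,
        key=lambda c: adjusted_score(c[1], c[2], c[3], c[4], fuzzy_factor),
    )[0]


for factor in (0, 1, 5):
    print(factor, "->", best_candidate(factor))
```

With these invented numbers, the built-in rank still wins at factors 0 and 1, and the short, early match takes over at 5. Where that crossover falls in practice depends entirely on the real weighting and on your own history.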

@alfonz19

Maybe I'm not getting it, but for me fuzzy search does not work, at least not as I'd expect it to. Having set export MCFLY_FUZZY=2 (same with 1), I press Ctrl-R in bash and type "" and it finds something, but the first command in the list does not contain at all. Shouldn't it contain it?

I used the fzf command to search my history file before, and I like its syntax. Might it work here as well? A space is the delimiter and each entered word has to be present in the line, with the possibility of negating a word using !word; searches can be made case-sensitive or case-insensitive, etc.

@dmfay
Contributor

dmfay commented Aug 30, 2022

I thought it might have something to do with quotes, but no, that seems to work:

[image: 1661864453]

Can you double-check env | grep -i mcfly?

@alfonz19

Double-checked:
[image]

It was just a wrong expectation on my part.

Explanation:
If I have the line "Pretty horse finished last", then the following search strings will match: "pest", "p e st" or "p e st". So it probably means that when fuzzy search is on, we can think of the entered search string as a regular expression with ".*" automatically added after each character. That's fine, yet I'd say this blind fishing is less beneficial than interpreting the input as: there is some command with the word 'pest' in it. Sure, for trivial stuff it does not matter. And if I'm searching for 2 words, I can use the search string word1%word2, but in that case I have to know the correct order of the words in the command. I'd say that for long commands with lots of options, the longer the command is, the less useful matching arbitrary single characters becomes, and the harder it is to remember the correct order of the words to search for. In this specific case it would be more beneficial to interpret the search pattern "word1 word2 word3" as "search for lines containing all 3 of these words in any order." Maybe this is already covered, and sorry about this comment in that case; I just thought it was done via this option.
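
The two interpretations described above can be sketched in Python. This is only an illustration of the mental model in this comment, not how McFly or fzf actually work; the function names are made up for the example, and case-insensitive matching is an assumption.

```python
import re


def matches_fuzzy(query: str, line: str) -> bool:
    """Treat the query as a regex with '.*' between every character."""
    # Escape each character so punctuation in the query is matched literally.
    pattern = ".*".join(re.escape(ch) for ch in query)
    return re.search(pattern, line, re.IGNORECASE) is not None


def matches_all_words(query: str, line: str) -> bool:
    """Require every space-separated word to appear in the line, in any order."""
    haystack = line.lower()
    return all(word in haystack for word in query.lower().split())


line = "Pretty horse finished last"

print(matches_fuzzy("pest", line))            # True
print(matches_fuzzy("p e st", line))          # True
print(matches_fuzzy("last horse", line))      # False: wrong order
print(matches_all_words("last horse", line))  # True: order is ignored
print(matches_all_words("pest", line))        # False: no such word in the line
```

The per-character model accepts "pest" even though no such word appears in the line, while the word-based model finds "last horse" regardless of order.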

@dmfay
Contributor

dmfay commented Aug 30, 2022

> So it probably means when fuzzy search is on, we can think of entered search string as regular expression, where there is ".*" automatically added after each char.

Not quite: it also prioritizes shorter and earlier matches (higher MCFLY_FUZZY numbers make this more important relative to the other ranking criteria). Your example search pest spans the entire string, because the only t after an s is in the word "last", so it scores poorly on match length.

Borrowing your example further, pretty will match very well, and prt very slightly better as it's shorter (but it's more ambiguous, and you might see other results you aren't looking for); hrse will also do alright. finished/fnshd isn't as good a search because it's both later and "wider"; preto will be much more effective than that, having the same width of 8 and a start position of 0 instead of 14.
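
For anyone who wants to play with the "start position" and "width" of a match, here is a rough Python sketch that finds the shortest in-order match of a query in a line. It is an assumption about how such a span could be measured, not McFly's implementation; best_span is a hypothetical helper, positions are zero-based, and widths count both endpoints, so absolute values may not line up exactly with the figures quoted above.

```python
def best_span(query: str, line: str):
    """Return (start, width) of the shortest in-order match, or None.

    Earlier starts win ties. Matching is case-insensitive.
    """
    q = query.lower()
    text = line.lower()
    if not q:
        return None
    best = None
    for start, ch in enumerate(text):
        if ch != q[0]:
            continue
        # Greedily take the earliest occurrence of each remaining character.
        pos = start
        for want in q[1:]:
            pos = text.find(want, pos + 1)
            if pos == -1:
                break
        else:
            width = pos - start + 1
            if best is None or width < best[1]:
                best = (start, width)
    return best


line = "Pretty horse finished last"

for query in ("pest", "pretty", "prt", "hrse", "finished", "fnshd"):
    print(query, best_span(query, line))
```

Under these counting conventions, pest covers all 26 characters of the line, prt is narrower than pretty, and finished and fnshd share exactly the same span, which starts later than hrse.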
