License: MIT License

CoffeeScript 100.00%

fuzzaldrin's Introduction

Atom and all repositories under Atom will be archived on December 15, 2022. Learn more in our official announcement

fuzzaldrin

Fuzzy filtering and string scoring.

This library is used by Atom and so its focus will be on scoring and filtering paths, methods, and other things common when writing code. It therefore will specialize in handling common patterns in these types of strings such as characters like /, -, and _, and also handling of camel cased text.

Using

npm install fuzzaldrin

filter(candidates, query, [options])

Sort and filter the given candidates by matching them against the given query.

candidates - An array of strings or objects.
query - A string query to match each candidate against.
options - An optional object with the following keys:
- key - The property to use for scoring if the candidates are objects.
- maxResults - The maximum number of results to return.

Returns an array of candidates sorted by best match against the query.

{filter} = require 'fuzzaldrin'

# With an array of strings
candidates = ['Call', 'Me', 'Maybe']
results = filter(candidates, 'me')
console.log(results) # ['Me', 'Maybe']

# With an array of objects
candidates = [
  {name: 'Call', id: 1}
  {name: 'Me', id: 2}
  {name: 'Maybe', id: 3}
]
results = filter(candidates, 'me', key: 'name')
console.log(results) # [{name: 'Me', id: 2}, {name: 'Maybe', id: 3}]

score(string, query)

Score the given string against the given query.

string - The string to score.
query - The query to score the string against.

{score} = require 'fuzzaldrin'

score('Me', 'me')    # 0.17099999999999999
score('Maybe', 'me') # 0.0693

Developing

git clone https://github.com/atom/fuzzaldrin.git
cd fuzzaldrin
npm install
npm test

You can run the benchmarks using:

npm run benchmark

fuzzaldrin's Issues

Wrongly ordered candidates with exact file name match

var candidates, filter, results;

filter = require('fuzzaldrin').filter;

candidates = ['test/components/core/application/applicationPageStateServiceSpec.js', 'test/components/core/view/components/actions/actionsServiceSpec.js'];

results = filter(candidates, 'actionsServiceSpec.js');

console.log(results);

This is what I get

[ 'test/components/core/application/applicationPageStateServiceSpec.js',
  'test/components/core/view/components/actions/actionsServiceSpec.js' ]

This is what I expect

[ 'test/components/core/view/components/actions/actionsServiceSpec.js',
  'test/components/core/application/applicationPageStateServiceSpec.js' ]

because actionsServiceSpec.js is an exact file name match.
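A caller-side workaround for the ordering described above could look like the following sketch. The helper name promoteExactBasename is hypothetical and not part of fuzzaldrin; it only re-ranks an already filtered list so that candidates whose basename equals the query come first.

```javascript
const path = require('path');

// Hypothetical helper (not part of fuzzaldrin): stable re-rank that moves
// candidates whose basename equals the query to the front of the list.
function promoteExactBasename(results, query) {
  const exact = [];
  const rest = [];
  for (const candidate of results) {
    // path.posix.basename always splits on "/", whatever the platform.
    if (path.posix.basename(candidate) === query) {
      exact.push(candidate);
    } else {
      rest.push(candidate);
    }
  }
  return exact.concat(rest);
}

const filtered = [
  'test/components/core/application/applicationPageStateServiceSpec.js',
  'test/components/core/view/components/actions/actionsServiceSpec.js'
];
console.log(promoteExactBasename(filtered, 'actionsServiceSpec.js'));
```

The comparison is case sensitive and ignores Windows-style separators; both are simplifications for the sake of the example.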

Use LCS to Rank Matches

I suggest using LCS to rank matches:
https://en.wikipedia.org/wiki/Longest_common_subsequence_problem#Solution_for_two_sequences

Matches with lower number of substrings would come before matches with higher number of substrings.

This would be a lot more generic than the substring() based approach suggested in some PRs.

If you want to play with this, here's an online thing demonstrating the concept (I didn't write this, just found it):
http://lcs-demo.sourceforge.net/

Start by upping the "Max Size" a bit, then type in two strings and press "Execute LCS Lengths" to see how it works.

One real-world use case where this would come in handy is searching for "git push": note how it ends up in third place even though "git" and "push" are exact matches. Single-word matches would be caught by this as well.
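The ranking criterion proposed above (fewer matched substrings first) can be sketched without a full LCS implementation. The function below is an illustration only, not fuzzaldrin code and not the LCS algorithm from the linked article: it is a small dynamic program that returns the fewest contiguous runs of the candidate needed to spell the query in order. The name minRuns is hypothetical.

```javascript
// Hypothetical helper: fewest contiguous runs of `candidate` that spell
// `query` in order (case-insensitive). Returns Infinity when the query is
// not a subsequence of the candidate.
function minRuns(candidate, query) {
  const c = candidate.toLowerCase();
  const q = query.toLowerCase();
  // prevEnd[j]: fewest runs for the query prefix handled so far, with its
  // last character placed exactly at candidate index j - 1.
  // prevBest[j]: fewest runs using only the first j candidate characters.
  let prevEnd = new Array(c.length + 1).fill(Infinity);
  let prevBest = new Array(c.length + 1).fill(0);
  for (let i = 1; i <= q.length; i++) {
    const end = new Array(c.length + 1).fill(Infinity);
    const best = new Array(c.length + 1).fill(Infinity);
    for (let j = 1; j <= c.length; j++) {
      if (q[i - 1] === c[j - 1]) {
        // Either extend the run ending at j - 2, or start a new run.
        end[j] = Math.min(prevEnd[j - 1], prevBest[j - 1] + 1);
      }
      best[j] = Math.min(best[j - 1], end[j]);
    }
    prevEnd = end;
    prevBest = best;
  }
  return prevBest[c.length];
}

console.log(minRuns('git push', 'git push'));
console.log(minRuns('github: pull then push', 'git push'));
```

A lower value would rank higher; ties could then fall back to the existing score.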

Doesn't match spaces to underscores

When searching in a project with character.rb and a character_class.rb file (note the underscore), I can find the character.rb fine by simply typing char.

But finding the character_class.rb file isn't quite as easy. My first attempt was to type in char class, expecting the fuzzy finder to work something like Sublime Text's does. Alas, that yields no results.

But if I replace the space between char and class with an underscore, the file is located just fine.

I would have expected the finder to map spaces to special characters. Is this intentional, and if so, why? Or is it an oversight?

Best regards
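One simple way to get the behaviour asked for in this issue is to drop whitespace from the query before matching, so that char class can match character_class.rb. The sketch below assumes a plain subsequence test as the matcher; stripSpaces and isSubsequence are hypothetical names, not fuzzaldrin functions.

```javascript
// Hypothetical helper: remove all whitespace from the query.
function stripSpaces(query) {
  return query.replace(/\s+/g, '');
}

// Hypothetical matcher: true when every query character appears in the
// candidate in order (case-insensitive), not necessarily adjacent.
function isSubsequence(candidate, query) {
  const c = candidate.toLowerCase();
  const q = query.toLowerCase();
  let qi = 0;
  for (let i = 0; i < c.length && qi < q.length; i++) {
    if (c[i] === q[qi]) qi++;
  }
  return qi === q.length;
}

const query = stripSpaces('char class');
console.log(isSubsequence('character_class.rb', query));
console.log(isSubsequence('character.rb', query));
```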

Consider more advanced string scoring

Here is a simple test taken from reverse-engineering the Sublime Text matching:

expect(score('foobar/138_abc_zyx', 'az')).toBeLessThan(score('foobar/lolololol/abc_zyx', 'az'))

The point here is very logical. Even if the abbreviation is matched later in the whole string, it would have a bigger score if the abbreviation is matched earlier in the basename.

Unfortunately stringScore() would not pass this test.

Scores for az:

Candidate                  Sublime  Atom
foobar/138_abc_zyx         145      0.11111111111111112
foobar/lolololol/abc_zyx   151      0.10833333333333334

You could create a PR with the test here if needed: https://github.com/hkdobrev/fuzzaldrin/compare/atom:master...hkdobrev:failing-test-for-basename-scoring

I hope this issue is one of many I would be able to find and define from Sublime Text implementation. The Go To File feature is a killer. It really surprises you how well it's implemented even after months and months of usage. Atom should really make enhancing the Fuzzy Finder a key priority.
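To make the basename idea in this issue concrete, here is a rough sketch of one signal a scorer could use: the offset within the basename at which the query starts to match. This is an assumption about how such a bonus might be computed, not how Sublime Text or fuzzaldrin actually score; basenameMatchOffset is a hypothetical name.

```javascript
const path = require('path');

// Hypothetical helper: offset within the basename where a greedy,
// case-insensitive subsequence match of the query begins, or -1 when the
// query does not match inside the basename.
function basenameMatchOffset(candidate, query) {
  const base = path.posix.basename(candidate).toLowerCase();
  const q = query.toLowerCase();
  let first = -1;
  let qi = 0;
  for (let i = 0; i < base.length && qi < q.length; i++) {
    if (base[i] === q[qi]) {
      if (qi === 0) first = i;
      qi++;
    }
  }
  return qi === q.length ? first : -1;
}

console.log(basenameMatchOffset('foobar/138_abc_zyx', 'az'));
console.log(basenameMatchOffset('foobar/lolololol/abc_zyx', 'az'));
```

A smaller offset would earn a larger bonus, which would let the second path outrank the first as the test in the issue expects.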

chore: add `match` documentation

Hi,

match usage is not described anywhere else but in the code. It would be nice to give the function signature as well as an example (the most common case would be "how to surround with <strong/> tag").
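As a starting point for such documentation, the sketch below shows the "surround with <strong>" case. It assumes that match(string, query) returns an ascending array of matched character indices; that is an assumption here, since this README does not document it. To stay self-contained, the example takes the indices as input rather than calling fuzzaldrin, and wrapMatches is a hypothetical helper.

```javascript
// Hypothetical helper: wrap the characters at the given indices in
// <strong> tags, merging adjacent matches into a single tag.
// Note: the string is not HTML-escaped here; real code should escape it.
function wrapMatches(string, indices) {
  const matched = new Set(indices);
  let out = '';
  let open = false;
  for (let i = 0; i < string.length; i++) {
    if (matched.has(i) && !open) {
      out += '<strong>';
      open = true;
    }
    if (!matched.has(i) && open) {
      out += '</strong>';
      open = false;
    }
    out += string[i];
  }
  if (open) out += '</strong>';
  return out;
}

// Indices written by hand for the demo, as if returned for the query "me".
console.log(wrapMatches('Maybe', [0, 4]));
```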

extractor option

Instead of the key option, there should be an option to provide a function that takes in the candidate and returns a string.
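A sketch of what the requested option could look like from the caller's side is shown below. It is not fuzzaldrin's API: filterWith and its extractor parameter are hypothetical, and the scorer is passed in so that the example runs on its own (fuzzaldrin's score could be supplied instead of the stand-in).

```javascript
// Hypothetical helper: like filter(), but the string to score is produced
// by a caller-supplied extractor function instead of a `key` option.
function filterWith(candidates, query, extractor, scoreFn) {
  return candidates
    .map(candidate => ({ candidate, score: scoreFn(extractor(candidate), query) }))
    .filter(entry => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .map(entry => entry.candidate);
}

// Stand-in scorer for the demo only: rewards case-insensitive prefix
// matches, with shorter strings scoring higher.
function prefixScore(string, query) {
  const matches = string.toLowerCase().startsWith(query.toLowerCase());
  return matches ? query.length / string.length : 0;
}

const people = [
  { info: { name: 'Call' } },
  { info: { name: 'Maybe' } },
  { info: { name: 'Me' } }
];
console.log(filterWith(people, 'm', person => person.info.name, prefixScore));
```

An extractor handles nested or computed fields, which a single key name cannot express.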

Improve fuzzy finder (long term solution)

We have a quite nice PR #22 which is improving the already existing state of the fuzzy finder.
But it doesn't feel as pertinent and snappy as the Chrome dev tools or Sublime Text one.
Maybe a good starting point for the next step after this PR would be to try to implement the same behaviour as the Chrome dev tools one.
Thankfully it's open source so we can check the code here:
https://chromium.googlesource.com/chromium/blink/+/master/Source/devtools/front_end/sources

Especially in those files:

FilePathScoreFunction.js

FilteredItemSelectionDialog.js

Any suggestion / objections?

Exact file name matches are scored below partial matches.

atom/fuzzy-finder#46

Exact file name match with case matching is below another match : http://i.imgur.com/kmJYY29.png

If case is not matching it goes even lower on the list : http://i.imgur.com/PbpCb8y.png

Directory depth penalty

I understand the scoring algorithm uses some sort of penalty to a file if it was too deep inside the directory structure, but sometimes this doesn't make sense, I'll post an example and explain:

As you can see, it now matches the probably expected file, but when we want to filter by saying "it's in the controllers/ directory"

It penalizes the application_controller result because of directory depth, or at least that's my guess. So why not start the depth count at the first directory match, in this case the controllers/ directory?

[performance] Convert to asynchronous API so that palette rendering is decoupled from search completion algorithm.

See atom/atom#12014 (comment)

Cannot find module 'fuzzaldrin'

Hi,

I'm trying to write a custom provider to use with autocomplete+ and I'm having problems with the fuzzaldrin module. I'm requiring it in my provider.coffee like this:

fs = require 'fs'
path = require 'path'
fuzzaldrin = require 'fuzzaldrin'

module.exports =
    selector: '.source.lua'
    disableForSelector: '.source.lua .comment, .source.lua .string'
...

And in my package.json I have this:

  "dependencies": {
    "fuzzaldrin": "^2.1.0"
  }

But still I get this when I load with the package:

At Cannot find module 'fuzzaldrin'

Error: Cannot find module 'fuzzaldrin'
    at Module._resolveFilename (module.js:336:15)
    at Function.exports.register.Module._resolveFilename (/Applications/Atom.app/Contents/Resources/app.asar/src/module-cache.js:383:52)
    at Function.Module._load (module.js:286:25)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at Object.<anonymous> (/Users/Robert/Coding/Atom/love-atom/lib/love-provider.coffee:3:14)
    at Object.<anonymous> (/Users/Robert/Coding/Atom/love-atom/lib/love-provider.coffee:1:1)
    at Module._compile (module.js:434:26)
    at Object.keys.forEach.Object.defineProperty.value [as .coffee] (/Applications/Atom.app/Contents/Resources/app.asar/src/compile-cache.js:190:21)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at Object.<anonymous> (/Users/Robert/Coding/Atom/love-atom/lib/love-atom.coffee:2:16)
    at Object.<anonymous> (/Users/Robert/Coding/Atom/love-atom/lib/love-atom.coffee:2:1)
    at Module._compile (module.js:434:26)
    at Object.keys.forEach.Object.defineProperty.value [as .coffee] (/Applications/Atom.app/Contents/Resources/app.asar/src/compile-cache.js:190:21)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at Package.module.exports.Package.requireMainModule (/Applications/Atom.app/Contents/Resources/app.asar/src/package.js:661:34)
    at /Applications/Atom.app/Contents/Resources/app.asar/src/package.js:115:28
    at Package.module.exports.Package.measure (/Applications/Atom.app/Contents/Resources/app.asar/src/package.js:92:15)
    at Package.module.exports.Package.load (/Applications/Atom.app/Contents/Resources/app.asar/src/package.js:106:12)
    at PackageManager.module.exports.PackageManager.loadPackage (/Applications/Atom.app/Contents/Resources/app.asar/src/package-manager.js:434:14)
    at PackageManager.module.exports.PackageManager.loadPackages (/Applications/Atom.app/Contents/Resources/app.asar/src/package-manager.js:386:14)
    at AtomEnvironment.module.exports.AtomEnvironment.startEditorWindow (/Applications/Atom.app/Contents/Resources/app.asar/src/atom-environment.js:667:21)
    at Object.<anonymous> (/Applications/Atom.app/Contents/Resources/app.asar/src/initialize-application-window.js:38:8)
    at Object.<anonymous> (/Applications/Atom.app/Contents/Resources/app.asar/src/initialize-application-window.js:49:4)
    at Module._compile (module.js:434:26)
    at Object.keys.forEach.Object.defineProperty.value [as .js] (/Applications/Atom.app/Contents/Resources/app.asar/src/compile-cache.js:190:21)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at setupWindow (file:///Applications/Atom.app/Contents/Resources/app.asar/static/index.js:79:5)
    at window.onload (file:///Applications/Atom.app/Contents/Resources/app.asar/static/index.js:35:9)

The strange thing is that I have looked at some downloaded packages which are using fuzzaldrin and I can't see anything they are doing differently. Any hints would be much appreciated 👍

Score Capital Letters Higher in Fuzzy Search

It seems that Sublime and possibly the Chrome dev tools give a higher priority to capital letters when searching. This makes finding the file MyFileName.ext much easier, since you can type MFN and not be bothered with having to spell it out. I think this is a great productivity feature, since typing fewer letters introduces fewer spelling errors.

For instance, assume a list of files that would represent a front-end JavaScript use case.

var files = ['FilterFactors.js', 
'FilterFactors.styl', 
'FilterFactors.html', 
'FilterFactorTests.html', 
'SpecFilterFactors.js'];

In Sublime, I could type FFT to get to FilterFactorTests.html. In Atom, FFT matches FilterFactors .js, .styl, and .html above FilterFactorTests.html.
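The behaviour described for Sublime can be approximated by comparing the query against the capital letters of each file name. The sketch below is only an illustration of that idea; acronymOf and promoteAcronymMatches are hypothetical helpers, and a real scorer would fold this into the score rather than partition the list.

```javascript
// Hypothetical helper: the capital letters of a name, in order.
function acronymOf(fileName) {
  return fileName.replace(/[^A-Z]/g, '');
}

// Hypothetical helper: stable re-rank that moves files whose acronym
// starts with the (upper-cased) query to the front.
function promoteAcronymMatches(files, query) {
  const wanted = query.toUpperCase();
  const hits = files.filter(file => acronymOf(file).startsWith(wanted));
  const rest = files.filter(file => !acronymOf(file).startsWith(wanted));
  return hits.concat(rest);
}

const fileList = [
  'FilterFactors.js',
  'FilterFactors.styl',
  'FilterFactors.html',
  'FilterFactorTests.html',
  'SpecFilterFactors.js'
];
console.log(promoteAcronymMatches(fileList, 'FFT'));
```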

Question: Is this package already used in Atom?

I don't know if I should report situations where Atom could do better with sorting :)

If not, is there a way to "turn it on"?

Improve the fuzzy results

After a few minutes of fuzzy finding in a project of mine, I noticed that the file I wanted to find was often listed lower than 4th to 6th place. Similar files and searches end up around 1st to 3rd place most of the time in Sublime Text.
I just skimmed over the code. You already seem to use a scoring algorithm similar to https://github.com/joshaven/string_score. I know that PeepOpen used to work quite well and is open source now. Maybe that'll help improve this? :)

https://github.com/topfunky/PeepOpen
http://code.tutsplus.com/tutorials/vim-essential-plugin-peepopen--net-19824

FuzzyMatching
https://github.com/topfunky/PeepOpen/blob/master/Classes/Models/FuzzyRecord.rb#L268-L427

Pre-select last used result for input

Coming from Sublime Text to Atom, a major stumbling block for me is, from what I gather, that fuzzaldrin never learns from my usage and returns only results sorted by their match score. Sublime, however, both in the command palette and in the autocomplete menu, would pre-select the last result I selected for the same input.

I'll illustrate the need with two examples.

First up, autocomplete. When writing CSS, I previously would use sort of a shorthand, say, bco followed by Tab for background-color. When I use the shortcut, the editor should remember the option I picked and the next time I use the exact same input (bco) it should pre-select the exact same result (background-color).

This is a massive time-saver that lets me have pseudo-snippets of sorts, where I would write CSS properties just by 1-to-3 letter combinations, with results coming straight from autocomplete's memory of my last usage. Fast, easy to remember and, best of all, simple.

Second, command palette. There are a couple of commands that aren't used constantly (and I don't want keyboard shortcuts for such commands) but come in handy once or twice a month, like converting the current ~~file~~ buffer from spaces to tabs or the other way around.

To do this, I'd open the command palette, type in "tabs" or "spaces" and choose the appropriate command from the list. In Sublime, I'd already have preselected the command I chose the last time I typed the same input: tabs would pre-select "Indentation: Convert to Tabs" and spaces would pre-select "Indentation: Convert to Spaces". And even though there are other results for those queries, and some of them may be higher-scoring (as indicated by the order of items), the default command when I hit Return is the one I actually use out of those results.

Instead, in Atom when I type tabs into the command palette, I have to scroll or arrow-down to 10 items lower to get the one I wanted, it's ridiculous.

Right now, Atom doesn't do this. Instead every time I run an often-used autocomplete input or search for a command in the command palette, I get the same ordered-by-score results and have to select the one that I want, which in most cases would have been the same one as always. Sublime does this and it's a huge time saver, I'd love to see Atom add this functionality.

And yes, I could add obscure keyboard shortcuts and manually type out snippets for every single shortcut, but that wouldn't be very productive or flexible.
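The requested behaviour does not need to change scoring at all; it can sit on top of the sorted results. Below is a minimal sketch of such a layer. SelectionMemory is a hypothetical class, not something Atom or fuzzaldrin provides, and it keeps its state in memory only.

```javascript
// Hypothetical helper: remembers, per exact input string, which result the
// user picked last, and reports which index to pre-select next time.
class SelectionMemory {
  constructor() {
    this.lastChoice = new Map();
  }

  // Record that `choice` was picked for `input`.
  remember(input, choice) {
    this.lastChoice.set(input, choice);
  }

  // Index to pre-select in `results` (already sorted by score). Falls back
  // to 0, the best-scoring result, when nothing is remembered or the
  // remembered choice is no longer in the list.
  preselect(input, results) {
    const index = results.indexOf(this.lastChoice.get(input));
    return index === -1 ? 0 : index;
  }
}

const memory = new SelectionMemory();
const commands = ['Tabs: Close Other Tabs', 'Indentation: Convert to Tabs'];
memory.remember('tabs', 'Indentation: Convert to Tabs');
console.log(memory.preselect('tabs', commands));
```

The order of results stays score-based; only the initially highlighted item changes.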

Match path after filename

If there are multiple files with the same filename in a project, typing the filename presents them all but you cannot enter more text to narrow them down by path. For example:

search/README.md
database/README.md

If I enter "README" I'll get both files but cannot enter "search" or "database" to select from them. If I enter "search READ" it will narrow it to a single file, but I don't always know I'll have multiple matches until the filenames have been presented. (Think "index.js" or "index.coffee" in a large project.)

By switching the search order, I could enter "README", see there are multiple matches, and start typing "search" to narrow it down to the one I wanted.
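One way to allow "README search" as well as "search README" is to split the query on whitespace and require each token to match the path on its own, in any order. The sketch below assumes a plain subsequence test per token; matchesAllTokens and isSubsequence are hypothetical names, not fuzzaldrin functions.

```javascript
// Hypothetical matcher: true when every query character appears in the
// candidate in order (case-insensitive), not necessarily adjacent.
function isSubsequence(candidate, query) {
  const c = candidate.toLowerCase();
  const q = query.toLowerCase();
  let qi = 0;
  for (let i = 0; i < c.length && qi < q.length; i++) {
    if (c[i] === q[qi]) qi++;
  }
  return qi === q.length;
}

// Hypothetical helper: each whitespace-separated token must match the
// candidate independently, so token order in the query does not matter.
function matchesAllTokens(candidate, query) {
  const tokens = query.split(/\s+/).filter(Boolean);
  return tokens.every(token => isSubsequence(candidate, token));
}

const paths = ['search/README.md', 'database/README.md'];
console.log(paths.filter(p => matchesAllTokens(p, 'README search')));
```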

Can I disable sorting?

Is it possible to disable the sorting of results?

I have a use for fuzzaldrin in which the sorting is not beneficial, but the matching is. Currently I run a set of sorted candidates through the filter and then have to re-sort them -- not horrible, but still an extra step that would be nice not to do.

I do use both filtering and sorting in other cases. I'd like to use the same fuzzysearch library across all filtering instances.
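Until such an option exists, matching without sorting can be done by scoring each candidate and keeping the ones that match, in input order. In the sketch below, filterUnsorted is a hypothetical helper and the scorer is injected; passing fuzzaldrin's score would rely on the assumption that it returns 0 for a non-match.

```javascript
// Hypothetical helper: keep only matching candidates, preserving the
// order of the input array. `scoreFn` must return 0 for "no match".
function filterUnsorted(candidates, query, scoreFn) {
  return candidates.filter(candidate => scoreFn(candidate, query) > 0);
}

// Stand-in scorer for the demo only: 1 when the query is a
// case-insensitive substring of the string, otherwise 0.
function containsScore(string, query) {
  return string.toLowerCase().includes(query.toLowerCase()) ? 1 : 0;
}

console.log(filterUnsorted(['zeta_b', 'alpha', 'beta_a'], 'eta', containsScore));
```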

Consider \ and / interchangeable?

More related to PHP specifically, but it would allow namespaces such as:

JMS\Serializer\Annotation

to look for

JMS/Serializer/Annotation

Since 99% of the time filenames won't contain a backslash (\), it would generally improve the results.
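The suggestion amounts to normalizing separators on both sides before matching. A minimal sketch, using a plain substring test as a stand-in for the real matcher (both function names are hypothetical):

```javascript
// Hypothetical helper: treat backslashes as forward slashes.
function normalizeSeparators(text) {
  return text.replace(/\\/g, '/');
}

// Stand-in matcher for the demo: case-insensitive substring test after
// normalizing separators in both the candidate and the query.
function matchesIgnoringSeparatorStyle(candidate, query) {
  const c = normalizeSeparators(candidate).toLowerCase();
  const q = normalizeSeparators(query).toLowerCase();
  return c.includes(q);
}

console.log(
  matchesIgnoringSeparatorStyle(
    'src/JMS/Serializer/Annotation/Type.php',
    'JMS\\Serializer\\Annotation'
  )
);
```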

Favour case sensitive matches

OmniSharp/omnisharp-atom#380

So in this screenshot, I'd like the local variable diagnostics to be the first item in the list here as it is the first match that has a prefix with identical case.

Is this something that you might consider adding (maybe via an option) ?
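If this were exposed as an option, one possible behaviour is a stable re-rank that puts results starting with the typed prefix in identical case ahead of the rest. The sketch below illustrates that idea only; preferCasePrefix is a hypothetical helper, and the sample names are made up for the demo.

```javascript
// Hypothetical helper: stable re-rank that moves results starting with the
// prefix in exactly the same case to the front.
function preferCasePrefix(results, prefix) {
  const exactCase = results.filter(result => result.startsWith(prefix));
  const others = results.filter(result => !result.startsWith(prefix));
  return exactCase.concat(others);
}

// Made-up suggestion list, already sorted by score.
const suggestions = ['Diagnostics', 'DiagnosticsHelper', 'diagnostics'];
console.log(preferCasePrefix(suggestions, 'diag'));
```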