fuzzaldrin's Introduction

Atom and all repositories under Atom will be archived on December 15, 2022. Learn more in our official announcement

fuzzaldrin

Fuzzy filtering and string scoring.

This library is used by Atom and so its focus will be on scoring and filtering paths, methods, and other things common when writing code. It therefore will specialize in handling common patterns in these types of strings such as characters like /, -, and _, and also handling of camel cased text.

Using

npm install fuzzaldrin

filter(candidates, query, [options])

Sort and filter the given candidates by matching them against the given query.

  • candidates - An array of strings or objects.
  • query - A string query to match each candidate against.
  • options - An optional object with the following keys:
    • key - The property to use for scoring if the candidates are objects.
    • maxResults - The maximum number of results to return.

Returns an array of candidates sorted by best match against the query.

{filter} = require 'fuzzaldrin'

# With an array of strings
candidates = ['Call', 'Me', 'Maybe']
results = filter(candidates, 'me')
console.log(results) # ['Me', 'Maybe']

# With an array of objects
candidates = [
  {name: 'Call', id: 1}
  {name: 'Me', id: 2}
  {name: 'Maybe', id: 3}
]
results = filter(candidates, 'me', key: 'name')
console.log(results) # [{name: 'Me', id: 2}, {name: 'Maybe', id: 3}]

score(string, query)

Score the given string against the given query.

  • string - The string to score.
  • query - The query to score the string against.

{score} = require 'fuzzaldrin'

score('Me', 'me')    # 0.17099999999999999
score('Maybe', 'me') # 0.0693

Developing

git clone https://github.com/atom/fuzzaldrin.git
cd fuzzaldrin
npm install
npm test

You can run the benchmarks using:

npm run benchmark

fuzzaldrin's People

Contributors

amccloud, darangi, jmacdonald, kevinsawicki, mnquintana, philschatz

fuzzaldrin's Issues

Wrongly ordered candidates with exact file name match

var candidates, filter, results;

filter = require('fuzzaldrin').filter;

candidates = ['test/components/core/application/applicationPageStateServiceSpec.js', 'test/components/core/view/components/actions/actionsServiceSpec.js'];

results = filter(candidates, 'actionsServiceSpec.js');

console.log(results);

This is what I get

[ 'test/components/core/application/applicationPageStateServiceSpec.js',
  'test/components/core/view/components/actions/actionsServiceSpec.js' ]

This is what I expect

[ 'test/components/core/view/components/actions/actionsServiceSpec.js',
  'test/components/core/application/applicationPageStateServiceSpec.js' ]

as actionsServiceSpec.js is an exact file name match.
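
One possible workaround, sketched here as a post-processing step rather than a change to the scoring itself: re-rank the results so that any candidate whose file name equals the query comes first. The helper name preferExactBasename is hypothetical and not part of fuzzaldrin, and the sketch assumes paths use / as the separator.

```javascript
// Hypothetical helper, not part of fuzzaldrin: move candidates whose file
// name equals the query exactly ahead of all other results, keeping the
// existing order within each group.
function preferExactBasename(results, query) {
  const basename = (p) => p.slice(p.lastIndexOf('/') + 1);
  const exact = results.filter((p) => basename(p) === query);
  const rest = results.filter((p) => basename(p) !== query);
  return exact.concat(rest);
}

const ranked = preferExactBasename(
  [
    'test/components/core/application/applicationPageStateServiceSpec.js',
    'test/components/core/view/components/actions/actionsServiceSpec.js'
  ],
  'actionsServiceSpec.js'
);
console.log(ranked[0]);
// test/components/core/view/components/actions/actionsServiceSpec.js
```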

Use LCS to Rank Matches

I suggest using LCS to rank matches:
https://en.wikipedia.org/wiki/Longest_common_subsequence_problem#Solution_for_two_sequences

Matches with lower number of substrings would come before matches with higher number of substrings.

This would be a lot more generic than the substring() based approach suggested in some PRs.

If you want to play with this, here's an online thing demonstrating the concept (I didn't write this, just found it):
http://lcs-demo.sourceforge.net/

Start by upping the "Max Size" a bit, then type in two strings and press "Execute LCS Lengths" to see how it works.

One real-world use case where this would come in handy is searching for "git push": note how it's in third place even though "git" and "push" are exact matches. Single-word matches would be caught by this as well.
gitpush
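
For reference, the LCS length from the linked Wikipedia section can be computed with the usual dynamic-programming table. This is only a minimal sketch of that building block; the function name lcsLength is made up for illustration, and ranking by the number of contiguous substrings would additionally need a backtracking pass that is not shown here.

```javascript
// Length of the longest common subsequence of strings a and b.
// table[i][j] holds the LCS length of a.slice(0, i) and b.slice(0, j).
function lcsLength(a, b) {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const table = Array.from({ length: rows }, () => new Array(cols).fill(0));
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      table[i][j] = a[i - 1] === b[j - 1]
        ? table[i - 1][j - 1] + 1
        : Math.max(table[i - 1][j], table[i][j - 1]);
    }
  }
  return table[a.length][b.length];
}

console.log(lcsLength('gitpush', 'git push')); // 7
console.log(lcsLength('ABCBDAB', 'BDCABA'));   // 4
```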

Doesn't match spaces to underscores

When searching in a project with character.rb and a character_class.rb file (note the underscore), I can find the character.rb fine by simply typing char.

screen shot 2014-02-27 at 19 15 34

But finding the character_class.rb file isn't quite as easy. My first attempt was to type in char class, expecting the fuzzy finder to work something like sublime text's does. Alas, that yields no results.

screen shot 2014-02-27 at 19 17 41

But if I change out the space between the char and class with an underscore, the file is located just fine.

screen shot 2014-02-27 at 19 18 34

I would have expected the finder to map spaces to special characters. Is this intentional, and if so, why? or is it an oversight?
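
To make the expectation concrete, here is a rough sketch of the behaviour I had in mind: treat each space in the query as a separator, and accept a candidate when every token occurs in it in order. The function matchesTokens is hypothetical, and it uses plain substring matching rather than fuzzaldrin's fuzzy scoring.

```javascript
// Hypothetical pre-check: true when every whitespace-separated token of the
// query occurs in the candidate, in order, ignoring case.
function matchesTokens(candidate, query) {
  const haystack = candidate.toLowerCase();
  let from = 0;
  for (const token of query.toLowerCase().split(/\s+/).filter(Boolean)) {
    const at = haystack.indexOf(token, from);
    if (at === -1) return false;
    from = at + token.length; // later tokens must appear after this one
  }
  return true;
}

console.log(matchesTokens('character_class.rb', 'char class')); // true
console.log(matchesTokens('character.rb', 'char class'));       // false
```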

Best regards

Consider more advanced string scoring

Here is a simple test taken from reverse-engineering the Sublime Text matching:

expect(score('foobar/138_abc_zyx', 'az')).toBeLessThan(score('foobar/lolololol/abc_zyx', 'az'))

The point here is very logical. Even if the abbreviation is matched later in the whole string, it would have a bigger score if the abbreviation is matched earlier in the basename.

Unfortunately stringScore() would not pass this test.

Scores for az:

String                      Sublime   Atom
foobar/138_abc_zyx          145       0.11111111111111112
foobar/lolololol/abc_zyx    151       0.10833333333333334

You could create a PR with the test here if needed: https://github.com/hkdobrev/fuzzaldrin/compare/atom:master...hkdobrev:failing-test-for-basename-scoring


I hope this issue is one of many I would be able to find and define from Sublime Text implementation. The Go To File feature is a killer. It really surprises you how well it's implemented even after months and months of usage. Atom should really make enhancing the Fuzzy Finder a key priority.

chore: add `match` documentation

Hi,

match usage is not described anywhere else but in the code. It would be nice to give the function signature as well as an example (the most common case would be "how to surround matches with a <strong/> tag").
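
Something like the following could serve as the documented example. It is a sketch that assumes match(string, query) returns an array of the matched character indices, which should be checked against the source; wrapMatches is a hypothetical helper, and it does not escape HTML in the input string.

```javascript
// Assumption: `indices` is the array of matched character positions that
// match(string, query) is assumed to return. Adjacent matched characters
// are grouped into a single <strong> element.
function wrapMatches(string, indices) {
  const matched = new Set(indices);
  let output = '';
  let open = false;
  for (let i = 0; i < string.length; i++) {
    if (matched.has(i) && !open) {
      output += '<strong>';
      open = true;
    }
    if (!matched.has(i) && open) {
      output += '</strong>';
      open = false;
    }
    output += string[i];
  }
  return open ? output + '</strong>' : output;
}

console.log(wrapMatches('Maybe', [0, 4]));
// <strong>M</strong>ayb<strong>e</strong>
```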

extractor option

Instead of the key option, there should be an option to provide a function that takes in the candidate and returns a string.
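
Until such an option exists, it can be approximated on top of the key option by wrapping each candidate. The sketch below is an assumption-laden illustration: filterWithExtractor is a hypothetical wrapper, and standInFilter is only a placeholder with the same filter(candidates, query, options) shape so the example runs without the library.

```javascript
// Hypothetical wrapper: `filter` is any function with the
// filter(candidates, query, options) signature from the README.
function filterWithExtractor(filter, candidates, query, extractor) {
  const wrapped = candidates.map((candidate) => ({
    candidate,
    text: extractor(candidate)
  }));
  return filter(wrapped, query, { key: 'text' }).map((entry) => entry.candidate);
}

// Stand-in for fuzzaldrin's filter so the sketch runs on its own: keeps
// entries whose key contains the query, ignoring case. Not the real scoring.
const standInFilter = (items, query, { key }) =>
  items.filter((item) => item[key].toLowerCase().includes(query.toLowerCase()));

const people = [
  { first: 'Call', last: 'Me' },
  { first: 'May', last: 'Be' }
];
const found = filterWithExtractor(
  standInFilter,
  people,
  'may be',
  (person) => `${person.first} ${person.last}`
);
console.log(found); // [ { first: 'May', last: 'Be' } ]
```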

Improve fuzzy finder (long term solution)

We have a quite nice PR #22 which is improving the already existing state of the fuzzy finder.
But it doesn't feel as pertinent and snappy as the Chrome dev tools or Sublime Text one.
Maybe a good starting point for the next step after this PR would be to try to implement the same behaviour as the Chrome dev tools one.
Thankfully it's open source so we can check the code here:
https://chromium.googlesource.com/chromium/blink/+/master/Source/devtools/front_end/sources

Especially in those files:

FilePathScoreFunction.js
FilteredItemSelectionDialog.js

Any suggestion / objections?

Directory depth penalty

I understand the scoring algorithm applies some sort of penalty to a file that is too deep inside the directory structure, but sometimes this doesn't make sense. I'll post an example and explain:

image

As you can see, it now matches the probably expected file, but when we want to filter by saying "it's in the controllers/ directory"

screen shot 2015-11-02 at 5 08 01 pm

It penalizes the application_controller result because of directory depth, or at least that's my guess. So why not start the depth count at the first directory match, in this case the controllers/ directory?

Cannot find module 'fuzzaldrin'

Hi,

I'm trying to write a custom provider to use with autocomplete+ and I'm having problems with the fuzzaldrin module. I'm requiring it in my provider.coffee like this:

fs = require 'fs'
path = require 'path'
fuzzaldrin = require 'fuzzaldrin'

module.exports =
    selector: '.source.lua'
    disableForSelector: '.source.lua .comment, .source.lua .string'
...

And in my package.json I have this:

  "dependencies": {
    "fuzzaldrin": "^2.1.0"
  }

But still I get this when I load with the package:

At Cannot find module 'fuzzaldrin'

Error: Cannot find module 'fuzzaldrin'
    at Module._resolveFilename (module.js:336:15)
    at Function.exports.register.Module._resolveFilename (/Applications/Atom.app/Contents/Resources/app.asar/src/module-cache.js:383:52)
    at Function.Module._load (module.js:286:25)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at Object.<anonymous> (/Users/Robert/Coding/Atom/love-atom/lib/love-provider.coffee:3:14)
    at Object.<anonymous> (/Users/Robert/Coding/Atom/love-atom/lib/love-provider.coffee:1:1)
    at Module._compile (module.js:434:26)
    at Object.keys.forEach.Object.defineProperty.value [as .coffee] (/Applications/Atom.app/Contents/Resources/app.asar/src/compile-cache.js:190:21)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at Object.<anonymous> (/Users/Robert/Coding/Atom/love-atom/lib/love-atom.coffee:2:16)
    at Object.<anonymous> (/Users/Robert/Coding/Atom/love-atom/lib/love-atom.coffee:2:1)
    at Module._compile (module.js:434:26)
    at Object.keys.forEach.Object.defineProperty.value [as .coffee] (/Applications/Atom.app/Contents/Resources/app.asar/src/compile-cache.js:190:21)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at Package.module.exports.Package.requireMainModule (/Applications/Atom.app/Contents/Resources/app.asar/src/package.js:661:34)
    at /Applications/Atom.app/Contents/Resources/app.asar/src/package.js:115:28
    at Package.module.exports.Package.measure (/Applications/Atom.app/Contents/Resources/app.asar/src/package.js:92:15)
    at Package.module.exports.Package.load (/Applications/Atom.app/Contents/Resources/app.asar/src/package.js:106:12)
    at PackageManager.module.exports.PackageManager.loadPackage (/Applications/Atom.app/Contents/Resources/app.asar/src/package-manager.js:434:14)
    at PackageManager.module.exports.PackageManager.loadPackages (/Applications/Atom.app/Contents/Resources/app.asar/src/package-manager.js:386:14)
    at AtomEnvironment.module.exports.AtomEnvironment.startEditorWindow (/Applications/Atom.app/Contents/Resources/app.asar/src/atom-environment.js:667:21)
    at Object.<anonymous> (/Applications/Atom.app/Contents/Resources/app.asar/src/initialize-application-window.js:38:8)
    at Object.<anonymous> (/Applications/Atom.app/Contents/Resources/app.asar/src/initialize-application-window.js:49:4)
    at Module._compile (module.js:434:26)
    at Object.keys.forEach.Object.defineProperty.value [as .js] (/Applications/Atom.app/Contents/Resources/app.asar/src/compile-cache.js:190:21)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at setupWindow (file:///Applications/Atom.app/Contents/Resources/app.asar/static/index.js:79:5)
    at window.onload (file:///Applications/Atom.app/Contents/Resources/app.asar/static/index.js:35:9)

The strange thing is that I have looked at some downloaded packages which are using fuzzaldrin and I can't see anything they are doing differently. Any hints would be much appreciated 👍

Score Capital Letters Higher in Fuzzy Search

It seems that Sublime and possibly the Chrome dev tools give a higher priority to capital letters when searching. This makes finding the file MyFileName.ext much easier since you can type MFN and not be bothered with having to spell it out. I think this is a great productivity feature since you can type fewer letters, which introduces fewer spelling errors.

For instance, assume a list of files that would represent a front end javascript usecase.

var files = ['FilterFactors.js', 
'FilterFactors.styl', 
'FilterFactors.html', 
'FilterFactorTests.html', 
'SpecFilterFactors.js'];

In sublime, I could type FFT to get to FilterFactorTests.html. In atom, FFT matches FilterFactors .js, .styl, and .html above FilterFactorTests.html.
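
A rough sketch of the idea, written as a re-ranking step rather than a change to the scorer: collect the capital letters of each file name and move candidates whose capitals start with the query to the front. Both helper names are hypothetical, and only ASCII capitals are considered.

```javascript
// Hypothetical helper: the ASCII uppercase letters of a file name, in order.
function capitals(name) {
  return name.replace(/[^A-Z]/g, '');
}

// Move candidates whose capitals start with the query to the front,
// keeping the existing order otherwise.
function preferAcronymMatches(results, query) {
  const hit = (name) => capitals(name).startsWith(query);
  return results.filter(hit).concat(results.filter((name) => !hit(name)));
}

const fileNames = [
  'FilterFactors.js',
  'FilterFactors.styl',
  'FilterFactors.html',
  'FilterFactorTests.html',
  'SpecFilterFactors.js'
];
console.log(preferAcronymMatches(fileNames, 'FFT')[0]); // FilterFactorTests.html
```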

Improve the fuzzy results

After a few minutes of fuzzy finding in a project of mine, I noticed that the file I wanted to find was often listed lower than 4th-6th place. Similar files and searches end up around 1st-3rd place most of the time in Sublime Text.
I just skimmed over the code. You already seem to use a scoring algorithm similar to string_score (https://github.com/joshaven/string_score). I know that PeepOpen used to work quite well and is open source now. Maybe that'll help improve this? :)

https://github.com/topfunky/PeepOpen
http://code.tutsplus.com/tutorials/vim-essential-plugin-peepopen--net-19824

FuzzyMatching
https://github.com/topfunky/PeepOpen/blob/master/Classes/Models/FuzzyRecord.rb#L268-L427

Pre-select last used result for input

Coming from Sublime Text to Atom, a major stumble for me is, from what I gather, that fuzzaldrin never learns from my usage and returns only results sorted by their match score. However, Sublime, both in the command palette and the autocomplete menu, would pre-select the result I last selected for the same input.

I'll illustrate the need with two examples.


First up, autocomplete. When writing CSS, I previously would use sort of a shorthand, say, bco followed by Tab for background-color. When I use the shortcut, the editor should remember the option I picked and the next time I use the exact same input (bco) it should pre-select the exact same result (background-color).

This is a massive time-saver that lets me have pseudo-snippets of sorts, where I would write CSS properties just by 1-to-3 letter combinations, with results coming straight from autocomplete's memory of my last usage. Fast, easy to remember and, best of all, simple.


Second, command palette. There are a couple of commands that aren't used constantly (and I don't want keyboard shortcuts for such commands) but come in handy once or twice a month, like converting the current file buffer from spaces to tabs or the other way around.

To do this, I'd open the command palette, type in "tabs" or "spaces" and choose the appropriate command from the list. In Sublime, I'd already have preselected the command I chose the last time I typed the same input: tabs would pre-select "Indentation: Convert to Tabs" and spaces would pre-select "Indentation: Convert to Spaces". And even though there are other results for those queries, and some of them may be higher-scoring (as indicated by the order of items), the default command when I hit Return is the one I actually use out of those results.

Instead, in Atom when I type tabs into the command palette, I have to scroll or arrow-down to 10 items lower to get the one I wanted, it's ridiculous.


Right now, Atom doesn't do this. Instead every time I run an often-used autocomplete input or search for a command in the command palette, I get the same ordered-by-score results and have to select the one that I want, which in most cases would have been the same one as always. Sublime does this and it's a huge time saver, I'd love to see Atom add this functionality.

And yes, I could add obscure keyboard shortcuts and manually type out snippets for every single shortcut, but that wouldn't be very productive or flexible.
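
To illustrate how small the required state is, here is a minimal sketch of such a memory. The class SelectionMemory and its method names are hypothetical; it only decides which index to pre-select and leaves the score-based ordering untouched.

```javascript
// Hypothetical memory of the last result chosen for each exact query.
class SelectionMemory {
  constructor() {
    this.lastChosen = new Map();
  }

  // Record that `result` was picked for `query`.
  remember(query, result) {
    this.lastChosen.set(query, result);
  }

  // Index to pre-select in `results`: the remembered item when it is still
  // present, otherwise the top-scoring item at index 0.
  preselectIndex(query, results) {
    if (!this.lastChosen.has(query)) return 0;
    const index = results.indexOf(this.lastChosen.get(query));
    return index === -1 ? 0 : index;
  }
}

const memory = new SelectionMemory();
memory.remember('bco', 'background-color');
console.log(memory.preselectIndex('bco', ['border-color', 'background-color'])); // 1
console.log(memory.preselectIndex('tabs', ['Tabs: Close Tab']));                 // 0
```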

Match path after filename

If there are multiple files with the same filename in a project, typing the filename presents them all but you cannot enter more text to narrow them down by path. For example:

  • search/README.md
  • database/README.md

If I enter "README" I'll get both files but cannot enter "search" or "database" to select from them. If I enter "search READ" it will narrow it to a single file, but I don't always know I'll have multiple matches until the filenames have been presented. (Think "index.js" or "index.coffee" in a large project.)

By switching the search order, I could enter "README", see there are multiple matches, and start typing "search" to narrow it down to the one I wanted.
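
As a sketch of what order-independent matching could look like: split the query on whitespace and accept a path when every token occurs somewhere in it, regardless of order. The function matchesAllTokens is hypothetical and uses plain substring checks instead of fuzzy scoring.

```javascript
// Hypothetical pre-check: true when every whitespace-separated token of the
// query occurs somewhere in the path, in any order, ignoring case.
function matchesAllTokens(path, query) {
  const haystack = path.toLowerCase();
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .every((token) => haystack.includes(token));
}

const readmes = ['search/README.md', 'database/README.md'];
console.log(readmes.filter((p) => matchesAllTokens(p, 'README')));
// [ 'search/README.md', 'database/README.md' ]
console.log(readmes.filter((p) => matchesAllTokens(p, 'README search')));
// [ 'search/README.md' ]
```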

Can I disable sorting?

Is it possible to disable the sorting of results?

I have a use for fuzzaldrin in which the sorting is not beneficial, but the matching is. Currently I run a set of sorted candidates through the filter and then have to re-sort them -- not horrible, but still an extra step that would be nice not to do.

I do use both filtering and sorting in other cases. I'd like to use the same fuzzysearch library across all filtering instances.
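
One way this might be done today is to call score directly and keep the original order. The sketch assumes that a score of 0 means "no match", which should be verified against the library; filterUnsorted is a hypothetical helper, and standInScore is only a placeholder so the example runs on its own.

```javascript
// Hypothetical helper: `score` is any function with the score(string, query)
// signature; a score of 0 is assumed to mean "no match".
function filterUnsorted(score, candidates, query) {
  return candidates.filter((candidate) => score(candidate, query) > 0);
}

// Stand-in for fuzzaldrin's score so the sketch runs on its own: 1 when the
// string contains the query (ignoring case), 0 otherwise.
const standInScore = (string, query) =>
  string.toLowerCase().includes(query.toLowerCase()) ? 1 : 0;

console.log(filterUnsorted(standInScore, ['Maybe', 'Call', 'Me'], 'a'));
// [ 'Maybe', 'Call' ]
```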

Consider \ and / interchangeable?

More related to PHP specifically, but it would allow namespaces such as:

JMS\Serializer\Annotation

to look for

JMS/Serializer/Annotation

Since 99% of the time filenames won't contain "\", it would generally improve the results.
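
A minimal sketch of the normalization, applied to both the query and the candidate before matching. The helper normalizeSeparators is hypothetical and simply rewrites every backslash to a forward slash.

```javascript
// Hypothetical helper: treat \ and / as the same separator by rewriting
// every backslash to a forward slash.
function normalizeSeparators(text) {
  return text.replace(/\\/g, '/');
}

// The backslashes are doubled here only because of string-literal escaping.
console.log(normalizeSeparators('JMS\\Serializer\\Annotation'));
// JMS/Serializer/Annotation
```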

Favour case sensitive matches

2015-06-09-133345_669x461_scrot OmniSharp/omnisharp-atom#380

So in this screenshot, I'd like the local variable diagnostics to be the first item in the list here as it is the first match that has a prefix with identical case.

Is this something that you might consider adding (maybe via an option) ?
