fwextensions / quick-score

A JavaScript string-scoring and fuzzy-matching library based on the Quicksilver algorithm, designed for smart auto-complete.

License: MIT License

JavaScript 99.04% CSS 0.96%
string-scoring smart-sort auto-complete fuzzy-search quicksilver

quick-score's Introduction

QuickScore

quick-score is a JavaScript string-scoring and fuzzy-matching library based on the Quicksilver algorithm, designed for smart auto-complete.

QuickScore improves on the original Quicksilver algorithm by tuning the scoring for long strings, such as webpage titles or URLs, so that the order of the search results makes more sense. It's used by the QuicKey extension for Chrome to enable users to easily find an open tab via search.

QuickScore is fast, dependency-free, and is just 2KB when minified and gzipped.

Demo

See QuickScore in action, and compare its results to other scoring and matching libraries.

Install

npm install --save quick-score

If you prefer to use the built library files directly instead of using npm, you can download them from https://unpkg.com/browse/quick-score/dist/.

Or you can load a particular release of the minified script directly from unpkg.com, and then access the library via the quickScore global:

<script src="https://unpkg.com/[email protected]/dist/quick-score.min.js"></script>
<script type="text/javascript">
    console.log(quickScore.quickScore("thought", "gh"));
</script>

Usage

Calling quickScore() directly

You can import the quickScore() function from the ES6 module:

import {quickScore} from "quick-score";

Or from a property of the CommonJS module:

const quickScore = require("quick-score").quickScore;

Then call quickScore() with a string and a query to score against that string. It will return a floating point score between 0 and 1. A higher score means that string is a better match for the query. A 1 means the query is the highest match for the string, though the two strings may still differ in case and whitespace characters.

quickScore("thought", "gh");   // 0.4142857142857143
quickScore("GitHub", "gh");    // 0.9166666666666666

Matching gh against GitHub returns a higher score than thought, because it matches the capital letters in GitHub, which are weighted more highly.

Sorting lists of strings with a QuickScore instance

A typical use-case for string scoring is auto-completion, where you want the user to get to the desired result by typing as few characters as possible. Instead of calling quickScore() directly for every item in a list and then sorting it based on the score, it's simpler to use an instance of the QuickScore class:

import {QuickScore} from "quick-score";

const qs = new QuickScore(["thought", "giraffe", "GitHub", "hello, Garth"]);
const results = qs.search("gh");

// results =>
[
    {
        "item": "GitHub",
        "score": 0.9166666666666666,
        "matches": [[0, 1], [3, 4]]
    },
    {
        "item": "hello, Garth",
        "score": 0.6263888888888888,
        "matches": [[7, 8], [11, 12]]
    },
    // ...
]

The results array in this example is a list of ScoredString objects that represent the results of matching the query against each string that was passed to the constructor. It's sorted high to low on each item's score. Strings with identical scores are sorted alphabetically and case-insensitively. In the simple case of scoring bare strings, each ScoredString item has three properties:

  • item: the string that was scored
  • score: the floating point score of the string for the current query
  • matches: an array of arrays that specify the character ranges where the query matched the string

This array could then be used to render a list of matching results as the user types a query.
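
For comparison, the manual approach that the QuickScore class replaces can be sketched as follows. This is only an illustration of the sorting rules described above (high to low on score, ties broken alphabetically and case-insensitively). The toyScore() function and the rankStrings() helper are hypothetical stand-ins, not part of the library, and toyScore() does not implement the Quicksilver algorithm.

```javascript
// Hypothetical stand-in scorer, NOT the quick-score algorithm: returns the
// fraction of the string covered by the query when the query is a
// case-insensitive substring, and 0 otherwise.
function toyScore(string, query) {
    const found = string.toLowerCase().includes(query.toLowerCase());
    return found ? query.length / string.length : 0;
}

// Score every string, then sort high to low on score, breaking ties
// alphabetically and case-insensitively.
function rankStrings(strings, query, scorer = toyScore) {
    return strings
        .map((item) => ({ item, score: scorer(item, query) }))
        .sort((a, b) =>
            b.score - a.score ||
            a.item.toLowerCase().localeCompare(b.item.toLowerCase())
        );
}

const ranked = rankStrings(
    ["thought", "Banana", "apple", "GitHub", "Art", "ant"],
    "a"
);

console.log(ranked.map(({ item }) => item));
```

Any function with a (string, query) signature could be passed as the third argument in place of the toy scorer.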

Sorting lists of objects

Typically, you'll be sorting items more complex than a bare string. To tell QuickScore which of an object's keys to score a query against, pass an array of key names or dot-delimited paths as the second parameter to the QuickScore() constructor:

const bookmarks = [
    {
        "title": "lodash documentation",
        "url": "https://lodash.com/docs"
    },
    {
        "title": "Supplying Images - Google Chrome",
        "url": "developer.chrome.com/webstore/images"
    },
    // ...
];
const qs = new QuickScore(bookmarks, ["title", "url"]);
const results = qs.search("devel");

// results =>
[
    {
        "item": {
            "title": "Supplying Images - Google Chrome",
            "url": "developer.chrome.com/webstore/images"
        },
        "score": 0.9138888888888891,
        "scoreKey": "url",
        "scores": {
            "title": 0,
            "url": 0.9138888888888891
        },
        "matches": {
            "title": [],
            "url": [[0, 5]]
        }
    },
    // ...
]

When matching against objects, each item in the results array is a ScoredObject, with a few additional properties:

  • item: the object that was scored
  • score: the highest score from among the individual key scores
  • scoreKey: the name of the key with the highest score, which will be an empty string if they're all zero
  • scoreValue: the value of the key with the highest score, which makes it easier to access if it's a nested string
  • scores: a hash of the individual scores for each key
  • matches: a hash of arrays that specify the character ranges of the query match for each key

When two items have the same score, they're sorted alphabetically and case-insensitively on the key specified by the sortKey option, which defaults to the first item in the keys array. In the example above, that would be title.

Each ScoredObject item also has a _ property, which caches transformed versions of the item's strings, and might contain additional internal metadata in the future. It can be ignored.
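
The relationship between scores, score, and scoreKey described above can be sketched with a small helper. The summarizeScores() function is hypothetical and is not the library's code. It only mirrors the documented behavior: the highest per-key score wins, and scoreKey is an empty string when every score is zero.

```javascript
// Pick the best key from a hash of per-key scores. Illustrative helper only,
// not code from the library.
function summarizeScores(scores) {
    let score = 0;
    let scoreKey = "";

    for (const [key, value] of Object.entries(scores)) {
        if (value > score) {
            score = value;
            scoreKey = key;
        }
    }

    return { score, scoreKey };
}

// The per-key scores from the bookmarks example above.
const summary = summarizeScores({ title: 0, url: 0.9138888888888891 });

// When nothing matches, scoreKey stays an empty string.
const noMatch = summarizeScores({ title: 0, url: 0 });

console.log(summary, noMatch);
```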

TypeScript support

Although the QuickScore codebase is currently written in JavaScript, the package comes with full TypeScript typings. The QuickScore class takes a generic type parameter based on the type of objects in the items array passed to the constructor. That way, you can access .item on the ScoredObject result and get back an object of the same type that you passed in.

Ignoring diacritics and accents when scoring

If the strings you're matching against contain diacritics on some of the letters, like à or ç, you may want to count a match even when the query string contains the unaccented forms of those letters. The QuickScore library doesn't contain support for this by default, since it's only needed with certain strings and the code to remove accents would triple its size. But it's easy to combine QuickScore with other libraries to ignore diacritics.

One example is the latinize npm package, which will strip accents from a string and can be used in a transformString() function that's passed as an option to the QuickScore constructor. This function takes a string parameter and returns a transformed version of that string:

// including latinize.js on the page creates a global latinize() function
import {QuickScore} from "quick-score";

const items = ["Café", "Cafeteria"];
const qs = new QuickScore(items, { transformString: s => latinize(s).toLowerCase() });
const results = qs.search("cafe");

// results =>
[
    {
        "item": "Café",
        "score": 1,
        "matches": [[0, 4]],
        "_": "cafe"
    },
    // ...
]

transformString() will be called on each of the searchable keys in the items array as well as on the query parameter to the search() method. The default function calls toLocaleLowerCase() on each string, for a case-insensitive search. In the example above, the basic toLowerCase() call is sufficient, since latinize() will have already stripped any accents.
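
If adding a dependency isn't desirable, a rough alternative to latinize is the built-in String.prototype.normalize() method. The sketch below is a different technique from the one shown above, and the stripDiacritics() name is hypothetical. It only removes combining marks, so letters that have no Unicode decomposition, such as ø or đ, pass through unchanged.

```javascript
// Decompose accented letters into a base letter plus combining marks (NFD),
// drop the combining marks in the U+0300 to U+036F range, then lowercase.
function stripDiacritics(string) {
    return string
        .normalize("NFD")
        .replace(/[\u0300-\u036f]/g, "")
        .toLowerCase();
}

const transformed = ["Café", "Cafeteria", "Français"].map(stripDiacritics);

console.log(transformed);
```

A function like this could presumably be passed as the transformString option in the same way as the latinize-based function above.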

Highlighting matched letters

Many search interfaces highlight the letters in each item that match what the user has typed. The matches property of each item in the results array contains information that can be used to highlight those matching letters.

The functional component below is an example of how an item could be highlighted using React. It surrounds each sequence of matching letters in a <mark> tag and then returns the full string in a <span>. You could then style the <mark> tag to be bold or a different color to highlight the matches. (Something similar could be done by concatenating plain strings of HTML tags, though you'll need to be careful to escape the substrings.)

function MatchedString({ string, matches }) {
    const substrings = [];
    let previousEnd = 0;

    for (let [start, end] of matches) {
        const prefix = string.substring(previousEnd, start);
        const match = <mark>{string.substring(start, end)}</mark>;

        substrings.push(prefix, match);
        previousEnd = end;
    }

    substrings.push(string.substring(previousEnd));

    return <span>{React.Children.toArray(substrings)}</span>;
}

The QuickScore demo uses this approach to highlight the query matches, via the MatchedString component.
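
The plain-HTML variant mentioned above might look something like this sketch. The escapeHTML() and highlightHTML() names are hypothetical. It assumes matches has the shape shown for bare strings: an array of [start, end] pairs in ascending order, with end being exclusive.

```javascript
// Escape the characters that are significant in HTML text and attributes.
function escapeHTML(string) {
    return string.replace(/[&<>"']/g, (char) => ({
        "&": "&amp;",
        "<": "&lt;",
        ">": "&gt;",
        '"': "&quot;",
        "'": "&#39;"
    })[char]);
}

// Wrap each [start, end) range in <mark> tags, escaping every substring
// separately so the offsets still refer to the original string.
function highlightHTML(string, matches) {
    let html = "";
    let previousEnd = 0;

    for (const [start, end] of matches) {
        html += escapeHTML(string.substring(previousEnd, start));
        html += `<mark>${escapeHTML(string.substring(start, end))}</mark>`;
        previousEnd = end;
    }

    return html + escapeHTML(string.substring(previousEnd));
}

const highlighted = highlightHTML("GitHub", [[0, 1], [3, 4]]);
const escaped = highlightHTML("a<b & c", [[2, 3]]);

console.log(highlighted); // <mark>G</mark>it<mark>H</mark>ub
console.log(escaped);     // a&lt;<mark>b</mark> &amp; c
```

Escaping each piece before concatenating, rather than escaping the whole string first, keeps the match offsets valid.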

API

See the API docs for a full description of the QuickScore class and the quickScore function.

License

MIT © John Dunning

quick-score's People

Contributors

fwextensions, john-dunning

quick-score's Issues

Odd behaviour in search results

Hi,

I have noticed that when searching for a word, results containing all of the letters in that word, even if they are separated by other letters, often appear above results where the word appears in its entirety.

For example when searching "ham" in the example below I would expect this to show the result with "ham" in it first? Is this a deliberate design feature? Is there a way of setting it to take into account the distance between letters?

Thanks for your help.

[Two attached screenshots (a1, a2) are not reproduced here.]

`setKeys()` and `setItems()` do not support ArrayLike Iterables because they use `[].concat()` for a shallow copy

If I call setItems() with something like a MobX ObservableArray, JavaScript treats it as an iterable object rather than a true array.
So when concat() is called at the beginning of setItems(), it doesn't flatten the items, because the argument isn't an actual array. The result is an array that contains a MobX array that contains the objects.

I know this is not exactly the library's fault, but it would be nice to support ArrayLike Iterables.

I suggest that the spread operator is used instead to copy:

[...items] as opposed to [].concat(items)
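
The difference between the two copies can be reproduced without MobX. The ListLike class below is a hypothetical stand-in for an observable array: it is array-like and iterable, but it is not an actual Array.

```javascript
// A minimal array-like iterable. Because it is not a real Array (and does
// not set Symbol.isConcatSpreadable), concat() treats it as a single value.
class ListLike {
    constructor(...values) {
        this.length = values.length;
        values.forEach((value, index) => { this[index] = value; });
    }

    *[Symbol.iterator]() {
        for (let i = 0; i < this.length; i++) {
            yield this[i];
        }
    }
}

const items = new ListLike("thought", "GitHub");

const viaConcat = [].concat(items); // one element: the ListLike itself
const viaSpread = [...items];       // ["thought", "GitHub"]
const viaFrom = Array.from(items);  // same contents as the spread copy

console.log(viaConcat.length, viaSpread.length, viaFrom.length); // 1 2 2
```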

Thanks!

Can't query long strings

I have a lot of input data, so I've been using quick-score instead of fuse.js to sort by query quickly. It works great until the string goes over 150 characters, at which point it hangs and never finishes.

Is there a config method I could set or is this just a bug?

Much appreciated

Dan

My code:

const { QuickScore } = require("quick-score");
const qs = new QuickScore(array, ["title", "url"]); // array is about 500-1000 items long and grows over time
qs.search("test") // about 2-9ms
qs.search("really long test") // never finishes

Possible place why:
/src/config.js

Limit result

Hi all,
How can I limit the number of result items when I run the search on an array with over 10,000 items?
Thanks
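
One simple way to cap the list, assuming the results are already sorted best-first as the README describes, is to slice the returned array. The limitResults() helper and the sample data below are illustrative only. Note that this trims the output after scoring, so it doesn't reduce the scoring work itself.

```javascript
// Keep only the first `limit` entries of an already-sorted results array.
function limitResults(results, limit) {
    return results.slice(0, Math.max(0, limit));
}

// Sample data shaped like the results shown in the README.
const sampleResults = [
    { item: "GitHub", score: 0.9166666666666666 },
    { item: "hello, Garth", score: 0.6263888888888888 },
    { item: "thought", score: 0.4142857142857143 }
];

const topTwo = limitResults(sampleResults, 2);

console.log(topTwo.map(({ item }) => item)); // [ 'GitHub', 'hello, Garth' ]
```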

Option to check similarity with unused/extraneous characters

When searching with a QuickScore instance, if the query contains characters that aren't in an item, the match simply fails, even if it's just one character off.

Example: the wanted word in the list is "Apple" and the query is "Xpple". Even though it's one character away from being a 1.0, it won't appear in the results.
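
For reference, tolerance to substituted characters is usually handled with an edit-distance measure, not a subsequence matcher. The sketch below uses the classic Levenshtein distance, which is a different technique from the Quicksilver-style scoring in this library. The editDistance() and similarity() names are hypothetical.

```javascript
// Dynamic-programming Levenshtein distance: the number of single-character
// insertions, deletions, or substitutions needed to turn `a` into `b`.
// Uses two rolling rows instead of a full matrix.
function editDistance(a, b) {
    let previous = Array.from({ length: b.length + 1 }, (_, j) => j);

    for (let i = 1; i <= a.length; i++) {
        const current = [i];

        for (let j = 1; j <= b.length; j++) {
            const cost = a[i - 1] === b[j - 1] ? 0 : 1;

            current[j] = Math.min(
                previous[j] + 1,        // deletion
                current[j - 1] + 1,     // insertion
                previous[j - 1] + cost  // substitution or match
            );
        }

        previous = current;
    }

    return previous[b.length];
}

// Normalize to a 0..1 similarity, where 1 means identical strings.
function similarity(a, b) {
    const longest = Math.max(a.length, b.length);
    return longest === 0 ? 1 : 1 - editDistance(a, b) / longest;
}

const distance = editDistance("xpple", "apple"); // 1
const closeness = similarity("xpple", "apple");

console.log(distance, closeness);
```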

Scores that are equal to minimumScore are excluded

This may or may not be a bug, or perhaps an ambiguity in the documentation.

If minimumScore = 0, items with score = 0 will not be returned in the result list. Setting a negative value is required.

Should this be >= instead of >? Or should the documentation reflect this logic?
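
The behavior being described can be illustrated with a plain filter. This sketch doesn't use the library. The first two scores come from the README, while the giraffe entry and its zero score are assumed purely for illustration.

```javascript
const scored = [
    { item: "GitHub", score: 0.9166666666666666 },
    { item: "thought", score: 0.4142857142857143 },
    { item: "giraffe", score: 0 } // assumed non-match, for illustration
];
const minimumScore = 0;

// Strict comparison: an item whose score equals the threshold is dropped.
const strict = scored.filter(({ score }) => score > minimumScore);

// Inclusive comparison: the same item is kept.
const inclusive = scored.filter(({ score }) => score >= minimumScore);

console.log(strict.length, inclusive.length); // 2 3
```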

Allow a threshold below which results are not returned

This is a really great library, thanks!

It would be useful if the user could specify a minimum score, below which results are not returned.

This would help with performance for those of us using this on large data sets.

Thanks,
Mike

Would this project be open to using TypeScript?

I work somewhere that has been using a 0.0.5 version (I think) of this library, but without use of const, and we are planning on swapping over to using this package full on.

And as much as the JSDoc comments are super useful, having either a declaration file or full TS support in the files would be super useful.

Would this project be open to me forking and converting it to TS?

Thanks!

[Feature request] Support scoring arrays of strings, in addition to single strings

Hi there! Thanks for this library. I've been trying out different search libraries for a project, and have been really impressed thus far with quickscore!

One feature that would be super helpful is support for keys that reference non-string values.

Example 1: Array

const bookmarks = [
    {
        "title": "lodash documentation",
        "url": "https://lodash.com/docs",
        "labels": ["work docs", "recently added"]
    },
    {
        "title": "Supplying Images - Google Chrome",
        "url": "developer.chrome.com/webstore/images",
        "labels": ["favorites", "hobby"]
    },
    ...
];

It'd be really useful to have support for const qs = new QuickScore(bookmarks, ["title", "labels"]);.

One quick solution would be to join the array to a string, but that has some difficulties with search matches. I.e., "work added" would match the first result with a high score, when it should probably have a lower score if each label was considered separately.
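
The difference between joining the labels and scoring them separately can be sketched as follows. The toyScore() function is a hypothetical stand-in, not the quick-score algorithm, and scoreLabels() is likewise just an illustration of taking the best per-label score.

```javascript
// Hypothetical stand-in scorer, NOT the quick-score algorithm: 1 when every
// whitespace-separated query word occurs in the string, otherwise 0.
function toyScore(string, query) {
    const haystack = string.toLowerCase();
    const words = query.toLowerCase().split(/\s+/).filter(Boolean);

    return words.length > 0 && words.every((word) => haystack.includes(word))
        ? 1
        : 0;
}

// Score each label on its own and keep the best individual score.
function scoreLabels(labels, query, scorer = toyScore) {
    return labels.reduce(
        (best, label) => Math.max(best, scorer(label, query)),
        0
    );
}

const labels = ["work docs", "recently added"];

const joinedScore = toyScore(labels.join(" "), "work added"); // 1
const separateScore = scoreLabels(labels, "work added");      // 0
const singleLabelScore = scoreLabels(labels, "work");         // 1

console.log(joinedScore, separateScore, singleLabelScore);
```

With the joined string, a query that spans two unrelated labels still matches, which is the problem described above.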

Example 2: Array of objects

This would be really cool, but also handleable by pre-processing if Example 1 is supported.

const bookmarks = [
    {
        "title": "lodash documentation",
        "url": "https://lodash.com/docs",
        "labels": [{"name": "work docs", "id": 1},{"name": "recently added", "id": 2}]
    },
    {
        "title": "Supplying Images - Google Chrome",
        "url": "developer.chrome.com/webstore/images",
        "labels": [{"name": "favorites", "id": 3},{"name": "hobbies", "id": 4}]
    },
    ...
];

const qs = new QuickScore(bookmarks, ["title", "labels.name"]);

I think fuse supports both of these use cases, but haven't seen other libraries that do. Would be awesome if quickscore supported it!

[Feature request] Option to ignore dots in key names and to score all keys, without specifying a list

First off: thanks for creating and sharing this wonderful library!

I'm having a bit of trouble searching through my objects. My objects use HTTP URLs as keys, and I want to search through them. Since I'm searching in objects, I have to pass an array of keys in my QuickScore options: new QuickScore(resourceObjArray, qsOpts).

However, if I pass an array of URLs, it will parse the dots inside them as paths for new keys. E.g. this key:

https://atomicdata.dev/properties/shortname

becomes:

https://atomicdata dev/properties/shortname

I'm not entirely sure if this is the reason why I can't search in my URL properties, but it seemed logical.

One possible solution is to change the keys option to use arrays instead of the dot syntax to denote subkeys. That would be a bit less convenient for most, but it would make the library usable with objects that have dots in their keys.
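
The ambiguity can be seen in a small path resolver. The getValue() function below is a hypothetical sketch, not the library's implementation. It accepts either a dot-delimited string or an array of key names, and the sample resource object is made up for the example.

```javascript
// Resolve a value from an object using either a dot-delimited string or an
// array of key names. Illustrative only.
function getValue(object, path) {
    const keys = Array.isArray(path) ? path : path.split(".");

    return keys.reduce(
        (value, key) => (value == null ? undefined : value[key]),
        object
    );
}

const shortnameKey = "https://atomicdata.dev/properties/shortname";
const resource = {
    [shortnameKey]: "shortname",
    meta: { label: "Short name" }
};

const nested = getValue(resource, "meta.label");     // "Short name"
const viaString = getValue(resource, shortnameKey);  // undefined, split on "."
const viaArray = getValue(resource, [shortnameKey]); // "shortname"

console.log(nested, viaString, viaArray);
```

Passing the key as a one-element array skips the split entirely, which is why an array form removes the conflict with dots in key names.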

quickScore is not defined

Hi,
I included the script from unpkg like this:

<script src="https://unpkg.com/browse/[email protected]/dist/quick-score.js"></script>

(I also tried 0.6)

but this is the result when I try to call it in the console:

quickScore("GitHub", "gh");
01:01:39.263 VM782:1 Uncaught ReferenceError: quickScore is not defined
at <anonymous>:1:1

Am I missing something?

Distribute as a JavaScript file?

Hi!

QuickScore looks very impressive and I really like how minimal it is. But I'm wondering if you can also distribute it as a ready JavaScript file?

My project doesn't use npm/node, and trying to build from the repository's source gives all kinds of errors on my computer. I tried to solve most of them, but I'm not that good at npm/node and the likes.

Yet I still would like and try to use QuickScore for my static website. 🙂

Option to ignore accents (diacritics)

I’m using quick-score with a language that uses accents/diacritics (Croatian). Sometimes I search with diacritics and sometimes without, so it would be nice to normalize the strings that are used when searching items.

Currently, I’m using node-diacritics to replace diacritics with standard ASCII characters, both in the search query and in the items, but this returns results where the diacritics are already removed instead of the original items.

Maybe add an option to transform the query and item strings?

import { QuickScore } from 'quick-score';
import diacritics from 'diacritics';
import traverse from 'traverse';

traverse(json).forEach(function(value) {
	if (this.notRoot) {
		if (typeof value === 'string') {
			this.update(diacritics.remove(value));
		}
	}
});

// …

search.addEventListener('input', (e) => {
	// read the query from the input element that fired the event
	const result = qs.search(diacritics.remove(e.target.value));
});

Question about original Quicksilver algorithm

In the README you mention that this implementation is based on the Quicksilver algorithm. Do you mind including a source for it? I can't seem to find anything about it anywhere.

Implement highlight for matches

QuickScore for list sorting could return a representation of the strings with highlighted matches in the extended values.

[
    {
        "item": "GitHub",
        "score": 0.9166666666666666,
        "matches": [[0, 1], [3, 4]],
        "highlighted": "<b>G</b>it<b>H</b>ub"
    },
    {
        "item": "hello, Garth",
        "score": 0.6263888888888888,
        "matches": [[7, 8], [11, 12]],
        "highlighted": "hello, <b>G</b>art<b>h</b>"
    },
    ...
]

Could use a function like this:

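function highlight(msg, matches)
{
  let m = msg
  // Work from the last range backwards so earlier offsets stay valid.
  // Copy the array first, because reverse() mutates it in place.
  for (const [start, end] of [...matches].reverse())
  {
    m = m.slice(0, start) + "<b>" + m.slice(start, end) + "</b>" + m.slice(end)
  }
  return m
}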
