fuzzy-search's Introduction

fuzzy-search

Fuzzy text searching plugin for Meteor based on the Levenshtein distance algorithm. You can use it to implement a "Did you mean..." feature in your program (and more).

The function mostSimilarString searches the values of a given cursor for a search string and returns the most similar one. If you have "beer", "juice" and "milk", searching for the string "bear" will return "beer". This also works with multiple words: if you search for "Nors Chuk" you will get "Chuck Norris".
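To illustrate the idea behind the "bear" to "beer" match, here is a minimal sketch of Levenshtein distance and a closest-match lookup over a plain array. It is not the package's source: the names levenshtein and closest are made up for this example, and the package's handling of multi-word strings and maxDistance is not reproduced here.

```javascript
// Illustrative only: a classic dynamic-programming Levenshtein distance.
// This is NOT the package's implementation; function names are hypothetical.
function levenshtein(a, b) {
  // prev[j] = distance between the first (i - 1) chars of a and the first j chars of b
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;

  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(
        prev[k] + 1,        // deletion
        curr[k - 1] + 1,    // insertion
        prev[k - 1] + cost  // substitution (or match when cost is 0)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the candidate with the smallest distance to searchString
// (empty string when there are no candidates).
function closest(candidates, searchString) {
  var best = "";
  var bestDist = Infinity;
  candidates.forEach(function (candidate) {
    var dist = levenshtein(searchString, candidate);
    if (dist < bestDist) {
      bestDist = dist;
      best = candidate;
    }
  });
  return best;
}

console.log(closest(["beer", "juice", "milk"], "bear")); // prints "beer"
```

"bear" and "beer" differ by a single substitution, so their distance is 1, which is lower than the distance to "juice" or "milk".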

How to install

  1. meteor add perak:fuzzy-search

How to use

The function searches the given cursor for the search string and returns the most similar value.

Syntax:

mostSimilarString(cursor, fieldName, searchString, maxDistance, caseSensitive)

Arguments:

cursor: Meteor cursor object

  • data type: object
  • default value: none

fieldName: field name to search in

  • data type: string
  • default value: none

searchString: string to search for

  • data type: string
  • default value: none

maxDistance: limits how dissimilar the returned string may be; useful in small datasets

  • data type: integer
  • default value: -1
  • -1 means "auto": the function automatically sets maxDistance based on searchString.
  • undefined means "no limit": searching for the string "beer" can then return "wife". We don't want that! :)

caseSensitive

  • data type: bool
  • default value: false

Return value:

The function returns the most similar word, or an empty string if no similar word is found (that is, if the distance of the best word is greater than maxDistance).

Example:

// Suppose we have a collection named "Drinks" which contains "beer", "juice" and "milk"

var searchString = "bear"; // user typed "bear" instead of "beer"

// search "Drinks" collection for string "bear"
var someCursor = Drinks.find({ drink_name: searchString });

// "bear" is not found, so we want to find the most similar word to give the user a suggestion (Did you mean...)
if(someCursor.count() === 0)
{
    // fetch the entire collection, limited to the drink_name field
    var tempCursor = Drinks.find({ }, { fields: { drink_name: 1 } });

    // find most similar string
    var bestWord = mostSimilarString(tempCursor, "drink_name", searchString, -1, false);

    // in this example, bestWord is "beer", show user a suggestion: "Did you mean beer?"
    // ...
}

History

0.1.9

  • The field name can now be in "dot notation", e.g. "data.name". Thanks to Tiago.

0.1.8

  • Added case-insensitive search

Contribute

Feel free to report issues and to request features or performance improvements.

License

MIT

fuzzy-search's People

Contributors

perak, techplexengineer, tfbrito


fuzzy-search's Issues

Getting multiple results?

Hello, this package is working very well for me, but instead of just showing "mostSimilarString" I'd like to show the most similar strings (plural). Specifically, I'm searching a list of contacts and I'd love to say: I don't see John Smith, did you mean "jane smith, jon smith, john myth, joan smith, jimmy smith"? Is there a way to get multiple results out of this?
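One possible approach, sketched here under assumptions, is to score every candidate and keep the closest few instead of only the best one. The helper mostSimilarStrings below is hypothetical and not part of the package; it works on a plain array of strings, compares case-insensitively, and uses a plain Levenshtein distance rather than the package's multi-word scoring.

```javascript
// Hypothetical helper, NOT part of perak:fuzzy-search.
// Plain Levenshtein distance between two strings.
function levenshtein(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(prev[k] + 1, curr[k - 1] + 1, prev[k - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return up to `limit` values ordered from most to least similar.
// maxDistance is optional; when given, values farther away are dropped.
function mostSimilarStrings(values, searchString, limit, maxDistance) {
  var needle = searchString.toUpperCase();
  return values
    .map(function (value) {
      return { value: value, distance: levenshtein(needle, String(value).toUpperCase()) };
    })
    .filter(function (entry) {
      return maxDistance === undefined || entry.distance <= maxDistance;
    })
    .sort(function (x, y) { return x.distance - y.distance; })
    .slice(0, limit)
    .map(function (entry) { return entry.value; });
}

var contacts = ["jane smith", "jon smith", "john myth", "bob jones"];
console.log(mostSimilarStrings(contacts, "John Smith", 3));
```

In a Meteor app the array could be built from a cursor first, for example with Contacts.find().map(function (doc) { return doc.name; }), where Contacts and name are placeholder names.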

Add more levels to search

Hi. I just noticed that searching in a nested field such as "question.name" won't work,

so I changed the code to support it.
From line 101 to 119.

changes are in lines 102 and 117

cursor.forEach(function(doc) {
    // resolve a dot-notation field name such as "question.name"
    var candidate = doc;
    var slice = fieldName.split('.');
    for(var i = 0; i < slice.length && candidate != null; i++) {
        candidate = candidate[slice[i]];
    }
    // skip documents where the field is missing or is not a string
    if(typeof candidate !== "string") return;

    if(!caseSensitive) candidate = candidate.toUpperCase();
    // split string into words
    var arrayB = candidate.split(" ");
    // calculate sum distance
    // if both strings are single words return simple distance
    var dist = 0;
    if(arrayA.length <= 1 && arrayB.length <= 1)
        dist = levenshteinDistance(searchString, candidate);
    else
        dist = levenshteinDistanceExt(arrayA, arrayB);

    if(dist < min_distance && dist < maxDistance)
    {
        min_distance = dist;
        best_word = candidate;
    }
});
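The dot-notation lookup in the snippet above can also be isolated into a small helper, which makes it easier to test on its own. This is only a sketch: getFieldValue is a made-up name, not a function from the package.

```javascript
// Hypothetical helper (not from the package): resolve a dot-notation
// field name such as "question.name" against a document.
function getFieldValue(doc, fieldName) {
  var parts = fieldName.split(".");
  var value = doc;
  for (var i = 0; i < parts.length; i++) {
    // stop early if an intermediate level is missing
    if (value === null || value === undefined) return undefined;
    value = value[parts[i]];
  }
  return value;
}

console.log(getFieldValue({ question: { name: "beer" } }, "question.name")); // prints "beer"
console.log(getFieldValue({ title: "no question here" }, "question.name")); // prints undefined
```

Returning undefined for a missing path lets the caller skip that document instead of failing on a property access.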

Multiple Words

Hello,

It is mentioned that you can search for multiple words: if you search for "Nors Chuk" you will get "Chuck Norris".

But when I search for a phrase such as "Nors Chuk", I get the whole stored phrase,
"Is Chuck Norris an actor?", instead of just "Chuck Norris".

Is there a way to limit the result to the same number of words as in the query? I tried:
candidate.split(" ").splice(0,searchcount).join(" ");

This gives me the correct number of words, but it doesn't give me the part of the phrase I need. Maybe I am doing it wrong?

Thanks
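One way to approach this, sketched below under assumptions, is to slide a window of as many words as the query has across the stored phrase and keep the window with the smallest distance. The helper names are hypothetical, and the word-by-word scoring here (each query word matched to its closest word in the window, ignoring order) is an assumption, not the package's levenshteinDistanceExt.

```javascript
// Hypothetical sketch, NOT the package's implementation.
function levenshtein(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(prev[k] + 1, curr[k - 1] + 1, prev[k - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Sum, over the query words, of the distance to the closest word in the window.
// Word order is ignored, so "Nors Chuk" can match "Chuck Norris".
function windowDistance(queryWords, windowWords) {
  var total = 0;
  queryWords.forEach(function (q) {
    var best = Infinity;
    windowWords.forEach(function (w) {
      best = Math.min(best, levenshtein(q, w));
    });
    total += best;
  });
  return total;
}

// Return the run of consecutive words in `phrase` that best matches `searchString`.
function bestMatchingWindow(phrase, searchString) {
  var queryWords = searchString.toUpperCase().split(/\s+/).filter(Boolean);
  var words = phrase.split(/\s+/).filter(Boolean);
  var size = Math.min(queryWords.length, words.length);
  var best = "";
  var bestDist = Infinity;
  for (var start = 0; start + size <= words.length; start++) {
    var windowWords = words.slice(start, start + size);
    var upper = windowWords.map(function (w) { return w.toUpperCase(); });
    var dist = windowDistance(queryWords, upper);
    if (dist < bestDist) {
      bestDist = dist;
      best = windowWords.join(" ");
    }
  }
  return best;
}

console.log(bestMatchingWindow("Is Chuck Norris an actor?", "Nors Chuk")); // prints "Chuck Norris"
```

Unlike taking the first N words with splice(0, searchcount), every position in the phrase is considered, so the match can start anywhere.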
