yomguithereal / clj-fuzzy

A handy collection of algorithms dealing with fuzzy strings and phonetics.

Home Page: http://yomguithereal.github.io/clj-fuzzy/

License: MIT License

JavaScript 0.09% Clojure 99.88% Shell 0.03%

clj-fuzzy's Introduction

clj-fuzzy

clj-fuzzy is a native Clojure library providing a collection of famous algorithms dealing with fuzzy strings and phonetics.

It can be used in Clojure, ClojureScript, client-side JavaScript and Node.js.

Deprecation warning

Consider this library deprecated for JavaScript.

Indeed, the Talisman library can be seen as an improvement over clj-fuzzy and is, what's more, written directly in JavaScript.

Full documentation

The full documentation for this library is available at http://yomguithereal.github.io/clj-fuzzy/.

Available algorithms

Metrics

Stemmers

Phonetics

Contribution

Please feel free to contribute by forking this repo. Just be sure to add relevant unit tests and pass them all before submitting any code.

License

MIT

clj-fuzzy's People

Contributors

arilitan, devth, kennyjwilli, masztal, tkocmathla, yomguithereal

clj-fuzzy's Issues

Levenshtein Distance Error On Empty Sequence

If the first sequence passed to the levenshtein distance function is empty, an exception is thrown:

(fuzzy/levenshtein "" "abc")
ClassCastException clojure.lang.LazySeq cannot be cast to clojure.lang.IPersistentStack  clojure.lang.RT.peek (RT.java:710)

Whereas:

(fuzzy/levenshtein "abc" "")
3

An empty sequence in the first position needs to be handled.
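One possible workaround, sketched here under assumptions, is to short-circuit the empty cases before delegating to the real distance function, since the edit distance to or from an empty sequence is simply the length of the other sequence. The names `guard-empty`, `fragile-distance` and `safe-distance` are hypothetical and not part of clj-fuzzy; `fragile-distance` only stands in for a function that fails on an empty first argument.

```clojure
;; Hypothetical helper, not part of clj-fuzzy: wraps any two-argument
;; distance function so that empty inputs never reach it.
(defn guard-empty
  [distance-fn]
  (fn [a b]
    (cond
      (empty? a) (count b)   ; distance from empty is the length of b
      (empty? b) (count a)   ; distance to empty is the length of a
      :else      (distance-fn a b))))

;; Stand-in for a distance function that throws on an empty first sequence.
(defn fragile-distance
  [a b]
  (if (empty? a)
    (throw (ex-info "empty first sequence" {:a a :b b}))
    :delegated))

(def safe-distance (guard-empty fragile-distance))

(safe-distance "" "abc")   ;=> 3
(safe-distance "abc" "")   ;=> 3
(safe-distance "ab" "cd")  ;=> :delegated
```

In real use, the library's own distance function would be passed to `guard-empty` in place of the stand-in.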

Spanish support?

There is no mention whatsoever of language support.
The Schinke stemmer is supposed to handle Latin, but it doesn't work as expected.
Thanks.

dice algorithm NaN

I am just trying out this library, and it seems the dice algorithm has some minor bugs (or I am not understanding it quite right):
[Screenshot taken 2014-11-13 at 09:16:01]

These are the results I am getting with strings of length 0 and length 1. Could this have anything to do with the input being characters rather than actual strings? Is that intended?
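For context, the usual bigram form of the Dice coefficient is 2|X ∩ Y| / (|X| + |Y|), where X and Y are the sets of adjacent character pairs. Strings of length 0 or 1 have no bigrams, so the formula reduces to 0/0, which is one way a NaN result can arise. The sketch below is not the library's implementation; the function names and the convention chosen for the degenerate cases (1.0 for equal inputs, 0.0 otherwise) are assumptions.

```clojure
(require '[clojure.set :as set])

(defn bigrams
  "Returns the set of adjacent character pairs in string `s`."
  [s]
  (set (partition 2 1 s)))

(defn dice
  "Bigram Dice coefficient with explicit handling of inputs that
  produce no bigrams (assumed convention, see lead-in)."
  [a b]
  (let [x     (bigrams a)
        y     (bigrams b)
        total (+ (count x) (count y))]
    (cond
      (= a b)       1.0   ; identical inputs, including two empty strings
      (zero? total) 0.0   ; no bigrams on either side: avoid 0/0
      :else         (/ (* 2.0 (count (set/intersection x y))) total))))

(dice "night" "nacht")  ;=> 0.25 (one shared bigram, "ht", out of 4 + 4)
(dice "a" "b")          ;=> 0.0
(dice "" "")            ;=> 1.0
```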

Documentation website outdated

Hi,

I kept running into the "dependency not found" issue in ClojureScript until I found out I was using an outdated version of the lib, 0.1.8, after following the install steps documented at http://yomguithereal.github.io/clj-fuzzy/clojure.html.

Looking more closely, the website indicates "Currently v0.3.2" in the sidebar and 0.1.8 on the Clojure install page, both of which are wrong according to Clojars.

Is there anything I can do to help?

Thanks!

Levenshtein distance performance

The levenshtein/distance function performs very poorly even on short strings. Is this a known issue? The call below takes about 10 seconds on a MacBook Pro (3 GHz Intel Core i7) running Java 8.

user=> (time (clj-fuzzy.levenshtein/distance "feature" "get-project-features"))
"Elapsed time: 10438.251547 msecs"
13
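For comparison, the standard dynamic-programming formulation of Levenshtein distance runs in time proportional to the product of the two lengths and only needs to keep the previous row in memory. The sketch below is an illustration of that technique, not the library's code; `levenshtein-rows` is a hypothetical name.

```clojure
(defn levenshtein-rows
  "Levenshtein distance computed one row at a time:
  O(m*n) time and O(n) space for inputs of length m and n."
  [a b]
  (let [b (vec b)
        n (count b)]
    (peek
      (reduce
        (fn [prev [i ca]]
          ;; Build the row for character `ca` of `a` from the previous row.
          (reduce
            (fn [row j]
              (let [cost (if (= ca (b j)) 0 1)]
                (conj row (min (inc (prev (inc j)))     ; deletion
                               (inc (peek row))         ; insertion
                               (+ (prev j) cost)))))    ; substitution
            [(inc i)]                                   ; distance to empty prefix of b
            (range n)))
        (vec (range (inc n)))                           ; row for empty prefix of a
        (map-indexed vector a)))))

(levenshtein-rows "feature" "get-project-features")  ;=> 13
(levenshtein-rows "" "abc")                          ;=> 3
```

Because the first row is seeded with the distances from the empty prefix, empty inputs on either side need no special case in this formulation.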

Big-O Performance

Great to see a library like this. I would love to see the Big-O performance of each fuzzy algorithm displayed so I know what size of data I can use it for, and maybe some advice about pros and cons.

I'm doing some fuzzy matching for accounting purposes ("McDonalds": "is this a business expense? probably not") and I wouldn't know which algorithm to pick to save the most time.

issue using this project as a dependency in clojurescript

Hello! Thanks for writing this cool library :).

I am using clojurescript version "0.0-3165" and clojure version "1.7.0-beta1" and I am unable to depend on clj-fuzzy from clojurescript. If it makes a difference, I am using the boot build tool.

Clojurescript should be a dev dependency

Onyx (https://github.com/onyx-platform/onyx) uses clj-fuzzy as a dependency. However, we have to exclude ClojureScript, as it is an unnecessary dependency for Clojure users and can cause conflicts in our users' projects. I think you would be best served by making it a dev dependency, as any of your cljs users will need ClojureScript as a dependency anyway.

Thanks for a great project!
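As an illustration of the two options discussed above, here is a Leiningen sketch. The project name `my-app` and every `"x.y.z"` version are placeholders, not real values; the first fragment is the consumer-side exclusion, and the second shows how a library could scope ClojureScript to its `:dev` profile.

```clojure
;; Consumer side: project.clj of a project depending on clj-fuzzy,
;; excluding the transitive ClojureScript dependency.
;; "my-app" and the "x.y.z" versions are placeholders.
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "x.y.z"]
                 [clj-fuzzy "x.y.z" :exclusions [org.clojure/clojurescript]]])

;; Library side (fragment): ClojureScript declared only in the :dev profile,
;; so it is not pulled in transitively by consumers.
;; :profiles {:dev {:dependencies [[org.clojure/clojurescript "x.y.z"]]}}
```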
