Comments (7)
Hello @Globegitter,
Well, this is rather unfortunate. I'll take a look as soon as possible. Does this bug apply to version `0.1.9` or `0.1.8`?
For other cases, does the algorithm work correctly?
from clj-fuzzy.
@Yomguithereal It applies to both. Otherwise it seems to be working really well - thanks for the library, it is really useful.
@Globegitter,
I've checked this and can confirm that the bug comes from the Clojure part and therefore carries over to its JavaScript counterpart.
I can fix it but I have a problem here and you might be able to help me:
The Dice coefficient works on bigrams. So, traditionally, comparing `h` with `h` returns 0, which is nonsense since both strings are identical: a one-character string has no bigrams, so there is nothing to overlap.
So here is the choice I have to make:
- Follow most classical mathematical implementations of the algorithm and accept that the coefficient is nonsensical for strings shorter than two characters.
- Write a finer implementation that handles this edge case and returns similarities that make intuitive sense.
Any opinion?
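To make the two options concrete, here is a minimal Python sketch. It is not the clj-fuzzy code: the function names are hypothetical, and it assumes bigrams are compared as sets, which may differ from what the library does.

```python
def bigrams(s):
    """Return the adjacent character pairs of s (empty if len(s) < 2)."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice_classical(a, b):
    """Option 1: textbook Dice coefficient over sets of bigrams."""
    x, y = set(bigrams(a)), set(bigrams(b))
    if not x and not y:
        # Neither string has a bigram, so the overlap is reported as 0.
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def dice_with_edge_cases(a, b):
    """Option 2: short-circuit the cases where bigrams are meaningless."""
    if a == b:
        # Identical strings are fully similar, whatever their length.
        return 1.0
    if len(a) < 2 or len(b) < 2:
        # Different strings, at least one too short to have a bigram.
        return 0.0
    return dice_classical(a, b)


print(dice_classical("h", "h"))        # 0.0
print(dice_with_edge_cases("h", "h"))  # 1.0
print(dice_classical("night", "nacht"))  # 0.25
```

For strings of two or more characters both functions agree; they only differ on the short-string edge case discussed above.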
I've fixed the implementation. You can install the latest dev version with the following command for node if needed:
npm i git+https://github.com/Yomguithereal/clj-fuzzy.git
Oh that is great thank you! How did you resolve it then?
I went with the second choice. I found other libraries, notably in Python, that prefer to fix the rationale of the algorithm, so I did the same: comparing `h` with `h` now returns `1.0`.
Awesome thank you, will test asap.