Comments (10)
Okay, the following works.
Two entries:
mandymoore
mandynomore
Search for mandy and both come up.
But with [email protected] and [email protected],
a search for mandy won't bring up mandy2.
The problem seems to occur whenever there is a digit in the string;
partial matching fails then.
from tntsearch.
Let me try to explain how it works and why the word mandy2 won't come up.
Before saving each word to the index, we do a process called stemming, which basically reduces a word to its common root. Since documents use different forms of a word, such as organize, organizes, and organizing, stemming is needed to give you meaningful results rather than an empty result set when you don't know the right word form.
The default stemmer that we use for TNTSearch is the PorterStemmer. If you feed it the word mandy, the stem becomes mandi, and if you feed it mandy2, it stays mandy2.
So to sum it up, it treats mandy and mandy2 as two different words.
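To make the explanation above concrete, here is a minimal Python sketch of why the two stems never meet in the index. It is not TNTSearch's code: `step_1c` and `build_index` are hypothetical names, and `step_1c` implements only the single Porter rule that matters here (a final y becomes i when the rest of the word contains a vowel), not the full stemmer.

```python
VOWELS = set("aeiou")


def step_1c(word):
    """Simplified Porter step 1c: final 'y' -> 'i' if the rest has a vowel.

    Assumption: this ignores the Porter rule that 'y' can itself count
    as a vowel, and it skips every other stemming step.
    """
    stem = word[:-1]
    if word.endswith("y") and any(ch in VOWELS for ch in stem):
        return stem + "i"
    return word


def build_index(words):
    """Map each stemmed term to the ids of the entries that contain it."""
    index = {}
    for doc_id, word in enumerate(words):
        index.setdefault(step_1c(word), []).append(doc_id)
    return index


index = build_index(["mandy", "mandy2"])
print(index)                            # {'mandi': [0], 'mandy2': [1]}
print(index.get(step_1c("mandy"), []))  # [0]
```

The query mandy is stemmed to mandi before the lookup, so an exact term match only reaches entry 0; mandy2 ends in a digit, is left unchanged, and sits under its own key.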
from tntsearch.
Thanks for the explanation.
But that means if someone signs up on the site as [email protected] and I search for jennifer, I won't get back any results.
from tntsearch.
That's correct, because the text isn't split on numbers. However, you can change this by
implementing your own Tokenizer. Here's the current implementation:
https://github.com/teamtnt/tntsearch/blob/master/src/Support/Tokenizer.php
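As a rough illustration of the idea (in Python, not the PHP interface TNTSearch expects), a tokenizer could emit runs of letters and runs of digits as separate tokens. The `tokenize` function, the pattern, and the address `jennifer84@example.com` are all made up for this sketch; a real replacement would have to follow the Tokenizer class linked above.

```python
import re

# Hypothetical pattern: a run of letters OR a run of digits.
# [^\W\d_] means "a word character that is neither a digit nor an underscore".
TOKEN_PATTERN = re.compile(r"[^\W\d_]+|\d+")


def tokenize(text):
    """Lowercase the text and split it into letter-only and digit-only tokens."""
    return TOKEN_PATTERN.findall(text.lower())


print(tokenize("jennifer84@example.com"))
# ['jennifer', '84', 'example', 'com']
```

With tokens like these, jennifer is indexed as its own term, so a search for jennifer can match the entry even though the original string has digits attached to the name.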
from tntsearch.
I've got a similar problem, but I'm not entirely sure it has anything to do with the current explanation, given that fuzziness is enabled.
I've debugged the execution and it seems that some words are simply ignored.
The problem is in the method 'getAllDocumentsForKeyword' (TNTSearch.php:219). In my example 'getWordlistByKeyword' returns 2 words with obviously different ids, but when binding the value, $word[0]['id'] is used, meaning that only one word is used. In my opinion the select statement should search for all of the found words / ids. I didn't check at what point the tokenizer is applied, but this probably isn't the expected behaviour.
from tntsearch.
Can you provide a concrete example? I'll try to explain.
from tntsearch.
Sure. I have some screenshots, which should make it easier. The search phrase is 521 and the keywords saved in the TNT db are 521c and 521m. I'm using the default fuzzy search config: distance = 2, prefix = 2.
In my case the second keyword is ignored, because the select statement always returns only the first id.
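The difference between the two lookups can be sketched with an in-memory SQLite database. This is a hedged Python illustration, not the library's PHP code: the table layout and function names are assumptions, and a plain `LIKE` prefix match stands in for the real fuzzy matching.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wordlist (id INTEGER PRIMARY KEY, term TEXT);
    CREATE TABLE doclist (term_id INTEGER, doc_id INTEGER);
    INSERT INTO wordlist VALUES (1, '521c'), (2, '521m');
    INSERT INTO doclist VALUES (1, 10), (2, 20);
""")


def matching_word_ids(keyword):
    """Stand-in for the fuzzy word lookup: every term starting with keyword."""
    rows = conn.execute(
        "SELECT id FROM wordlist WHERE term LIKE ? ORDER BY id", (keyword + "%",)
    )
    return [row[0] for row in rows]


def docs_first_word_only(keyword):
    """Reported behaviour: only the first matching word id is bound."""
    ids = matching_word_ids(keyword)
    if not ids:
        return []
    rows = conn.execute("SELECT doc_id FROM doclist WHERE term_id = ?", (ids[0],))
    return [row[0] for row in rows]


def docs_all_words(keyword):
    """Suggested behaviour: select documents for every matching word id."""
    ids = matching_word_ids(keyword)
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    rows = conn.execute(
        f"SELECT doc_id FROM doclist WHERE term_id IN ({placeholders}) ORDER BY doc_id",
        ids,
    )
    return [row[0] for row in rows]


print(docs_first_word_only("521"))  # [10]
print(docs_all_words("521"))        # [10, 20]
```

Binding only the first id drops the document indexed under 521m, which matches what I'm seeing; selecting with `IN` over all matched ids returns both.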
from tntsearch.
@Sciyguy ok, I understand. We should probably add this only when fuzziness is enabled
from tntsearch.
Closing this one, since the issue should be solved if fuzziness is enabled.
from tntsearch.
There seems to be a bug here: with asYouType enabled, a search should find both mandy and mandy2, which it does, until you enable fuzziness, and then it doesn't find any partial word.
from tntsearch.