Comments (10)
Hello Daniel,
Apologies for the delay in getting back to you regarding this. Are you still having this problem?
Did you split the fingerprint codes into overlapping segments upon ingestion or are you ingesting full codes? In other words, how did you run the ingestion, and was split=True or split=False set when calling fp.py:ingest?
What happens when you make rapid, repeated queries of the Solr index? Do all the queries take a long time or
Andrew
from echoprint-server.
Hello Andrew,
We are still seeing this problem, and it has been consistent across upgrades.
Repeated queries for exactly the same result set get faster, but not significantly faster. The fastest we have seen is 1.9 seconds. The slowest was 28 seconds, and it was the only query running on the system at the time.
We ingested full length codes and we used split=True (as defined in fp.py).
All the queries take a long time.
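To compare first-query and repeated-query latency as described above, a small timing harness can help. This is only a sketch: `time_repeated` and `summarize` are hypothetical helper names, and the stand-in `fake_query` would need to be replaced with whatever call actually issues the lookup against your server.

```python
import time
from statistics import median


def time_repeated(query_fn, repeats=5):
    """Call query_fn `repeats` times; return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        query_fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Separate the first (possibly cold-cache) call from the rest of the run."""
    return {
        "first": durations[0],
        "min": min(durations),
        "max": max(durations),
        "median": median(durations),
    }


if __name__ == "__main__":
    # Stand-in for a real lookup; replace with your own query call (assumption).
    def fake_query():
        time.sleep(0.01)

    print(summarize(time_repeated(fake_query, repeats=5)))
```

If `first` is much larger than `median`, caching is masking the underlying query cost; if all values are similar, every query is paying the full cost.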
I experience exactly the same issue... Did you find a solution @danielbenzvi ?
Lately we have been investigating this issue in detail. We have found that when the Solr database becomes very large, query performance can indeed suffer in this way if the entire index is deployed on a single Solr core on one server.
We've found various solutions that have helped tremendously in reducing the time required to perform a query (e.g., improvements in time of about an order of magnitude). One of these solutions involves sharding, which requires a more complicated Solr setup. Another solution involves changes to the way the fingerprints are actually indexed and queried. We have had great success in running these improvements on our servers that are behind the song/identify method on our API.
We will push source code to GitHub when it is ready, for example to make a more sophisticated Solr configuration easier to deploy out of the box (no ETA yet). But this will most likely involve large changes to the back end rather than tweaks to the current setup.
Andrew, could you provide a little more detail on how you've adjusted the indexing and queries to improve the Solr query times? I'm struggling to get an acceptable response time for a large collection and am interested in any direction you may be able to provide. Thanks.
Same problem here, guys: Solr is taking too long to answer. I get response times of around 5 seconds per query. I used the patches suggested by Justin Haygood (https://groups.google.com/forum/#!topic/echoprint/J7MQftCfpCM), which improved the recognition significantly. Any ideas?
Increasing the density of hash codes will improve the OTA recognition rate, but this will also make the Solr part of the search significantly slower.
The overall ideas for improving the scalability of the index are the following:
- reducing the number of hash codes by omitting uninformative hash codes from the index and from queries
- improving the efficiency of the search algorithm in Solr (Solr 4.x already has a patch for this, but to use it we need to upgrade from Solr 1.4 to Solr 4)
- changing the architecture of the index itself
We've already tried the first approach. It improves the results but it is a hack, and the other approaches are better.
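The thread does not define what makes a hash code "uninformative". One possible reading, offered here purely as an assumption and not as the method actually used, is a code that occurs in a large fraction of all tracks, much like a stop word in text search. Under that assumption, a minimal sketch of the filtering step might look like this (`document_frequency`, `drop_common_codes`, and `max_fraction` are hypothetical names):

```python
from collections import Counter


def document_frequency(tracks):
    """Count, for each hash code, how many tracks contain it.

    `tracks` maps a track id to an iterable of integer hash codes.
    """
    df = Counter()
    for codes in tracks.values():
        df.update(set(codes))  # count each code at most once per track
    return df


def drop_common_codes(codes, df, n_tracks, max_fraction=0.5):
    """Keep only codes seen in at most `max_fraction` of the tracks.

    Assumption: very common codes carry little discriminative value.
    """
    limit = max_fraction * n_tracks
    return [c for c in codes if df[c] <= limit]


if __name__ == "__main__":
    tracks = {"a": [1, 2, 3], "b": [1, 4], "c": [1, 2, 5]}
    df = document_frequency(tracks)
    print(drop_common_codes([1, 2, 9], df, len(tracks), max_fraction=0.7))
```

The same filter would have to be applied both at ingestion and at query time, so that the index and the queries agree on which codes exist.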
Thanks for the fast answer Andrew! I actually ran a few tests that surprised me (with Justin Haygood's patch):
- the OTA recognition quality is on par with some proprietary stuff that I tried in the past
- Every time I run a query (it doesn't matter if I have only 1 song, 100 or 2000 in the DB) it always takes around 4 or 5 seconds. If I run the same query again it takes around 20 ms on my local machine. How can this be happening? Shouldn't it be blazing fast with a basically empty database?
So replacing Solr with the newest version should speed it up, right? I'll give it a try :)
Thanks for the help!
Zé
Hi Andrew, Thanks for the response. I wasn't able to reach the C experimental repo either. I'd be interested in taking a look.
I did take a look at migrating to Solr 4.x, but it looks like there are a few functions that have been deprecated that prevent the hashr from compiling. I did try using a version from another user that utilizes Maven to compile other versions, but it would only compile to 3.x.
When you mention an uninformative hash, could you help me understand what type of hash value would be uninformative?
Thanks.
I am having the same issue here. Does anyone have a solution?