
Comments (10)

alnesbit commented on July 16, 2024

Hello Daniel,

Apologies for the delay in getting back to you regarding this. Are you still having this problem?

Did you split the fingerprint codes into overlapping segments upon ingestion or are you ingesting full codes? In other words, how did you run the ingestion, and was split=True or split=False set when calling fp.py:ingest?

What happens when you make rapid, repeated queries of the Solr index? Do all the queries take a long time or

Andrew

from echoprint-server.

danielbenzvi commented on July 16, 2024

Hello Andrew,
We are still seeing this problem, and it has persisted through upgrades.

Repeating a query for exactly the same result set makes it faster, but not significantly so. The lowest we can get is 1.9 seconds. The highest was 28 seconds, and that was the only query running on the system at the time.
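For anyone who wants to compare first-query and repeat-query latency in the same way, here is a minimal timing sketch. It does not call Solr or fp.py: `fake_query` is a hypothetical stand-in that simulates a slow first lookup followed by cached ones, and `time_repeated` and `summarize` are illustrative names, not part of echoprint-server. Swap `fake_query` for whatever callable performs your real lookup.

```python
import time

def time_repeated(query_fn, repeats=5):
    """Call query_fn `repeats` times and return the wall-clock time of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        query_fn()
        timings.append(time.perf_counter() - start)
    return timings

def summarize(timings):
    """Separate the first (cold) call from the repeated (warm) calls.

    Needs at least two timings.
    """
    return {
        "first": timings[0],
        "best_repeat": min(timings[1:]),
        "worst": max(timings),
    }

# Hypothetical stand-in for a real lookup: slow the first time, cached afterwards.
_cache = {}

def fake_query():
    if "result" not in _cache:
        time.sleep(0.05)  # simulate a cold, uncached query
        _cache["result"] = "TR1"
    return _cache["result"]

timings = time_repeated(fake_query, repeats=5)
summary = summarize(timings)
print({name: round(seconds, 4) for name, seconds in summary.items()})
```

If the first call is much slower than the repeats, caching is hiding the real cost. If the repeats are also slow, as reported here, the time is going into the search itself.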

We ingested full length codes and we used split=True (as defined in fp.py).

All the queries take a long time.


Tomtomgo commented on July 16, 2024

I experience exactly the same issue... Did you find a solution @danielbenzvi ?


alnesbit commented on July 16, 2024

Lately we have been investigating this issue in detail and have found that query performance can indeed suffer in this way when the Solr database becomes very large and the entire index is deployed onto a single Solr core on one server.

We've found various solutions that have helped tremendously in reducing the time required to perform a query (e.g., improvements in time of about an order of magnitude). One of these solutions involves sharding, which requires a more complicated Solr setup. Another solution involves changes to the way the fingerprints are actually indexed and queried. We have had great success in running these improvements on our servers that are behind the song/identify method on our API.

We will push out source code when it is ready for GitHub, for example, to make a more sophisticated Solr configuration easier to deploy out-of-the-box (no ETA yet). But this will most likely involve large changes to the back end rather than tweaking the current setup.


ranger123 commented on July 16, 2024

Andrew, could you provide a little more detail on how you've adjusted the indexing and queries to improve the Solr query times? I'm struggling to get an acceptable response time for a large collection and am interested in any direction you may be able to provide. Thanks.


zemariamm commented on July 16, 2024

Same problem here, guys: Solr is taking too long to answer. I get response times of around 5 seconds per query. I used the patches suggested by Justin Haygood (https://groups.google.com/forum/#!topic/echoprint/J7MQftCfpCM), which improved the recognition significantly. Any ideas?


alnesbit commented on July 16, 2024

Increasing the density of hash codes will improve the OTA recognition rate, but this will also make the Solr part of the search significantly slower.

The overall ideas for improving the scalability of the index are the following:

  • reducing the number of hash codes by omitting uninformative hash codes from the index and queries
  • improving the efficiency of the search algorithm in Solr (Solr 4.x already has a patch for this, but to use it we obviously need to upgrade from Solr 1.4 to Solr 4)
  • changing the architecture of the index itself

We've already tried the first approach. It improves the results but it is a hack, and the other approaches are better.
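To make the first idea concrete, here is a rough sketch of one possible reading of it. It assumes that "uninformative" means a hash code that appears in a large fraction of all tracks, so that it barely helps tell tracks apart. That definition, the `max_fraction` threshold and all function names are assumptions for illustration only; this is not the code running behind the API.

```python
from collections import Counter

def document_frequencies(tracks):
    """Count in how many tracks each hash code appears (not how often overall)."""
    df = Counter()
    for codes in tracks.values():
        df.update(set(codes))  # set(): count each code once per track
    return df

def uninformative_codes(tracks, max_fraction=0.5):
    """Assumed definition: codes present in more than max_fraction of all tracks."""
    df = document_frequencies(tracks)
    limit = max_fraction * len(tracks)
    return {code for code, count in df.items() if count > limit}

def filter_codes(codes, stop_codes):
    """Drop stop codes; apply the same filter at ingestion and at query time."""
    return [code for code in codes if code not in stop_codes]

# Toy data: track id -> list of hash codes (made-up values).
tracks = {
    "TR1": [10, 20, 30, 10],
    "TR2": [10, 40, 50],
    "TR3": [10, 20, 60],
}

stop_codes = uninformative_codes(tracks, max_fraction=0.5)
filtered = {tid: filter_codes(codes, stop_codes) for tid, codes in tracks.items()}
print(sorted(stop_codes))
print(filtered)
```

In this toy data, codes 10 and 20 each appear in more than half of the tracks and are dropped, leaving only codes that are unique to a track. A real threshold would need tuning against recognition accuracy, since removing too many codes hurts matching.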


zemariamm commented on July 16, 2024

Thanks for the fast answer Andrew! I actually ran a few tests that surprised me (with Justin Haygood's patch):

  • the OTA recognition quality is on par with some proprietary stuff that I tried in the past
  • Every time I run a query (it doesn't matter whether I have 1 song, 100 or 2000 in the DB) it always takes around 4 or 5 seconds. If I run the same query again it takes around 20 ms on my local machine. How can this be happening? Shouldn't it be blazing fast with a basically empty database?

So replacing Solr with the newest version should speed it up, right? I'll give it a try :)

Thanks for the help!
Zé


ranger123 commented on July 16, 2024

Hi Andrew, Thanks for the response. I wasn't able to reach the C experimental repo either. I'd be interested in taking a look.

I did take a look at migrating to Solr 4.x, but it looks like there are a few functions that have been deprecated that prevent the hashr from compiling. I did try using a version from another user that utilizes Maven to compile other versions, but it would only compile to 3.x.

When you mention an uninformative hash, could you help me understand what type of hash value would be uninformative?
Thanks.


danicuki commented on July 16, 2024

I am having the same issue here. Does anyone have a solution?

