
echoprint-server's Introduction

Please note: this code is now deprecated.

Please see the latest at Spotify's GitHub.

Server components for Echoprint.

Echoprint is an open source music fingerprint and resolving framework powered by The Echo Nest. The code generator (a library that converts PCM samples from a microphone or file into Echoprint codes) is MIT licensed and free for any use. The server component that stores and resolves queries is Apache licensed and free for any use. The data for resolving to millions of songs is free for any use provided any changes or additions are merged back to the community.

Read more about Echoprint here.

What is included

The Echoprint server is a custom component for Apache Solr that indexes Echoprint codes and hash times. To keep the index fast, the Echoprint codes are stored in a Tokyo Tyrant key/value store. We also include the Python API layer code needed to match tracks based on the response from the custom component, as well as a demo (non-production) API meant to illustrate how to set up and run the Echoprint service.

Non-included requirements for the server:

Additional non-included requirements for the demo:

  • web.py

What's inside

API/ - python libraries for querying and ingesting into the Echoprint server
API/api.py - web.py sample API wrapper for evaluation
API/fp.py - main python module for Echoprint
API/solr.py - Solr's python module (with slight enhancements)

examples/lookup.py - an example fingerprint and lookup of a query

Hashr/ - java project for a custom solr field type to handle Echoprint data

solr/ - complete solr install with Hashr already in the right place and with the right schema and config to make it work.

util/ - Utilities for importing and evaluating Echoprint
util/fastingest.py - import codes into the database
util/bigeval.py - evaluate the search accuracy of the database

How to run the server

  1. Start the server like this (change your directory to where you have echoprint-server/solr/solr)

     cd echoprint-server/solr/solr
     java -Dsolr.solr.home=/home/path/to/echoprint-server/solr/solr/solr/ -Djava.awt.headless=true -jar start.jar
    

    If you run this server somewhere other than localhost, update the pointer to it in fp.py:

     _fp_solr = solr.SolrConnection("http://localhost:8502/solr/fp")
    
  2. Start the Tokyo Tyrant server.

     ttservctl start
    

    Again, if the location of the TT server differs, update fp.py:

     _tyrant_address = ['localhost', 1978]
    

Running in Python

fp.py has all the methods you'll need.

>>> import fp
>>> fp.ingest({"track_id": "my_track_id", "fp": "123 40 123 60 123 80 123 90 123 110 123 130", "length": "120", "codever": "4.12"})
>>> fp.commit()
>>> r = fp.best_match_for_query("123 40 124 60 125 80 126 90 127 110 128 130 129 60 123 40 127 50")
>>> r.message()
'query code length is too small'
>>> example_code = "eJwty7kNADAMw8BVNILl-Mv-iwWCU11D0g_CQA-USIwoXNEg5YBH3o3-0sil7AHIrAyw"
>>> r = fp.best_match_for_query(example_code)
>>> r.message()
'OK (match type 3)'
>>> r.TRID
'my_track_id'
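
The plain code strings passed to fp.ingest and fp.best_match_for_query above appear to alternate a hash code with its time offset. The following is a minimal sketch of reading that layout, assuming the alternating "code time" order; parse_code_string is a hypothetical helper written for Python 3 and is not part of fp.py.

```python
def parse_code_string(code_string):
    """Split a space-separated "code time code time ..." string into pairs.

    Assumes codes and time offsets alternate, as the example strings suggest.
    """
    values = [int(token) for token in code_string.split()]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of values (code/time pairs)")
    # Even positions hold codes, odd positions hold the matching time offsets.
    return list(zip(values[0::2], values[1::2]))


pairs = parse_code_string("123 40 123 60 123 80 123 90 123 110 123 130")
print(pairs)
```

Working with explicit pairs makes it easier to see how few codes the short example contains.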

Running the example API server

  1. Run the api.py webserver as a test

     cd API
     python api.py 8080
    
  2. Ingest codes with http://localhost:8080/ingest:

    POST the following variables:

     fp_code : packed code from codegen
     track_id : if you want your own track_ids. If you don't give one we'll generate one.
     length : the length of the track in seconds
     codever : the version of the codegen
     artist : the artist of the track (optional)
     release : the release of the track (optional)
     track : the track name (optional)
    

    For example:

     curl http://localhost:8080/ingest -d "fp_code=eJx1W...&track_id=thisone&length=300&codever=4.12"
    
  3. Query with http://localhost:8080/query?fp_code=XXX

    POST or GET the following:

     fp_code : packed code from codegen
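
    The two requests above can also be assembled in Python. This is a minimal sketch for Python 3 using only the standard library; build_ingest_body and build_query_url are hypothetical helper names, and the sketch only builds the request data without contacting a server.

```python
from urllib.parse import urlencode

API_ROOT = "http://localhost:8080"  # demo api.py address used in the steps above


def build_ingest_body(fp_code, length, codever, track_id=None, **optional):
    """Return the form-encoded body for a POST to /ingest."""
    fields = {"fp_code": fp_code, "length": length, "codever": codever}
    if track_id is not None:
        fields["track_id"] = track_id
    # artist, release and track are the optional fields listed above.
    for name in ("artist", "release", "track"):
        if name in optional:
            fields[name] = optional[name]
    return urlencode(fields)


def build_query_url(fp_code):
    """Return the GET URL for /query with the packed code as fp_code."""
    return API_ROOT + "/query?" + urlencode({"fp_code": fp_code})


body = build_ingest_body("eJx1W...", 300, "4.12", track_id="thisone")
url = build_query_url("XXX")
print(body)
print(url)
```

    Encoding the fields with urlencode keeps the packed code safe to send even if it contains characters that need escaping.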
    

Generating and importing data

  1. Download and compile the echoprint-codegen

  2. Generate a list of files to fingerprint

     find /music -name "*.mp3" > music_to_ingest
    
  3. Generate fingerprint codes for your files

     ./echoprint-codegen -s < music_to_ingest > allcodes.json
    
  4. Ingest the generated json.

     python fastingest.py [-b] allcodes.json
    

    The -b flag creates a file named bigeval.json that can be used to evaluate the accuracy of the fingerprint and server (see below)

The fastingest script is very memory intensive. For large dump files you may run out of memory while processing them. If this is the case, then you can split the dumps into smaller chunks using the splitdata.py script:

python splitdata.py ~/Downloads/echoprint-dump*.json

This will create 5 new dump files: input-1.json, input-2.json, and so on. Import each of them with fastingest as above.
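
The idea of splitting a dump can be sketched as follows. This is an illustration only: it assumes the dump is a JSON list of records that has already been loaded, and split_records is a hypothetical function that does not reproduce the actual logic of splitdata.py.

```python
def split_records(records, parts=5):
    """Split a list of records into `parts` nearly equal consecutive chunks."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    size, extra = divmod(len(records), parts)
    chunks = []
    start = 0
    for index in range(parts):
        # The first `extra` chunks take one additional record each.
        end = start + size + (1 if index < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


chunks = split_records(list(range(12)), parts=5)
print([len(chunk) for chunk in chunks])
```

Each chunk could then be written to its own file and ingested separately, which keeps the memory needed per run smaller.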

Using the community data

Fingerprint data is publicly available under the Echoprint Database License. If you want to use this data, you can download it from http://echoprint.me/data/

Use the fastingest.py tool to import this data as above:

python fastingest.py [-b] ~/Downloads/echoprint-dump*.json

You can run fastingest many times on one or more machines, as long as you update the configuration information for Solr and Tokyo Tyrant in fp.py.

Evaluating fingerprint accuracy

We provide an evaluation tool, bigeval, that can be used to test the accuracy of the fingerprint and server.

Run bigeval.py without any arguments to get a usage statement. The following command tests 1000 random files:

python bigeval.py -c 1000

For every 10 files tested, bigeval will print out a line that looks like this.

PR 0.0875 CAR 0.9125 FAR 0.0000 FRR 0.0875 {'tn': 0, 'err-api': 0, 'fp-a': 1, 'tp': 73, 'err-codegen': 0, 'fp-b': 0, 'err-data': 0, 'total': 80, 'fn': 6, 'err-munge': 0}

This is what the fields mean:

PR           "probability of error"  a weighted measure of the overall goodness of the FP
CAR          "correct accept rate"   probability that you will correctly identify a known song
FAR          "false accept rate"     probability that you will say a song is there that is not
FRR          "false reject rate"     probability that you will say a song is not there that is
err-api      API error               # of times the API had a timeout or error
err-data     data problem            # of times our datastore had an issue (missing data is the biggest culprit)
err-codegen  codegen fail            # of times codegen did not return properly with data
err-munge    munger err              # of times the munging process (downsampling, filtering, re-encoding etc) did not generate a playable file
fp-a         false pos A             we had a false positive where the wrong song was identified
fp-b         false pos B             we said a song was there that was not actually there
tp           true pos                correct song chosen
tn           true neg                song correctly identified as not there
fn           false neg               song there but we said it wasn't
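
The rate fields can be related to the counters with a short calculation. The formulas below are inferred from the sample line above and are an assumption, not taken from bigeval.py; PR is left out because its weighting is not described here.

```python
def rates(stats):
    """Derive CAR, FRR and FAR from a bigeval counter dictionary."""
    # Queries for songs that are in the database.
    known = stats["tp"] + stats["fn"] + stats["fp-a"]
    # Queries for songs that are not in the database.
    unknown = stats["tn"] + stats["fp-b"]
    car = stats["tp"] / known if known else 0.0
    frr = (stats["fn"] + stats["fp-a"]) / known if known else 0.0
    far = stats["fp-b"] / unknown if unknown else 0.0
    return car, frr, far


sample = {"tn": 0, "err-api": 0, "fp-a": 1, "tp": 73, "err-codegen": 0,
          "fp-b": 0, "err-data": 0, "total": 80, "fn": 6, "err-munge": 0}
car, frr, far = rates(sample)
print("CAR %.4f FAR %.4f FRR %.4f" % (car, far, frr))
```

With the counters from the sample line, these formulas give the same CAR, FAR and FRR values that the line reports.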

If an error occurs during matching, a message describing the error will be printed. Use the -p flag to print extra information about the scores obtained from Solr when an error occurs, to see how the server is choosing its winner. Use -1 file to test a single file and print its score information.

A number of munge parameters are available to bigeval. These parameters alter the input file before generating a fingerprint, to simulate noisy signals. Run bigeval.py --help to see the available options. These options require mpg123 and ffmpeg to be installed.

You can test for true negatives by creating a list of tracks that you know are not in the database:

find /new_music -type f > new_music

Name the file new_music and put it in the same directory as bigeval.py.

Notes

  • You can run Echoprint in "local" mode which uses a python dict to store and index codes instead of Solr. You can store and index about 100K tracks in 1GB or so in practice using this mode. This is only useful for small scale testing. Each fp.py method takes an optional "local" kwarg.
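
To illustrate what a dict-backed index can look like, here is a toy sketch. It is not the implementation in fp.py: LocalIndex and its scoring (counting shared codes) are hypothetical and ignore time offsets entirely.

```python
from collections import Counter, defaultdict


class LocalIndex:
    """Toy in-memory index: maps each hash code to the tracks containing it."""

    def __init__(self):
        self._tracks_by_code = defaultdict(set)

    def ingest(self, track_id, codes):
        for code in codes:
            self._tracks_by_code[code].add(track_id)

    def query(self, codes):
        """Return (track_id, shared_code_count) pairs, best candidate first."""
        hits = Counter()
        for code in set(codes):
            for track_id in self._tracks_by_code.get(code, ()):
                hits[track_id] += 1
        return hits.most_common()


index = LocalIndex()
index.ingest("track-a", [101, 102, 103, 104])
index.ingest("track-b", [103, 104, 105])
results = index.query([101, 102, 103])
print(results)
```

Because everything lives in process memory, an index like this is convenient for small tests but grows with every ingested track.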

echoprint-server's People

Contributors

alastair, bwhitman, jacobvosmaer, sophiebits

echoprint-server's Issues

Queries take tremendous amount of time (4-15 seconds) average 6 seconds

Hello,

We have implemented echoprint and ingested 600,000 songs (full length) into the database.
The Tokyo database size is 65GB and the SOLR database size is 20GB.
Unfortunately queries take a tremendous amount of time. We tried to optimize the solr database but the query time didn't improve.

Example query:

INFO: [fp] webapp=/solr path=/select params={echoParams=none&fl=track_id,score&q=909794+913+1003402+913+303386+913+584877+913+232554+913+956476+913+431834+950+679300+950+240931+950+955331+950+995773+950+357692+950+593061+976+70763+976+782224+976+782173+976+622726+976+601533+976+1011732+1027+796909+1027+763780+1027+566013+1027+753588+1027+312100+1027+193929+1051+316240+1051+332598+1051+19627+1051+692188+1051+430606+1051+281612+1164+845652+1164+397749+1164+200843+1164+468276+1164+764632+1164+527099+1181+731221+1181+394872+1181+947331+1181+160401+1181+800128+1181+1018565+1203+470445+1203+242926+1203+616082+1203+339907+1203+566977+1203+5230+1258+280182+1258+156696+1258+48596+1258+673917+1258+210986+1258+957346+1296+859097+1296+32268+1296+153048+1296+447624+1296+425653+1296+328097+1347+297986+1347+888504+1347+658455+1347+998338+1347+755650+1347+127671+1425+539061+1425+152124+1425+1932+1425+490734+1425+259543+1425+777147+1489+143020+1489+532966+1489+332011+1489+610997+1489+294746+1489+531412+1566+705565+1566+891476+1566+994477+1566+398781+1566+896153+1566+868163+1630+773449+1630+836516+1630+142472+1630+1010289+1630+163754+1630+684193+1682+803149+1682+657239+1682+732748+1682+732748+1682+732748+1682+94257+1770+732748+1770+732748+1770+732748+1770+732748+1770+732748+1770+327016+898+323107+898+1036138+898+636940+898+395567+898+275946+898+317417+926+376363+926+890914+926+705682+926+636726+926+202564+926+845159+950+728869+950+828839+950+94394+950+423738+950+701545+950+887760+1038+308181+1038+263679+1038+914857+1038+354865+1038+624759+1038+316618+1116+751629+1116+130128+1116+896612+1116+848012+1116+926584+1116+553706+1181+919921+1181+782385+1181+1037873+1181+800983+1181+920557+1181+528646+1258+712675+1258+37742+1258+204631+1258+708609+1258+803397+1258+601702+1348+821948+1348+279844+1348+46205+1348+933870+1348+770909+1348+959027+1412+857557+1412+948365+1412+325558+1412+411016+1412+269648+1412+442965+1578+949268+1578+1018985+1578+471072+1578+303080+1578+23059+1578+735935+1682+10453
6+1682+889609+1682+288667+1682+765904+1682+1047370+1682+387943+1770+36761+1770+269789+1770+823578+1770+528495+1770+274497+1770+398917+1849+720708+1849+444872+1849+24326+1849+347110+1849+554892+1849+883660+1989+300268+1989+112239+1989+1047381+1989+171935+1989+525106+1989+150300+2014+775632+2014+1039872+2014+66747+2014+66747+2014+66747+2014+1044341+2079+66747+2079+66747+2079+66747+2079+66747+2079+66747+2079+98556+898+746599+898+819520+898+625029+898+739693+898+514987+898+292281+950+923614+950+1016085+950+799882+950+8204+950+48660+950+317089+1027+933171+1027+496108+1027+83259+1027+426712+1027+8204+1027+727312+1103+76705+1103+565244+1103+229512+1103+276642+1103+621608+1103+402455+1131+993209+1131+142593+1131+664342+1131+591259+1131+826795+1131+294290+1181+474509+1181+2220+1181+276642+1181+115336+1181+312690+1181+841782+1258+617667+1258+163360+1258+790518+1258+833952+1258+217401+1258+420326+1411+581324+1411+933171+1411+973988+1411+678469+1411+142734+1411+689071+1437+901208+1437+163616+1437+401241+1437+529431+1437+919267+1437+801471+1488+235059+1488+614337+1488+670384+1488+446479+1488+685010+1488+493673+1566+947773+1566+533214+1566+252420+1566+742507+1566+625488+1566+581324+1604+457499+1604+617667+1604+177010+1604+628783+1604+115336+1604+819520+1630+260405+1630+384528+1630+986218+1630+811808+1630+183164+1630+292281+1758+83259+1758+8204+1758+747297+1758+790518+1758+158959+1758+933171+1835+83259+1835+590701+1835+747297+1835+237128+1835+504687+1835+623755+1912+1015745+1912+187178+1912+907455+1912+907455+1912+907455+1912+933171+1989+907455+1989+907455+1989+907455+1989+907455+1989+907455+1989+644707+898+355989+898+262809+898+901066+898+197301+898+846082+898+97489+950+809516+950+547829+950+206587+950+923296+950+361389+950+84536+1026+192110+1026+241444+1026+356019+1026+547829+1026+979165+1026+465926+1103+46194+1103+367139+1103+273667+1103+537071+1103+874329+1103+46673+1180+175512+1180+46194+1180+1036550+1180+315117+1180+803334+1180+839228+1202+349948+1202+88536+1202+262368+1202+
668559+1202+36704+1202+648336+1258+308864+1258+10297+1258+536353+1258+918294+1258+54691+1258+874204+1335+272900+1335+97489+1335+171709+1335+356019+1335+547829+1335+646927+1378+351318+1378+1016527+1378+688694+1378+679940+1378+930878+1378+84536+1412+868276+1412+433288+1412+948093+1412+180168+1412+16059+1412+299279+1488+2843+1488+904122+1488+91558+1488+213518+1488+991955+1488+334613+1566+827953+1566+730944+1566+5278+1566+36280+1566+11265+1566+84536+1682+597540+1682+115352+1682+940566+1682+929096+1682+678580+1682+392847+1758+335563+1758+423247+1758+209740+1758+373538+1758+727531+1758+325153+1835+254323+1835+704166+1835+1026159+1835+1026159+1835+1026159+1835+931697+1875+1026159+1875+1026159+1875+1026159+1875+1026159+1875+1026159+1875+413450+899+827430+899+377816+899+342246+899+1001758+899+602740+899+546034+950+190166+950+1026816+950+997142+950+525328+950+684191+950+499217+1026+910321+1026+42130+1026+451494+1026+749446+1026+946414+1026+326065+1056+55811+1056+56244+1056+950756+1056+911180+1056+587607+1056+231252+1163+809979+1163+936140+1163+544586+1163+912950+1163+944283+1163+912546+1180+559888+1180+751143+1180+1026461+1180+52526+1180+676102+1180+599902+1258+264487+1258+129208+1258+338714+1258+206006+1258+351953+1258+536701+1283+98059+1283+31802+1283+925981+1283+418261+1283+224128+1283+25289+1349+870120+1349+495037+1349+172361+1349+492381+1349+425795+1349+744587+1413+61147+1413+777664+1413+1025932+1413+780273+1413+477990+1413+648544+1482+226590+1482+524066+1482+926283+1482+1017760+1482+932879+1482+373392+1566+45268+1566+650641+1566+30935+1566+577205+1566+274517+1566+345040+1605+200688+1605+440760+1605+49828+1605+189296+1605+605412+1605+633776+1631+658728+1631+988114+1631+609538+1631+191745+1631+969433+1631+958651+1758+379088+1758+656887+1758+704445+1758+381911+1758+736590+1758+134776+1791+643598+1791+284679+1791+1040459+1791+240343+1791+397235+1791+50740+1849+86300+1849+778393+1849+393888+1849+521884+1849+589014+1849+633526+1989+479631+1989+428713+1989+707586+1989+707586+1
989+707586+1989+950298+2066+707586+2066+707586+2066+707586+2066+707586+2066+707586+2066+567370+898+856052+898+417830+898+607505+898+575141+898+1023879+898+198147+949+906970+949+852357+949+410162+949+519528+949+360064+949+840947+1026+86416+1026+852357+1026+410162+1026+519528+1026+821038+1026+937532+1103+580606+1103+419146+1103+417963+1103+1004729+1103+1046870+1103+937532+1180+994180+1180+788786+1180+417963+1180+1004729+1180+838045+1180+855512+1258+994180+1258+319803+1258+417963+1258+519528+1258+360064+1258+198147+1335+86416+1335+852357+1335+397752+1335+22433+1335+931971+1335+840947+1411+413342+1411+635013+1411+1021675+1411+11367+1411+862073+1411+402613+1488+233303+1488+21159+1488+520725+1488+145256+1488+913842+1488+887192+1566+998048+1566+328388+1566+589346+1566+526930+1566+182214+1566+74333+1629+423390+1629+417830+1629+475237+1629+502441+1629+56572+1629+198147+1682+410162+1682+519528+1682+555287+1682+45885+1682+519824+1682+86416+1758+410162+1758+1046870+1758+555287+1758+956708+1758+212352+1758+788786+1835+1004729+1835+821038+1835+755198+1835+755198+1835+755198+1835+840947+1989+755198+1989+755198+1989+755198+1989+755198+1989+755198+1989+522420+912+782348+912+743398+912+659944+912+532858+912+645434+912+418126+950+244147+950+480019+950+793161+950+485121+950+1019099+950+58179+1026+147904+1026+261231+1026+1004888+1026+212699+1026+55080+1026+55623+1050+46122+1050+157427+1050+736076+1050+531490+1050+272849+1050+217998+1103+352202+1103+817059+1103+466481+1103+648026+1103+599913+1103+285160+1142+769579+1142+98482+1142+274099+1142+592962+1142+915113+1142+817059+1180+310085+1180+993131+1180+316164+1180+653951+1180+48965+1180+899550+1258+627427+1258+447399+1258+834624+1258+484882+1258+727121+1258+189081+1335+94272+1335+481689+1335+345585+1335+1045346+1335+726652+1335+88655+1437+989447+1437+614883+1437+30805+1437+234430+1437+643247+1437+249221+1489+605851+1489+19438+1489+160094+1489+352315+1489+829851+1489+798933+1566+866675+1566+158715+1566+529925+1566+801105+1566+803628+1566+5
37603+1682+87298+1682+360185+1682+107280+1682+911687+1682+29681+1682+602499+1771+76392+1771+779655+1771+85794+1771+961288+1771+643938+1771+810763+1835+566764+1835+668780+1835+412470+1835+412470+1835+412470+1835+501111+1989+412470+1989+412470+1989+412470+1989+412470+1989+412470+1989+225288+898+110450+898+34699+898+377652+898+297841+898+727573+898+882726+950+607266+950+801232+950+84196+950+1004925+950+874779+950+233656+1026+607266+1026+292406+1026+84196+1026+126734+1026+401335+1026+581373+1103+946649+1103+292406+1103+932103+1103+386298+1103+685425+1103+581373+1181+406763+1181+986394+1181+713214+1181+172253+1181+211470+1181+866441+1258+778577+1258+781727+1258+713214+1258+721558+1258+874779+1258+1027650+1335+496259+1335+801232+1335+469418+1335+964605+1335+620965+1335+820042+1358+136096+1358+891277+1358+640040+1358+564206+1358+809061+1358+117152+1488+362914+1488+43642+1488+334647+1488+828162+1488+580511+1488+422394+1566+845017+1566+265278+1566+257711+1566+609593+1566+738605+1566+687628+1622+365286+1622+1017500+1622+767586+1622+178714+1622+377700+1622+310182+1682+233656+1682+634147+1682+607266+1682+781727+1682+960386+1682+921346+1758+1027650+1758+677987+1758+496259+1758+946649+1758+292406+1758+518035+1782+836578+1782+820042+1782+436089+1782+772516+1782+577434+1782+233656+1835+607266+1835+157632+1835+84196+1835+172253+1835+401335+1835+677987+1912+778577+1912+292406+1912+1038773+1912+718850+1912+643934+1912+233656+1989+696335+1989+8780+1989+809256+1989+809256+1989+809256+1989+409865+2066+809256+2066+809256+2066+809256+2066+809256+2066+809256+2066&qt=/hashq&wt=standard&rows=30&version=2.2} hits=5923078 status=0 QTime=6849

Are we doing anything wrong here?

Sincerely,
Daniel.

Error in fp.py

Hi all

I have been playing around in echoprint land and may have stumbled on a bug. I forget where I got the audio sample but it looks like there is a coding error in fp.py. Here is the error:

Traceback (most recent call last):
File "./test_break.py", line 26, in
reply = fp.best_match_for_query(fp_code, local=local)
File "/home/gomez/django/vidbip/r2d2/API/fp.py", line 233, in best_match_for_query
meta = metadata_for_track_id(trackid)
UnboundLocalError: local variable 'trackid' referenced before assignment

Here is the code I used to trigger it (I used local mode, but it also happens with Solr). I was running this test one level above the API directory.

import API.fp as fp
import re

local = True

fp_code = 'eJy1mEuSZKuORacEAiQxHBAw_yG8hWd1KswCb4TdjswzOR-OtH9ESsUsPUr3V5nPcsaj1DxfpTzLOK8S6VFaklcp5VXas0x7ldMfRct4FW-PYjW_isujfJn-8FeJ_iprPUqV-Sp3Ur-X5_S_TTC9ym3n72Xqo3yZYJuv8pyvcv_vxUp-lSqvEvYoRZ_ljRwfr_LG1R6P8ifk6H6Vt6qc-ShfUGf1Vd66ccn0a_mCOotX-cD2t2L7Vb5NP17lPf23p7w150_IWa9i-1W--FF_lG_Iaa_yxXH8Ub5ho7zKOY_yH2LjL_NVf5W_eMoXdr9c48uM3uwe-1X-MqO_MHSNV_nTBO1V3hxc-iittlf5i-_X_Shs-1X-ktx8vcp_yLLnjFZ6lFb1VZ5a92UK2l5llFeZ-1WeLHsn8z9N8MuMyqs0eZU_sSy_ynmVLxxc8Siq5VX-swm-GWrRfi1Vz7S9vFk5M06vpx3z00_0zYBPaeOzulOM8VkNs9o70511me7PbZ_VlTQFJtZDgbN-nhxlRY6y8zxLK0bUfY6xxTzWNvFC_B8z9pm6mwU2VdNa-bRYrZPwQhcPX0V78W3LLKSZ9dMHhwqfe_ctXbIdjUHyurvSNK0Vi-6yurfPrkJPSSFRtO1y2TRExjh-pDUVXpfitBJ7T40U98Z_753mPXkakQSZLKWsvbKcxqcH3Sg978rzNOd2_vXv6JruvaaYkazZTMK_6pi2euwkudg9o39W73Itqe7cNM8pvaB7MhBdnav2dNhImB7fcwmbGyv35muxR1cuZzy0pffR56ex__b8_1dr5PP7ar8tV_PCF_fMu3fl87T1ba7cuKszq8O2zjprtBE9hzi7MrdgaLKK7TZG7J56YRS92_IxcumW-CbdUav0VlRHm0Mx33Zay2ejtMa1uWyv56gUqSBu1WNHQrRUoDTr9nBPW5jfzpUnmw1tJctSWy3ryr53LsfKOtLrjCm2aVjqIjXrxasXgFcLw3G5oF_gYEkCOjTiXg7gd7BTVWBJVGKdYbc0TzNasWRk3lAk8mzTxf1SLgZHNXaWIwvA7G7gRGYGiMdRih6F2bLfNvru04GetjXuo3rNncZeBnjSUKbcyjAuOwRxOUPSFPpJr0dWkZ5yy3zenGtmH7pXsBH1uLziWQCbmYHklcHoKIuB2ZCpuQBILy2BT4Ub8WM1neP26-pxROtIDddo85wOGX2XGC6apNQSs07RzM_e0kl9pyxWD7w-l28rZ3Z4PbWPKbzqwlf74Nd9gPzfOPSiq3ieUqF-3qPnGo0u52Hs_9-q7UtOA2CowtqDByVp3jX72fTcKdBjA7h88qJPbatDumJdKlq0yATABVCDKr0JH0ygIArgx9lhpxTGNHXk0RHFJcXSyDVl0znYQ4a95eyrl53faVtibFYXnSk2mW7d9MG4IJ_Pk9EdAJAcxk5p9Lci40fuNpoWzcY275MnBEE8FNqzrzbgVb1_juHbyqa5ucMT2nZ3v_Toqd0z9PUBc1DFMACu0xHFm5p2v9xzNDKXurudA4APe6rG-zg0ic28RgaYiNPqBYZmqPpj1V3b-XVVkFskMHlewRSutzH06i0O2lBOn8Nnc6ACeu5fgHxJFcZTc6tpbByjHskTedExR01zFzgx2C907hDjXLKfLbPMfv-ykTe-AdPKUmT_3gE_wWLtIG3xklbX2sxalsWaEx1yRj_SAh0wbdIDiGx2RhSZTLPVvJmrRuFDkbGSVr3CzNgqrZddBAgEl_cIFBjprXQD-oMAq7QfwMVAUUHinbnTzbZP9YW0sO8aC2FIeOJGda-9TYQL_zrEyJIP8orAFFSqy4RefMUB5VWuQXWmG5Wm8i14Vfo8r2ClMQRRnUwZCpaNI1qt6BySi4Ccz23rGjhKgYHd2ypqoX0hlhypY6WVELNydbWLIjgJCVlyjKHR3BELz5axhollQ33Wj9WUO6bz26qxhbMddcWgMT2o7AnHhZJQOdKZZ
zrBkg-eA8EwhSqL9IwqgMqty7bxlT5lGIYTvd0_1EJMWgaBUaiJqDamOROk2PQsXzW95j7kE1_ATYZMPANGjG0ADwv8UOkgVUzbGntk-3KF6grOSI43jYKOdkSJ4EHQkMDz8P3m6fPeAk7ts6vPxYE8I529q86oc4OFq4Mj9YVWy5BMXjgIWYBYdJ6dsYn4PJ7gA248p_N5XsnlMut-ETf0SWLqwWsBnV2T4UU6oTgZInKSAUBFJrtkqeLU_mP189BfV4_e2JXipp-qpYwF1lK_v_yqdnMTduCf6zJuWbz1uxv0su288iQJxee25H1eJI5FaLA26RMd052HO8IGraxsZnvu5Qg24OD9CMKewIMAim0Zzit3zKBZD6ZLgNQLFRdUa9G4lq65H6ZEzOR2QEeGI-K1jReCdwxizZZn7odUuuae-5AiSFYLJZVcr5Vz6iA2Ei9oCWkjk0VINolgU3HyA9EmWgAt_JoY8dUSulHXQThuVgHCRE6EmVuRfTVNAx8ovQXQGtoR1DqQBjov92ANWK6lkT7YTb85oaarNJkABYMuN72DfDqu9cfq2KjFr6sdO6UTxFa8WRJXK6nK92GlkRVAvi3mw7kytzQQY0jDaHgKmaDadIiI5C3ZEQ0U8iyCaYU3WGJtRpZAnCZydqE8DT0Bz1ADg0X_9t4OWQru6VmzamA7RMQxcaKMfQlb814QCNZIjy28XNMgaOUgWgEkTxMVx0y4D5WMK5gb_5xeF_o78A9EtRGFJ8KB_o89cxwbZWOM17rJP2yYbILJ4VMnV1K4CSErIQjW4dEH-qGllb0GsZpZ4Iv0ynRIgOi-7xRvrxiIbckbwGBfxNlJGgXbnH6sjsGjYfLENRFPHDU53wB6FTG4LtXSj1W_IvTrqmKHiRxh6BAHFMlAEMsBkEpiSW3feAaSFLODO613qMJOSN1ZS0cx2DDRCllyHatwPSI1rtiuFvxP0LsgeWOwmWhg6VKWKUEimKEEnKPIkKFfMB1Jpec2YWe5gfIebCZ-diNEP0bywEhbQZPTvGm3EbJBp974CeqUQwFGvpk_kYfNIv08YJBYs3P46zfKE2b3Nd7L7WEfj1wHc9V-kVGJAxnXr8Q2UkXF-BA4FJyPE9yEZyMDN23hZ5z7UNIQHCcwjbyDlLLb4ihE7Nu0LDBo3ojoYwYE79luz8AtpwolMAPeDleII7jpECIA_Jqc-IDOuPgkxH3ulc4JTzgrd1lAlFONMR_UjMMix7PUGttCeX6stmtuv67-D1yy0jk='
codever = 4.1
length = 300

reply = fp.best_match_for_query(fp_code, local=local)

if re.match('[A-Za-z\/\+\_\-]', fp_code) is not None:
    code_string = fp.decode_code_string(fp_code)
    if code_string is None:
        print 'WARNING'

track_id = fp.new_track_id()
data = {"track_id": track_id, 
        "fp": code_string,
        "length": length,
        "codever": codever }

fp.ingest(data, do_commit=True, local=local)

reply = fp.best_match_for_query(fp_code, local=local)

I could be wrong but I think this line:

meta = metadata_for_track_id(trackid)

needs to be changed to this line:

meta = metadata_for_track_id(top_track_id)
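
The kind of failure reported here can be reproduced in isolation. The functions below are hypothetical stand-ins, not the code from fp.py; they only show how a name that is bound on some paths but not others leads to UnboundLocalError, and how using a name that is always bound avoids it.

```python
def lookup_buggy(candidates):
    """Mimics the reported failure: `trackid` is only bound inside the loop."""
    top_track_id = "fallback-id"
    for trackid in candidates:
        top_track_id = trackid
    return "metadata for " + trackid  # fails when `candidates` is empty


def lookup_fixed(candidates):
    """Uses the name that is always bound before this point."""
    top_track_id = "fallback-id"
    for trackid in candidates:
        top_track_id = trackid
    return "metadata for " + top_track_id


try:
    lookup_buggy([])
except UnboundLocalError as error:
    print("buggy version failed:", error)

print(lookup_fixed([]))
```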

problem with best_match_for_query

Dear all,
Can someone explain to me what I am doing wrong?

I am trying to build my own database of fingerprints, on which I want to evaluate the audio identification performance under some specific degradations of the signal.

I have an audio file named : audiofile
I compute the fingerprint of this file using the entire song:

           res = song.util.codegen(audiofile, start = -1, duration = -1)

using the function contained in dedup by lamere I extract trid, raw_code and ingest_data using:

            trid, raw_code, ingest_data = dedup.parse_json_block(res) 

then I ingest the new data:

            fp.ingest([ingest_data], do_commit = True, local = True)

and I add the new song to my dictionary named done:

            done[audiofile] = trid

Now I want to run a test and identify the same song (as a first step). What I really want to do is retrieve the song given a short excerpt of it, but I have not managed to do that.

When I call :

querykey = song.util.codegen(audioexpert,start = -1, duration = -1)    
trid, raw_code, ingest_data =  dedup.parse_json_block(querykey)
response = fp.best_match_for_query(raw_code, local = True)

I have

response.match() = False :(

What is wrong with what I am doing? I am not even trying to recognize a modified version of the song.

I have one last question: if I compute the codegen using the entire song, is the system supposed to identify the song when I query with a short excerpt of it?

Thank you for your help,
Lila

Get Metadata from local echoprint-server

Hi,

I am using echoprint with a local echoprint-server and I am doing these steps:

  1. ./echoprint-codegen SympathyForTheDevil.wav > wav/SympathyForTheDevil.json

The json file is like this:
[
{"metadata":{"artist":"The Rolling Stoner", "release":"", "title":"Sympathy for the Devil", "genre":"Rock", "bitrate":1411,"sample_rate":44100, "duration":356, "filename":"SympathyForTheDevil.wav", "samples_decoded":3924992, "given_duration":0, "start_offset":0, "version":4.12, "codegen_time":1.072712, "decode_time":2.966706}, "code_count":9030, "code":"eJzcvQ.....", "tag":0}
]

  2. In my folder /echoprint-server/util: python fastingest.py /home/eugenio/echoprint/echoprint-codegen-realtime/wav/SympathyForTheDevil.json

  3. With Chrome Advanced Rest Client, I am running this query: "http://localhost:9090/query?fp_code=eJzcvQu...."

And I am receiving this result:

{
total_time: 1365
score: 1575
ok: true
query: "eJzcvQu....."
message: "OK (match type 6)"
qtime: 98
match: true
track_id: "TRVDZVT15013A15382"
}

What should I do to also get the metadata from my local server?

Error in fp.py

in current build I am getting

Traceback (most recent call last):
File "./lookup.py", line 21, in ?
import fp
File "/usr/lib/echoserver/API/fp.py", line 115
with solr.pooled_connection(_fp_solr) as host:

when I look at _fp_solr = solr.SolrConnectionPool("http://localhost:8502/solr/fp")

HTTP ERROR: 404
NOT_FOUND
RequestURI=/solr/fp/

but I can go to "http://localhost:8502/solr/fp/admin"

Query example app

There should be an example lookup program like in the codegen project that takes a file, FPs it, and looks up in the server.

Ingesting the same song twice won't return results at lookup

Has anyone tried ingesting the same mp3 file twice ?

When I ingest it once, if I query for a fragment of it, it returns the correct track id

When I ingest it again, and use the fp utility for querying, it won't return any results.

I tried to do some debugging on it and noticed that in return for the query, the script receives both track ids, but at some point near the end, it discards them both and returns no result.

Does anyone know what's going on?
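
One possible explanation, which this report does not confirm, is that the matcher rejects a result when the two best candidates score almost the same, as two copies of one song would. The sketch below illustrates such a rule; pick_winner and its margin_ratio threshold are hypothetical and are not taken from fp.py.

```python
def pick_winner(scores, margin_ratio=0.5):
    """Return the best track id, or None when the runner-up is too close.

    `scores` maps track id -> score. `margin_ratio` is a hypothetical
    threshold: the winner must beat the runner-up by that share of its score.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    (top_id, top_score), (_, second_score) = ranked[0], ranked[1]
    if top_score - second_score < top_score * margin_ratio:
        return None  # ambiguous: two candidates score almost the same
    return top_id


print(pick_winner({"copy-1": 120}))
print(pick_winner({"copy-1": 120, "copy-2": 118}))
```

Under a rule like this, a single ingested copy is returned, while two near-identical copies cancel each other out.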

example in docs doesn't work b/c of new fields

>>> fp.ingest({"my_track_id":"123 40 123 60 123 80 123 90 123 110 123 130"})
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "fp.py", line 497, in ingest
   raise Exception("Missing required fingerprint parameters (track_id, fp, length, codever")
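
A quick check of a record against the four parameters named in the exception message can catch this before calling ingest. This is a minimal sketch; missing_fields is a hypothetical helper and the field names are taken from the message above.

```python
REQUIRED_FIELDS = ("track_id", "fp", "length", "codever")


def missing_fields(record):
    """Return the required ingest keys that `record` does not provide."""
    return [name for name in REQUIRED_FIELDS if name not in record]


old_style = {"my_track_id": "123 40 123 60 123 80 123 90 123 110 123 130"}
new_style = {"track_id": "my_track_id",
             "fp": "123 40 123 60 123 80 123 90 123 110 123 130",
             "length": "120", "codever": "4.12"}

print(missing_fields(old_style))
print(missing_fields(new_style))
```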

Periodic and seemingly random errors returned from TokyoTyrant on misc getlist commands

At random points in time we are getting errors back from Tyrant on an Echoprint query. The error codes include 49, 54, 32, 57, 53, 51 and others. Here is the stack trace that accompanies the errors:

Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/web.py-0.37-py2.7.egg/web/application.py", line 239, in process
return self.handle()
File "/usr/local/lib/python2.7/dist-packages/web.py-0.37-py2.7.egg/web/application.py", line 230, in handle
return self._delegate(fn, self.fvars, args)
File "/usr/local/lib/python2.7/dist-packages/web.py-0.37-py2.7.egg/web/application.py", line 420, in _delegate
return handle_class(cls)
File "/usr/local/lib/python2.7/dist-packages/web.py-0.37-py2.7.egg/web/application.py", line 396, in handle_class
return tocall(*args)
File "/usr/local/echoprint/API/api.py", line 83, in GET
response = fp.best_match_for_query(stuff.fp_code)
File "/usr/local/echoprint/API/fp.py", line 194, in best_match_for_query
tcodes = get_tyrant().multi_get(trackids)
File "/usr/local/echoprint/API/pytyrant.py", line 292, in multi_get
rval = self.t.misc("getlist", opts, keys)
File "/usr/local/echoprint/API/pytyrant.py", line 540, in misc
return list(self._misc(func, opts, args))
File "/usr/local/echoprint/API/pytyrant.py", line 524, in _misc
socksuccess(self.sock)
File "/usr/local/echoprint/API/pytyrant.py", line 172, in socksuccess
raise TyrantError(fail_code)
TyrantError: 57

Restarting the api.py process temporarily makes the errors go away. Does anyone have any insight into what is going on here?
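
One observation that may help narrow this down: every reported code is a printable ASCII value. That would be consistent with the client reading a byte of response payload where it expects a status byte, for example if one connection were used by overlapping requests, although nothing in this report confirms that cause.

```python
reported_codes = [49, 54, 32, 57, 53, 51]

# Interpret each reported error code as an ASCII byte.
as_text = [chr(code) for code in reported_codes]
print(as_text)
```

The codes map to the characters 1, 6, space, 9, 5 and 3, which look more like fragments of data than like protocol status values.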

Execute a backup

Is it possible to create a backup of the entire system (fingerprints, Solr database, etc.)?

Return metadata from best_match_for_query

We currently return a response object with a TrackID but no other metadata. If we're storing it, we should return it.
Also, provide a method to get this given a track id

Bug at Line 482 in fp.py

I think the number 43.42 at this line is supposed to be 23.2 because the IOI is quantized to 23.2 ms. As it is right now, the code will not detect correctly which sub track id (track_id_i) has the highest score.
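
The arithmetic behind this report can be checked directly. The sketch assumes, as the report states, that one time unit is 23.2 ms; the variable names are illustrative only.

```python
UNIT_MS = 23.2  # time quantization stated in this report, in milliseconds

units_per_second = 1000.0 / UNIT_MS
# 60 seconds expressed in 23.2 ms units.
segment_with_23_2 = 60 * 1000.0 / UNIT_MS
# The same expression with the constant quoted in this report.
segment_with_43_42 = 60 * 1000.0 / 43.42

print("units per second: %.2f" % units_per_second)
print("60 s with 23.2:   %.1f units" % segment_with_23_2)
print("60 s with 43.42:  %.1f units" % segment_with_43_42)
```

Under that assumption there are roughly 43 units per second, so dividing milliseconds by 43.42 yields a segment of about half the intended length.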

60 seconds segment length issue

Hi all,

I'm trying to understand the source code of fp.py, and I think you are using inconsistent formulas to compute a 60-second segment length in time units.

In the first case, we can retrieve the sixty seconds segment length by multiplying 60 seconds with 43.45 as shown in line 134:

def cut_code_string_length(code_string):
    """ Remove all codes from a codestring that are > 60 seconds in length.
        Because we can only match 60 sec, everything else is unnecessary """
    sixty_seconds = int(60.0 * 43.45 + first_timestamp)

But in the second case (see line 482) we use another computation to get the same parameter (segmentlength) :

    def split_codes(fp):
        """ Split a codestring into a list of codestrings. Each string contains
            at most 60 seconds of codes, and codes overlap every 30 seconds. Given a
            track id, return track ids of the form trid-0, trid-1, trid-2, etc. """

        # Convert seconds into time units
        segmentlength = 60 * 1000.0 / 43.45

Maybe I am wrong but I think the second formula is not correct. The right formula should be:

    segmentlength = 60 * 43.45

If I am wrong, please explain the meaning of the second formula; otherwise, please fix this issue.
Thank you very much.
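
To make the discrepancy concrete, the two expressions can be evaluated side by side. This is only an arithmetic sketch: the helper names are hypothetical and not part of fp.py, and which reading of 43.45 is intended (time units per second, or milliseconds per time unit) is an assumption in each case.

```python
# Hypothetical helpers, not part of fp.py; they only evaluate the two
# expressions quoted above for a 60-second segment.
CONSTANT = 43.45  # the constant that appears in both quoted formulas


def units_if_constant_is_rate(seconds, units_per_second=CONSTANT):
    """Segment length in time units if 43.45 means 'time units per second'."""
    return seconds * units_per_second


def units_if_constant_is_period(seconds, ms_per_unit=CONSTANT):
    """Segment length in time units if 43.45 means 'milliseconds per time unit'."""
    return seconds * 1000.0 / ms_per_unit


if __name__ == "__main__":
    print(units_if_constant_is_rate(60))    # form quoted from line 134
    print(units_if_constant_is_period(60))  # form quoted from line 482
```

The first form gives roughly 2607 units and the second roughly 1381, so the two call sites only agree on what "60 seconds" means if they work in different time units.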

the 5 fields in schema

the schema needs to have

  • track_id
  • fp_code
  • artist_name
  • track_title
  • release_title

Right now it just has the first two, and the example API should post those to the db.
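
As a rough illustration of what posting all five fields could look like on the Python side, the sketch below builds one document as a plain dict and rejects it if any field is missing. The function name and the validation behaviour are assumptions for illustration; it does not call solr.py or reflect the current example API.

```python
# Field names are taken from the list above; everything else is hypothetical.
REQUIRED_FIELDS = ("track_id", "fp_code", "artist_name",
                   "track_title", "release_title")


def build_track_document(**fields):
    """Return a dict holding exactly the five schema fields, or raise ValueError."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing fields: %s" % ", ".join(missing))
    # Copy only the known fields so stray keys never reach the index.
    return dict((name, fields[name]) for name in REQUIRED_FIELDS)
```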

Ingest live stream from remote server?

Hi, I was wondering if it's possible to use echoprint with a remote mp3 internet radio stream or on the radio server itself with an audio card?

Thanks so much,
Matt

Problem Accuracy echoprint-server

Hi Team,

Well, I've installed echoprint-server on Ubuntu, but I am facing a problem with recognition accuracy.

For example, I've downloaded the following sound:
http://www.youtube.com/watch?v=_-0MXklxHlQ in 11025Hz, Mono, 16 bits. I generated an echoprint code with codegen (40 secs) and then uploaded it successfully to my echoprint server.

After that, I downloaded a new sound (another video but with the same sound):
http://www.youtube.com/watch?v=XtS22a-UrA8 also in 11025Hz, Mono, 16 bits. Then I generated an echoprint code with codegen (40 secs) and sent a GET request to my server, but it doesn't match :(

Then I ran one last test: I generated an ENMFP code from the second sound (about 70 secs) and sent the fp_code to the Echo Nest API server, which detected the artist and the track name.

So how can I improve the accuracy? I want to set up an echoprint-server and recognize sounds.

Could you please shed some light here?

Thanks in advance

An exception while integrating Hashr.jar into a collection in Solr

When I'm trying to add Hashr.jar into a collection's solrconfig.xml, the server gives the following warning:

3200667 [qtp16504854-12] WARN org.eclipse.jetty.servlet.ServletHandler – Error for /solr/admin/cores
java.lang.VerifyError: class com.echonest.knowledge.hashr.HashAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;

How can I fix this error?

Thanks in advance.

What's the reason for the socket.error?

(project) [holens@localhost API]$ python ../util/fastingest.py allcodes.json
1/1 allcodes.json
Feb 05, 2017 6:02:41 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[TRPMQAC15A0DB9C136-0, TRPMQAC15A0DB9C136-1, TRPMQAC15A0DB9C136-2, TRPMQAC15A0DB9C136-3, TRPMQAC15A0DB9C136-4, TRPMQAC15A0DB9C136-5, TRQZOTB15A0DB9C137-0, TRQZOTB15A0DB9C137-1, ... (8 added)]} 0 73
Feb 05, 2017 6:02:41 PM org.apache.solr.core.SolrCore execute
INFO: [fp] webapp=/solr path=/update params={} status=0 QTime=73
Traceback (most recent call last):
File "../util/fastingest.py", line 63, in <module>
fp.ingest(codes, do_commit=False)
File "../API/fp.py", line 586, in ingest
get_tyrant().multi_set(codes)
File "../API/fp.py", line 318, in get_tyrant
_tyrant = pytyrant.PyTyrant.open(*_tyrant_address)
File "../API/pytyrant.py", line 206, in open
return cls(Tyrant.open(*args, **kw))
File "../API/pytyrant.py", line 346, in open
sock.connect((host, port))
File "/home/holens/.pyenv/versions/2.7.12/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
socket.error: [Errno 111] Connection refused
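
Errno 111 from sock.connect means nothing accepted the TCP connection at the address fp.py uses for Tokyo Tyrant. A quick way to confirm that before ingesting is a plain socket probe like the sketch below. The helper name is hypothetical, and the host and port are assumptions: 1978 is Tokyo Tyrant's default port, but the address configured as _tyrant_address in fp.py may differ.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if something accepts TCP connections at host:port."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error:
        # Covers "connection refused" (errno 111 on Linux) and timeouts.
        return False
    sock.close()
    return True


if __name__ == "__main__":
    # Assumed address; adjust to match the Tokyo Tyrant server in use.
    if not port_is_open("localhost", 1978):
        print("Nothing is listening on localhost:1978; start ttserver first.")
```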

missing jetty? add it to dependencies in the readme?

using the line given in the readme to start the server e.g.

java -Dsolr.solr.home=/home/path/to/echoprint-server/solr/solr/solr/ -Djava.awt.headless=true -jar start.jar

Gives me: Unable to access jarfile start.jar

Changing the line to

java -Dsolr.solr.home=/home/path/to/echoprint-server/solr/solr/solr/ -Djava.awt.headless=true -jar solr/solr/start.jar

results in this exception:

java.lang.ClassNotFoundException: org.mortbay.xml.XmlConfiguration
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at org.mortbay.start.Main.invokeMain(Main.java:166)
    at org.mortbay.start.Main.start(Main.java:497)
    at org.mortbay.start.Main.main(Main.java:115)

This seems to indicate that Jetty is not on my classpath. I can't find any evidence that it is contained within the echoprint-server repo, so it seems it should at least be added to the list of dependencies in the readme.

Socket Error

Hi, I am getting the following error. Please help.

class 'socket.error' at /ingest

[Errno 32] Broken pipe

Python /usr/lib/python2.7/socket.py in meth, line 223
Web POST http://localhost:8080/ingest

How to start ttservctl?

Hi,

I was able to start the echoprint server, but got stuck starting ttservctl.

Is it not in the GitHub checkout? Do I need to install it separately?

API Leak..

We are hosting a server on Ubuntu. It has a small number of entries (<1000), but we run about 100 queries/minute. After about a week of working well, failures start to occur in the API: it complains about too many open files and stops working. After a reboot, it works for another week.
Anybody else experience this? This is very consistent behavior for us.
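
"Too many open files" usually means the process has hit its file-descriptor limit, which would be consistent with sockets or files being opened per request and never closed. One way to check that from inside the api.py process is to log the descriptor count over time, as in the sketch below. It assumes a Linux host (it reads /proc/self/fd), and the function names are hypothetical.

```python
import os
import resource


def open_fd_count():
    """Number of file descriptors open in this process, or None if unknown."""
    try:
        # Linux-specific; the listing itself briefly holds one extra descriptor.
        return len(os.listdir("/proc/self/fd"))
    except OSError:
        return None


def fd_limits():
    """Return the (soft, hard) limits on open files for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("open descriptors: %s, soft limit: %s, hard limit: %s"
          % (open_fd_count(), soft, hard))
```

If the count logged by the API process climbs steadily between requests and never falls, that points at descriptors being leaked inside that process.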

track_id with spaces can't be looked up in fp.metadata_for_track_id and fp.delete

I know the obvious workaround is to not have spaces in your track_ids, but I found two places in fp.py where solr is queried for track_id and the input string isn't quoted.

Line 116:
response = host.query("track_id:%s" % track_id)
could be changed to:
response = host.query("track_id:\"%s\"" % track_id)

Line 453:
host.delete_query("track_id:%s*" % t)
could be changed to:
host.delete_query("track_id:\"%s*\"" % t)
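
An alternative to wrapping the value in quotes is to backslash-escape the characters that the query syntax treats specially, including whitespace. That also keeps the trailing * in the delete query outside any quotes, which may matter because the standard Lucene query parser does not expand wildcards inside a quoted phrase. The sketch below is an illustration, not code from fp.py: the helper names are hypothetical, and it assumes the bundled Solr follows the usual Lucene escaping rules.

```python
import re

# Characters the Lucene/Solr query syntax treats as special, plus whitespace.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|\s])')


def escape_solr_term(value):
    """Backslash-escape every special character (and whitespace) in value."""
    return _SPECIAL.sub(r'\\\1', value)


def exact_track_query(track_id):
    """Query string matching one track_id exactly (hypothetical helper)."""
    return "track_id:%s" % escape_solr_term(track_id)


def prefix_track_query(track_id):
    """Query string matching track_id plus any suffix such as -0, -1, ..."""
    # The wildcard is appended after escaping so it keeps its meaning.
    return "track_id:%s*" % escape_solr_term(track_id)
```

With these helpers the two call sites would become host.query(exact_track_query(track_id)) and host.delete_query(prefix_track_query(t)).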
