Comments (7)

alnesbit commented on July 16, 2024

Hi Lila,

How long is the query audio? Can you please send me the files audiofile and audioexpert? I will then try to reproduce your issue and see what the problem is.

Thanks,

Andrew

from echoprint-server.

lilila commented on July 16, 2024

Dear Andrew,
Sorry for the delay. I guess my error was similar to the error mentioned in issue #22.

I still have a few questions for you:
When I compute the fingerprint of a file with duration > 15 sec, the file is segmented into multiple parts. According to the description of split_codes(fp) (from fp.py), the codes are supposed to overlap every 30 sec. From my experiments the hop size is 15 sec rather than 30. Is this right?

Also, still in fp.py, can you explain to me the purpose of the variable "slop" in the function "actual_matches"?

I have also performed some evaluations on the matching of two fingerprints computed on the same audio file with different "start" values. First, I have noticed that the matching is very low when the fingerprints are computed with different "start" values. Is there a way to improve this? I understand that the values can hardly coincide at the beginning and the end of the file, but I find it strange that the fingerprints do not match better in the middle of the file. The only way I have found to increase the recognition rate is to store in the database many fingerprints of the same audio file starting at different instants :(

Also, I have tried to change the start values by a very small time lapse (2 ms) to see if it was possible to get a better matching. I thought the low matching could come from the fact that the moving windows used during the fingerprint process of the reference and the query audio files were not aligned. Then I found that there is no difference for "i < start < i+1", where i is an integer value. In other words, the float values for "start" are not recognized. Is this normal? Am I doing something silly again?

Thanks for your help,
Best regards,
Lila

alnesbit commented on July 16, 2024

Hi Lila,

> When I compute the fingerprint of a file with duration > 15 sec, the file is segmented into multiple parts. According to the description of split_codes(fp) (from fp.py), the codes are supposed to overlap every 30 sec. From my experiments the hop size is 15 sec rather than 30. Is this right?

I fixed a bug in split_codes() a few days ago which addresses a related issue, so hopefully this should be fixed now. Can you please try again? The segments should be 60 seconds in length, with an overlap of 30 seconds.
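
The segmentation described here (60-second segments, each overlapping the previous one by 30 seconds) can be sketched roughly as follows. This is only an illustration, not the actual split_codes() from fp.py: the function name split_into_segments, the (code, time) pair representation, and the assumption that one time unit lasts 23.2 ms are all hypothetical choices made for the example.

```python
# Hypothetical sketch of 60 s segments with a 30 s hop; not the real split_codes().
MS_PER_UNIT = 23.2                            # assumed duration of one time unit (ms)
SEGMENT_UNITS = 60 * 1000.0 / MS_PER_UNIT     # 60 seconds expressed in time units
HOP_UNITS = SEGMENT_UNITS / 2.0               # 30 seconds: start-to-start distance


def split_into_segments(pairs):
    """Split a list of (code, time) pairs into overlapping segments.

    Each segment covers 60 s of time units and starts 30 s after the
    previous one, so consecutive segments share half of their span.
    """
    if not pairs:
        return []
    last_time = max(t for _, t in pairs)
    segments = []
    start = 0.0
    while True:
        end = start + SEGMENT_UNITS
        segments.append([(c, t) for c, t in pairs if start <= t < end])
        if end > last_time:       # this segment already reaches the last code
            break
        start += HOP_UNITS
    return segments


# Usage: one made-up code per second over two minutes of audio.
pairs = [(1000 + s, int(s * 1000 / MS_PER_UNIT)) for s in range(120)]
segments = split_into_segments(pairs)
print(len(segments))
```

With a hop of 30 seconds, every code away from the very start and end of the file lands in two segments, which is why a hop of 15 seconds in the experiments would point to a bug rather than to the intended behaviour.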

> Also, still in fp.py, can you explain to me the purpose of the variable "slop" in the function "actual_matches"?

This reduces the resolution of the time codes to reduce the sensitivity to timing jitter when time aligning between query and fingerprint. It is a trade off between sensitivity and timing jitter. (Also, see below.)

> I have also performed some evaluations on the matching of two fingerprints computed on the same audio file with different "start" values. First, I have noticed that the matching is very low when the fingerprints are computed with different "start" values. Is there a way to improve this? I understand that the values can hardly coincide at the beginning and the end of the file, but I find it strange that the fingerprints do not match better in the middle of the file. The only way I have found to increase the recognition rate is to store in the database many fingerprints of the same audio file starting at different instants :(

One reason might be that the codegen takes a little while to warm up, so you need to let it run for long enough - right now this is at least 20 seconds, but we're working on getting this down. What sorts of accuracy rates are you seeing in the various cases? And what length of fingerprints are you using?

> Also, I have tried to change the start values by a very small time lapse (2 ms) to see if it was possible to get a better matching. I thought the low matching could come from the fact that the moving windows used during the fingerprint process of the reference and the query audio files were not aligned.

How are you shifting the start values? Are you doing this to the audio file at the signal level, or are you adjusting the time codes after the fingerprint has been generated? The absolute values of the time codes shouldn't really matter too much. The important thing is that the shifts between the query and database fingerprint time codes are consistent, i.e., it's the relative time shifts which are important to get right. This is where the slop factor above comes in; it makes the relative distances between time codes more "sloppy" so that the differences between time codes in query and database fingerprints match to a greater extent.
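
To make the role of the slop factor concrete, here is a minimal sketch of matching on relative time offsets. It is not the actual actual_matches() from fp.py: the function name best_offset_count, the (code, time) pair representation, and the default slop=2 are assumptions made for the example. Times are divided by slop before differences are taken, so a jitter of one time unit no longer splits the votes across two neighbouring offsets.

```python
from collections import Counter

# Hypothetical sketch of offset voting with a slop factor; not the real actual_matches().


def best_offset_count(query, reference, slop=2):
    """Return (offset, votes) for the most common reference-minus-query offset.

    query and reference are lists of (code, time) pairs. Times are coarsened
    by integer division with `slop` before the offsets are computed.
    """
    ref_times = {}
    for code, t in reference:
        ref_times.setdefault(code, []).append(t // slop)

    histogram = Counter()
    for code, t in query:
        for rt in ref_times.get(code, []):
            histogram[rt - t // slop] += 1

    if not histogram:
        return None, 0
    return histogram.most_common(1)[0]


# Reference: 20 made-up codes, 10 time units apart.
reference = [(i, 10 * i) for i in range(20)]
# Query: codes 5..14, re-timed from zero, every other code late by one unit.
query = [(i, 10 * (i - 5) + (i % 2)) for i in range(5, 15)]

print(best_offset_count(query, reference, slop=1)[1])  # votes split by the jitter
print(best_offset_count(query, reference, slop=2)[1])  # votes pooled into one offset
```

In this toy data the absolute times of the query differ completely from those of the reference, yet all ten codes agree on a single offset once slop is applied, which is the sense in which only the relative shifts matter.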

> Then I found that there is no difference for "i < start < i+1", where i is an integer value. In other words, the float values for "start" are not recognized. Is this normal? Am I doing something silly again?

I'm not sure what you mean here. What is start?

Best,

Andrew

lilila commented on July 16, 2024

Thank you for your answers.

> I fixed a bug in split_codes() a few days ago which addresses a related issue, so hopefully this should be fixed now. Can you please try again? The segments should be 60 seconds in length, with an overlap of 30 seconds.

--> So you changed the denominator in segmentlength = 60 * 1000.0 / 23.2?

> What sorts of accuracy rates are you seeing in the various cases? And what length of fingerprints are you using?

--> I was trying to reduce the length of the query as much as possible (from 30 sec down to 5 sec). The results are as follows (duration in sec | accuracy: percentage of audio excerpts correctly identified):
5 sec: 8%
10 sec: 65%
15 sec: 82%
20 sec: 85%
25 sec: 87%
30 sec: 87%
I should mention that these results were obtained using a small data set made of radio broadcast recordings.
For information, what I am trying to do is identify the radio channel someone is listening to. I am also trying to use fingerprinting to align some audio recordings. To do so, I use the values computed in actual_matches to determine the delay between the query and the reference.

Concerning the time alignment, I noticed something weird: there is a time lag that increases linearly as the start value increases. (Just to make sure, what I call "start value" is the second argument of song.util.codegen(audioseg, start = 0, duration = 30)).
Sorry if this is not very clear; I will try my best to explain the problem. Suppose I have an audio file of length 30 min. This file is fingerprinted and ingested in the data set as a reference. Now I have a few queries, which are basically excerpts of this long file.
If the query has a small "start value" (i.e. from 0 to 5 sec approximately), then the time alignment is perfect. When the start value increases (5 < start < 10), there is a time lag of about 2 sec between the true start and the estimated start, and so on. At the end of the file I have a time lag of almost 30 sec! The values given here are of course not exact (the time lag increases linearly); I give them just to make the explanation a little bit clearer (I guess it is still confusing? :-/ )
I found a way to avoid this problem (use short segments of the long audio file instead of the whole file), but it is not a real solution. I was wondering if the problem could come from the fact that you use x = 23.2 instead of 256/11025*1000? Anyway, this is just to let you know that this problem exists.
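
As a rough check of the rounding hypothesis above, the drift that the constant 23.2 alone would introduce can be estimated with a few lines of arithmetic. The sketch assumes, as the question does, that one time unit corresponds to 256 samples at 11025 Hz; the function name drift_seconds is hypothetical.

```python
# Back-of-the-envelope drift estimate; assumes one time unit = 256 samples at 11025 Hz.
EXACT_MS = 256 / 11025 * 1000   # about 23.22 ms per time unit
ROUNDED_MS = 23.2               # the rounded constant mentioned in the question


def drift_seconds(true_seconds):
    """Lag accumulated when time units are converted back with 23.2 ms."""
    units = true_seconds * 1000 / EXACT_MS        # units elapsed in the audio
    estimated_seconds = units * ROUNDED_MS / 1000
    return true_seconds - estimated_seconds


for minutes in (5, 15, 30):
    print(minutes, "min:", round(drift_seconds(minutes * 60), 2), "s")
```

Under these assumptions the lag does grow linearly with position, but it comes to only about 1.5 seconds after 30 minutes. That is well short of the nearly 30 seconds reported, so the rounding alone may not account for the whole effect.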

> I'm not sure what you mean here. What is start?

--> As I said above, "start" is the second parameter of the song.util.codegen function.

Thank you for your time
lila

abuharsky commented on July 16, 2024

Hi Lila,
How successful have you been with echoprint so far?

lilila commented on July 16, 2024

I haven't used it in a while, but it used to work perfectly.

picozone commented on July 16, 2024

Hi Lila,
How did you deal with the double-ingestion problem? How can I start over with a brand-new database?
Thank you for your time.
