mjhydri / beatnet

BeatNet is a state-of-the-art real-time and offline joint music beat, downbeat, tempo, and meter tracking system using a CRNN and particle filtering (implementation of the ISMIR 2021 paper).

License: Creative Commons Attribution 4.0 International

Python 100.00%

beatnet real-time real-time-beat-tracker real-time-downbeat-tracker real-time-tempo dnn-beat-tracking particle-filtering pytorch meter-detection crnn-network

beatnet's Issues

Invalid device string: 'cuda:cpu' error: bug in model.py

Hey, thanks for sharing your work. It is much appreciated!

After a clean installation I wasn't able to run the example code provided. I tried digging around the code to see if I could fix it for you.

The error I kept getting was

RuntimeError: Invalid device string: 'cuda:cpu'

This code is the problem:

I changed it to this:

and I got it working

Here is the code.

    def change_device(self, device=None):
        """
        Change the device and load the model onto the new device.

        Parameters
        ----------
        device : string or None, optional (default None)
          Device to load model onto
        """
        if device is None:
            # If the function is called without a device, use the current device
            device = self.device
        elif not torch.cuda.is_available():
            device = torch.device('cpu')
        else:
            # Create the appropriate device object
            device = torch.device(f'cuda:{device}')

        # Change device field
        self.device = device
        # Load the transcription model onto the device
        self.to(self.device)

I could clean the code a little and open a pull request with this change, if you are open to that. In any case, maybe this issue will help others to make your great project work on their computer.

Cheers.
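
Note that the posted version still builds 'cuda:cpu' when CUDA is available and the caller passes 'cpu'. Below is a minimal sketch of a more defensive mapping, written as a plain function so that it runs without PyTorch. `resolve_device_string` and its arguments are hypothetical names, not part of BeatNet. In real code the `cuda_available` flag would presumably come from `torch.cuda.is_available()` and the result would be passed to `torch.device()`.

```python
def resolve_device_string(device, cuda_available, current="cpu"):
    """Map a user-supplied device spec to a string such as 'cpu' or 'cuda:0'.

    Hypothetical helper: accepts None, 'cpu', 'cuda', 'cuda:N', an int,
    or a digit string such as '0'.
    """
    if device is None:
        # No device requested: keep whatever is currently in use
        return current
    if not cuda_available:
        # Fall back to the CPU when no GPU can be used
        return "cpu"
    spec = str(device).strip().lower()
    if spec == "cpu" or spec.startswith("cuda"):
        # Already a complete device string, so do not prepend 'cuda:'
        return spec
    if spec.isdigit():
        # A bare index such as 0 or '1' selects that GPU
        return f"cuda:{spec}"
    raise ValueError(f"Unrecognised device spec: {device!r}")


print(resolve_device_string("cpu", cuda_available=True))
print(resolve_device_string(0, cuda_available=True))
```

The key design choice is to prepend 'cuda:' only to bare indices, so that complete strings like 'cpu' or 'cuda:0' pass through unchanged.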

Restrictive Numba dependency makes Numpy type hints non-descriptive

Would it be possible now to update the Numba dependency to a later version that supports numpy 1.23? Without it, type hints for ndarray are limited to just that, ndarray, when IDEs infer variable types. Later versions allow inferring ndarray[shape, type], such as ndarray[(Any, 3), float], which is very useful for code readability.

I'm aware of the restriction from librosa that was commented on back in 2021, so I'm just wondering whether their misleading (fake) support was fixed?

Feel free to close this if it's still not possible. Thanks.

numpy error!

ValueError: numpy.ndarray size changed, may indicate binary incompatibility.
Expected 96 from C header, got 88 from PyObject!

Make pyaudio optional?

Noticed that pyaudio is a required installation even when not using audio streaming, because it's imported in the top-level BeatNet.py module. Could this be relaxed to an optional dependency?
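
One common way to relax such a dependency is to defer the failure until the streaming path is actually used. The sketch below illustrates that pattern only. `optional_import` and `require_pyaudio` are hypothetical helpers, not functions that exist in BeatNet.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Importing at module level no longer fails when pyaudio is missing
pyaudio = optional_import("pyaudio")


def require_pyaudio():
    """Call this only on the code path that opens an audio stream."""
    if pyaudio is None:
        raise RuntimeError(
            "Streaming mode needs pyaudio; install it to use microphone input."
        )
    return pyaudio
```

With this arrangement, file-based processing would import cleanly, and only a call into streaming would raise the error.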

Can you please provide the annotations you used to validate the model? I am not able to reproduce the results on the GTZAN dataset.

Can you provide Rock Corpus Annotation and Ballroom Annotations also?

M1 Mac Support?

I've been wondering if anyone has got this awesome looking package working on M1 - I cannot for the life of me figure out which versions of numba and llvm to use. Brew doesn't let you install a version of llvm compatible with the versions of numba specified.

I wonder if I should check out the repo, change some of the version requirements and cross my fingers 😂

Alternatively, is this possible inside Docker? Will PortAudio work inside a Docker container?

Could you please provide a complete code of training?

I want to do further training on my dataset, but there is currently no training code in the library.

Not speeding up inference on using CUDA/GPU

hi @mjhydri / @karen-pal / @rlleshi ,

I am able to run the inference on GPU. It is successfully running but the speed is not faster than on CPU. It is slower at times as well.

Config/Setup-
GPU - V100 16GB

Code used -

from BeatNet.BeatNet import BeatNet

estimator = BeatNet(1, mode='online', inference_model='PF', plot=['activations'], thread=False, device='cuda')

Output = estimator.process("audio file directory")

Also tried -

from BeatNet.BeatNet import BeatNet

estimator = BeatNet(1, mode='online', inference_model='PF', plot=['activations'], thread=False, device='cuda:0')

Output = estimator.process("audio file directory")

Note that I checked whether torch is recognizing the GPU or not. It is recognizing the GPU. Also, GPU memory consumption varies from 950 MB - 1350 MB during inference.
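
When comparing CPU and GPU runs, it can help to time the same call several times after a warm-up run, since a first call often includes one-off setup cost. The helper below is a generic sketch. `time_call` is a hypothetical name, and `run` stands for any zero-argument callable, for example a lambda wrapping `estimator.process(...)`.

```python
import time


def time_call(run, repeats=3, warmup=1):
    """Return wall-clock durations in seconds for repeated calls to run()."""
    for _ in range(warmup):
        run()  # warm-up calls are not timed
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in workload so the sketch runs on its own
timings = time_call(lambda: sum(range(10000)))
print(min(timings), max(timings))
```

If the timed call launches GPU work, the callable may also need to wait for that work to finish (for instance with `torch.cuda.synchronize()`) before returning. Otherwise the measurement can be misleading.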

Help install BeatNet

Hi guys, I'm trying to install this but can't get it running.

These are the steps:

OS: Debian AMD64
Python: 3.8

Steps:

Create virtualenv with 'virtualenv env'
Activate the virtualenv 'source env/bin/activate'
Install cython 'pip install cython'
Install beatnet 'pip install beatnet' => Success
Create test.py

//test.py content
from BeatNet.BeatNet import BeatNet

Run python test.py, which gives this error:

python test.py
Traceback (most recent call last):
  File "test.py", line 1, in <module>
    from BeatNet.BeatNet import BeatNet
  File "/home/user1/Documents/python/beatnet2/env/lib/python3.8/site-packages/BeatNet/BeatNet.py", line 8, in <module>
    from madmom.features import DBNDownBeatTrackingProcessor
  File "/home/user1/Documents/python/beatnet2/env/lib/python3.8/site-packages/madmom/__init__.py", line 24, in <module>
    from . import audio, evaluation, features, io, ml, models, processors, utils
  File "/home/user1/Documents/python/beatnet2/env/lib/python3.8/site-packages/madmom/audio/__init__.py", line 27, in <module>
    from . import comb_filters, filters, signal, spectrogram, stft
  File "madmom/audio/comb_filters.pyx", line 1, in init madmom.audio.comb_filters
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

Thank you.

Confused about the result.

Sorry, I am not clear about the meaning of the second column. What do 1. and 2. refer to?

thanks
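
The sketch below shows one way to read such an output. It assumes the first column holds beat times in seconds and the second column is a label where 1 marks a downbeat and any other value marks a regular beat. That reading is an assumption to verify against the project README, and `split_beats` is a hypothetical helper.

```python
def split_beats(output):
    """Split (time, label) rows into all beat times and downbeat times.

    Assumption: label 1 marks a downbeat; other labels are regular beats.
    Works on a list of pairs or on a 2-column NumPy array.
    """
    beats, downbeats = [], []
    for time_s, label in output:
        time_s = float(time_s)
        beats.append(time_s)
        if int(label) == 1:
            downbeats.append(time_s)
    return beats, downbeats


# Made-up rows for illustration only
example = [(0.5, 1.0), (1.0, 2.0), (1.5, 2.0), (2.0, 2.0), (2.5, 1.0)]
beats, downbeats = split_beats(example)
print(beats)
print(downbeats)
```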

Why is BeatNet called a SOTA model?

In the 2019 paper "Temporal convolutional networks for musical audio beat tracking", the published F-measures on Ballroom and GTZAN are all better than BeatNet's, so why does BeatNet claim SOTA?
Is there something I misunderstand?

Issues regarding the particle filtering model

Hi,

Many thanks for this cool work!
I have two questions regarding the particle filtering (PF) model:

The model does not produce the same results for the same activation function of the same track. In the attached Jupyter notebook (pf-repeat-issue.ipynb), I run PF on the same activation function five times and get different results.
The model does not work on an 'ideal activation function'. In pf-groundtruth-issue.ipynb, I generate an ideal activation function from the beat annotations, which only has peaks at beat positions. But the PF gives very low Recall for it.

The notebooks are shared via google drive: https://drive.google.com/drive/folders/1_H8u847bVnUP7Lfome8WuO98FNaU4Jew?usp=sharing

Are these issues expected because of the sampling process of PF? Or, is there any way we may avoid/alleviate these issues? Also, is there any idea regarding the variance/std of the PF performance under different conditions (e.g., genres)?
Just want to make sure I didn't use your model wrong. Thank you!
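
Particle filters resample at random, so some run-to-run variation is to be expected unless the random generator is seeded. The toy sketch below shows systematic resampling with an explicit seeded generator. It is an illustration of the general technique, not BeatNet's implementation, and it makes no claim about whether BeatNet exposes a seed.

```python
import random


def systematic_resample(weights, rng):
    """Return particle indices drawn by systematic resampling."""
    n = len(weights)
    step = sum(weights) / n
    start = rng.uniform(0.0, step)  # the only random draw
    indices = []
    i = 0
    cumulative = weights[0]
    for k in range(n):
        pointer = start + k * step
        # Advance to the particle whose cumulative weight covers the pointer
        while pointer > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices


weights = [0.1, 0.2, 0.3, 0.4]
run_a = systematic_resample(weights, random.Random(0))
run_b = systematic_resample(weights, random.Random(0))
print(run_a == run_b)
```

Two runs with the same seed pick the same particles, while different seeds may not. That is the kind of variation described in the first question.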

I have audio data, how do I call this?

I don't see any examples of how to handle the case where I have a 2D numpy array of amplitudes. Everything requires me to write it to a file first, which seems suboptimal.
How do I "read" the output, i.e. what's in it? I watched the video, and while it's very cool, that's not something I can actually understand. I'm assuming that there'd be something that says 0 secs -> x seconds, 120 bpm, 4/4 time signature, or something along those lines.

Low-quality audio makes detection worse

Hey, thanks for sharing. It is great work.

I found that when I use 16000 Hz audio I get worse results than with 22050 Hz audio (both from the same piece of music).
Inputs are all automatically resampled to 22050 Hz.
How can I do better when I only have low-quality audio at 16000 Hz?

Can't install on Ubuntu 20.04.1 - the version of numba required by the package can't be found by pip (0.54.1)

Hey there!
Here's the error when trying to install with pip, either from pip install beatnet or by running pip install .:

ERROR: Ignored the following versions that require a different python version: 0.52.0 Requires-Python >=3.6,<3.9; 0.52.0rc3 Requires-Python >=3.6,<3.9; 0.53.0 Requires-Python >=3.6,<3.10; 0.53.0rc1.post1 Requires-Python >=3.6,<3.10; 0.53.0rc2 Requires-Python >=3.6,<3.10; 0.53.0rc3 Requires-Python >=3.6,<3.10; 0.53.1 Requires-Python >=3.6,<3.10; 0.54.0 Requires-Python >=3.7,<3.10; 0.54.0rc2 Requires-Python >=3.7,<3.10; 0.54.0rc3 Requires-Python >=3.7,<3.10; 0.54.1 Requires-Python >=3.7,<3.10
ERROR: Could not find a version that satisfies the requirement numba==0.54.1 (from beatnet) (from versions: 0.1, 0.2, 0.3, 0.5.0, 0.6.0, 0.7.0, 0.7.1, 0.7.2, 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.10.1, 0.11.0, 0.12.0, 0.12.1, 0.12.2, 0.13.0, 0.13.2, 0.13.3, 0.13.4, 0.14.0, 0.15.1, 0.16.0, 0.17.0, 0.18.1, 0.18.2, 0.19.1, 0.19.2, 0.20.0, 0.21.0, 0.22.0, 0.22.1, 0.23.0, 0.23.1, 0.24.0, 0.25.0, 0.26.0, 0.27.0, 0.28.1, 0.29.0, 0.30.0, 0.30.1, 0.31.0, 0.32.0, 0.33.0, 0.34.0, 0.35.0, 0.36.1, 0.36.2, 0.37.0, 0.38.0, 0.38.1, 0.39.0, 0.40.0, 0.40.1, 0.41.0, 0.42.0, 0.42.1, 0.43.0, 0.43.1, 0.44.0, 0.44.1, 0.45.0, 0.45.1, 0.46.0, 0.47.0, 0.48.0, 0.49.0, 0.49.1rc1, 0.49.1, 0.50.0rc1, 0.50.0, 0.50.1, 0.51.0rc1, 0.51.0, 0.51.1, 0.51.2, 0.52.0rc2, 0.55.0rc1, 0.55.0, 0.55.1, 0.55.2, 0.56.0rc1, 0.56.0, 0.56.2, 0.56.3, 0.56.4, 0.57.0rc1, 0.57.0, 0.57.1rc1, 0.57.1, 0.58.0rc1, 0.58.0rc2, 0.58.0, 0.58.1, 0.59.0rc1)
ERROR: No matching distribution found for numba==0.54.1

I tried changing the numba version in setup.py to 0.55.0 (which is the closest version that is available) and that makes it install but it crashes with a numpy error then. Should I try to install 0.55.0 and move on with debugging numpy?

Thanks!
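
The first pip message lists numba 0.54.1 as "Requires-Python >=3.7,<3.10", which suggests the interpreter version is what hides that release from pip. Below is a small sketch that checks this constraint before installing. `python_ok_for_numba_0541` is a hypothetical helper name.

```python
import sys


def python_ok_for_numba_0541(version_info=None):
    """Check the 'Requires-Python >=3.7,<3.10' bound quoted in the pip error."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return (3, 7) <= major_minor < (3, 10)


print(python_ok_for_numba_0541())        # result for the running interpreter
print(python_ok_for_numba_0541((3, 8)))
```

If the check fails, using an interpreter inside that range is one option. Whether a newer numba works with the rest of the pinned dependencies is a separate question.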

Which numpy version should I use?

I used numpy==1.20.3, but it reported ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject. I tried to upgrade numpy, but then pip reported that numba 0.54.1 requires numpy<1.21,>=1.17, but you have numpy 1.23.1 which is incompatible, and the import failed with ImportError: Numba needs NumPy 1.20 or less.
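
The range quoted in the pip message (numpy<1.21,>=1.17 for numba 0.54.1) can be checked with a few lines. This is a sketch with hypothetical helper names, and it assumes a plain "major.minor.patch" version string.

```python
def parse_major_minor(version):
    """Turn a version string such as '1.20.3' into the tuple (1, 20)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def numpy_ok_for_numba_0541(numpy_version):
    """Check the 'numpy<1.21,>=1.17' bound quoted in the pip message."""
    return (1, 17) <= parse_major_minor(numpy_version) < (1, 21)


print(numpy_ok_for_numba_0541("1.20.3"))
print(numpy_ok_for_numba_0541("1.23.1"))
```

Passing this check does not rule out the "size changed" error: the report above hit it with numpy 1.20.3, which is inside the range. That error usually means a compiled extension was built against a different numpy than the one installed.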

How to get bpm state space value?

Running offline mode on audio file produces beat and downbeat array. But I can't access any intermediate state space result. I want to get a scalar bpm value or bpm posterior.
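
A scalar tempo can at least be approximated from the returned beat times, although this is not the state-space posterior asked about. The sketch below uses the median inter-beat interval so that a single missed or doubled beat has less influence than it would on a mean. `estimate_bpm` is a hypothetical helper.

```python
from statistics import median


def estimate_bpm(beat_times):
    """Estimate tempo in BPM from beat times given in seconds."""
    # Differences between consecutive beats, skipping non-increasing pairs
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:]) if b > a]
    if not intervals:
        raise ValueError("need at least two increasing beat times")
    return 60.0 / median(intervals)


# Made-up beat times spaced 0.5 s apart
bpm = estimate_bpm([0.0, 0.5, 1.0, 1.5, 2.0])
print(bpm)
```

With a two-column beat/downbeat array, the first column would be passed in as `beat_times`.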

Numpy > 1.20 deprecation error

I'm getting

AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:

with numpy version 1.24.4
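
As the message says, the clean fix is to replace `np.float` with `float` in the code that uses it. When that code lives in a third-party package, a stopgap some people use is to restore the alias before importing the package. The sketch below shows that idea with a stand-in module so that it runs without NumPy. `restore_removed_alias` is a hypothetical helper.

```python
import types


def restore_removed_alias(module, name, builtin):
    """Re-create module.<name> only if the attribute is missing."""
    if not hasattr(module, name):
        setattr(module, name, builtin)
    return getattr(module, name)


# Stand-in for a NumPy version in which the alias no longer exists
fake_np = types.ModuleType("fake_numpy")
restore_removed_alias(fake_np, "float", float)
print(fake_np.float is float)
```

With real NumPy the call would be `restore_removed_alias(np, "float", float)`, made before importing the library that still refers to `np.float`. This hides the problem rather than fixing it, so pinning compatible versions or patching the source is preferable where possible.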

What is the difference between the three feature extraction models?

Hi, great work!
I noticed the 'model' parameter.

I tested the three models on the same WAV file, but they give different results.

from BeatNet.BeatNet import BeatNet
import numpy as np

def get_bpm(inp):
    begin = inp[0][0]
    durations = []
    for line in inp[1:]:
        durations.append(line[0] - begin)
        begin = line[0]
    return 60 / np.mean(durations)

for i in range(1,4):
    estimator = BeatNet(i, mode='offline', inference_model='PF', thread=False)

    Output = estimator.process("bpm_tes1.wav")
    # print(Output)
    print(i, get_bpm(Output))

PyAudio Input overflowed

Hello,

I can run it with the modification below.

    def activation_extractor_stream(self):
        # TODO:
        ''' Streaming window
        Given the training input window's origin set to center, this streaming data formation causes 0.084 (s) delay compared to the trained model that needs to be fixed.
        '''
        with torch.no_grad():
            hop = self.stream.read(self.log_spec_hop_length, exception_on_overflow=False)

Unusual licensing choice

Hi Moji,

I wanted to point out that your license choice is a little unusual. Usually Creative Commons licenses aren't used for software.

Per CC themselves: "We recommend against using Creative Commons licenses for software"
https://creativecommons.org/faq/#can-i-apply-a-creative-commons-license-to-software

Given you've chosen CC-BY, have you considered using a more typical permissive attribution based license such as MIT, BSD-3-Clause, or Apache-2.0?

beatnet train script

Hello, I have sent an email. I would greatly appreciate it if I could receive your response. Thank you.

Not able to import beatnet

Add CoreML model conversion script & tutorial

Is it possible to convert the BeatNet algorithm to CoreML to use it on iOS devices? And are there any plans to add a script for CoreML conversion? I would love to see BeatNet work on-device as there really isn't any good beat tracking system for mobile devices yet.

Incompatible with Spleeter?

I was able to get it working but only by installing BeatNet after calling Spleeter, which makes it a little weird to work with. I have an open StackOverflow question on this, but I was wondering if you could resolve it with dependency management of some kind.

https://stackoverflow.com/questions/75838650/spleeter-and-beatnet-incompatible-numpy-numba-libraries-any-solutions

llvmlite error during installation

error: legacy-install-failure

× Encountered error while trying to install package.
╰─> llvmlite

I don't know what's wrong with this package