audiosignalprocessingforml's Introduction

AudioSignalProcessingForML

Code and slides of my YouTube series called "Audio Signal Processing for Machine Learning"

audiosignalprocessingforml's People

Contributors

musikalkemist

audiosignalprocessingforml's Issues

Code needs to be updated

I was going through your audio processing playlist, which is detailed and has been very helpful in my undergraduate major project. However, a lot seems to have changed in Librosa in the 4 years since you created it. While it is still one of the best playlists for audio processing out there, incompatibilities in the code may make the tutorial difficult for new learners to follow.

Here is a solution that I think would help new learners. What if we had a separate branch called updated or latest that holds the latest working code? That way we keep both the legacy code and the updated code, which makes things easier for new learners.

If this idea sounds good to you: I ran the programs with updated code while doing the course, so I have updated code ready for most of the sections and would be more than happy to contribute it here.

Doubt in feature extraction

Hi,
I am working on feature extraction (Mel spectrogram) for training an ML model for sound event detection.
I have a doubt about the steps required for feature extraction. Is it mandatory to convert the amplitude to the dB scale (librosa.power_to_db), or can I proceed without it? After calculating the Mel power spectrogram I am also getting negative values, which is a problem for log compression because negative values give NaN.

Note: I am normalizing each file with its own mean and std.
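One possible reading of the negative values: a Mel power spectrogram is non-negative by construction, so negatives would typically appear only after the mean/std normalization mentioned in the note. The following is a minimal sketch under that assumption, using synthetic data instead of real audio. `power_to_db_sketch` and `standardize` are hypothetical helpers written for illustration; the first only approximates what librosa.power_to_db does (10 * log10 with a floor and a dynamic-range clip) and is not librosa's implementation.

```python
import numpy as np


def power_to_db_sketch(S, amin=1e-10, top_db=80.0):
    """Hypothetical stand-in for a power-to-dB conversion (not librosa's code)."""
    S = np.asarray(S, dtype=np.float64)
    # Floor tiny values so the logarithm never sees zero
    log_spec = 10.0 * np.log10(np.maximum(amin, S))
    # Limit the dynamic range to top_db below the peak
    return np.maximum(log_spec, log_spec.max() - top_db)


def standardize(x):
    """Per-file normalization with the file's own mean and std."""
    return (x - x.mean()) / (x.std() + 1e-8)


rng = np.random.default_rng(0)
mel_power = rng.random((40, 100)) ** 2  # synthetic, non-negative "mel power" values

# Normalizing first produces negative values, so a plain log yields NaN
with np.errstate(invalid="ignore", divide="ignore"):
    log_after_norm = np.log10(standardize(mel_power))

# Log-compressing first and normalizing afterwards stays finite
features = standardize(power_to_db_sketch(mel_power))

print(np.isnan(log_after_norm).any(), np.isfinite(features).all())
```

In this sketch the order of operations is what matters: the dB conversion sees only non-negative power values, and negative numbers are introduced afterwards by the normalization, where they are harmless.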

Logmel Spectrogram- feature extraction

Hi, I am doing speech recognition for a microcontroller. I am new to this and am trying to modify code that was written for Acoustic Scene Classification, where a dataset of 30-second WAV audio files was used.

Now I need to use a 1-second dataset for speech recognition, but I am not getting proper values after feature extraction.

Below is the code I am using for the log-mel spectrogram. Can you help me, please?

"""LogMel Feature Extraction example."""

import numpy as np
import librosa
import scipy.fftpack as fft

SR = 16000
N_FFT = 1024
N_MELS = 30


def create_col(y):
    """Return the mel power spectrum (30 bands) of one 1024-sample frame."""
    assert y.shape == (1024,)

    # Create the Hann analysis window
    fft_window = librosa.filters.get_window('hann', N_FFT, fftbins=True)
    assert fft_window.shape == (1024,), fft_window.shape

    # Apply the Hann window to the frame
    y_windowed = fft_window * y
    assert y_windowed.shape == (1024,), y_windowed.shape

    # FFT, keeping the N_FFT // 2 + 1 = 513 non-negative frequency bins
    fft_out = fft.fft(y_windowed, axis=0)[:513]
    assert fft_out.shape == (513,), fft_out.shape

    # Power spectrum
    S_pwr = np.abs(fft_out)**2
    assert S_pwr.shape == (513,)

    # Generation of Mel Filter Banks
    # (sr is passed by keyword; recent librosa versions reject it positionally)
    mel_basis = librosa.filters.mel(sr=SR, n_fft=N_FFT, n_mels=N_MELS, htk=False)
    assert mel_basis.shape == (30, 513)

    # Apply Mel Filter Banks; astype returns a copy, so keep the result
    S_mel = np.dot(mel_basis, S_pwr).astype(np.float32)
    assert S_mel.shape == (30,)

    return S_mel


def feature_extraction(y):
    """Return the log-mel spectrogram (30 x 32) of 32 frames of 1024 samples."""
    assert y.shape == (32, 1024)

    S_mel = np.empty((30, 32), dtype=np.float32, order='C')
    for col_index in range(0, 32):
        S_mel[:, col_index] = create_col(y[col_index])

    # Scale according to reference power
    S_mel = S_mel / S_mel.max()
    # Convert to dB
    S_log_mel = librosa.power_to_db(S_mel, top_db=80.0)
    assert S_log_mel.shape == (30, 32)

    return S_log_mel
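
One thing that may be worth checking: `feature_extraction` asserts an input of shape (32, 1024), which is 32768 samples, or about 2 seconds at 16 kHz, while a 1-second clip has only 16000 samples. The sketch below shows one way a 1-second clip could be split into 1024-sample frames. The hop length of 512 and the helper name `frame_clip` are assumptions for illustration and do not come from the code above; the hard-coded 32 in the assertions would need to change to match the resulting frame count.

```python
import numpy as np

SR = 16000
N_FFT = 1024
HOP = 512  # assumed hop length (50% overlap), not taken from the original code


def frame_clip(y, frame_length=N_FFT, hop_length=HOP):
    """Split a 1-D signal into overlapping frames, shape (n_frames, frame_length)."""
    y = np.asarray(y)
    if len(y) < frame_length:
        # Zero-pad very short clips so that at least one frame exists
        y = np.pad(y, (0, frame_length - len(y)))
    n_frames = 1 + (len(y) - frame_length) // hop_length
    # Row r holds samples [r * hop_length, r * hop_length + frame_length)
    idx = (np.arange(frame_length)[None, :]
           + hop_length * np.arange(n_frames)[:, None])
    return y[idx]


clip = np.zeros(SR, dtype=np.float32)  # placeholder for one second of audio
frames = frame_clip(clip)
print(frames.shape)
```

Each row of `frames` has the (1024,) shape that `create_col` expects, so the per-frame part of the pipeline could stay unchanged under this framing.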

librosa.feature.rms()

librosa no longer accepts y as the first positional parameter.
Pass it by keyword (y=) instead:
librosa.feature.rms(y=[your loaded file])

How to get the envelope array for every second?

Hi! I am excited about this calculation. Currently it is computed over the whole piece, but I am wondering whether there is any way to get the array of envelope values for each second. I mean, in the first second the amplitude_envelope is [[0.00000000e+00, 2.32199546e-02, 4.64399093e-02, ..., 4.00544218e+01, 4.00776417e+01, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00], and the second is..., and so on.

Thanks!
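
One way to approach this, sketched under assumptions: cut the signal into one-second chunks first, then compute a frame-wise maximum inside each chunk. The function names, the frame size of 1024, the hop length of 512, and the synthetic test signal are all illustrative choices, not taken from the repository's code.

```python
import numpy as np


def amplitude_envelope(signal, frame_size=1024, hop_length=512):
    """Maximum absolute amplitude of each frame."""
    return np.array([
        np.max(np.abs(signal[i:i + frame_size]))
        for i in range(0, len(signal), hop_length)
    ])


def envelope_per_second(signal, sr, frame_size=1024, hop_length=512):
    """Return a list with one envelope array per second of audio."""
    chunks = []
    for start in range(0, len(signal), sr):
        second = signal[start:start + sr]  # the last chunk may be shorter
        chunks.append(amplitude_envelope(second, frame_size, hop_length))
    return chunks


# Synthetic 3-second signal: a 220 Hz sine with linearly rising amplitude
sr = 22050
t = np.arange(3 * sr) / sr
signal = np.sin(2 * np.pi * 220 * t) * np.linspace(0.0, 1.0, t.size)

per_second = envelope_per_second(signal, sr)
print(len(per_second), per_second[0].shape)
```

Because the chunks are cut before framing, no frame straddles a second boundary; the trade-off is that the last frames of each chunk are shorter than `frame_size`.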
