vettel555,github

acousticfeatureextraction

Simple acoustic feature extraction using the Librosa audio processing library and the openSMILE toolkit.

acousticstoolbox

Various functions for acoustics and audio signal processing.

animal-sounds-embedded-classifier

Area: Combination of machine learning and embedded hardware design. Tools used: Raspberry Pi, microphone, NumPy, SciPy, Wi-Fi module. We (a team of 4) developed a prototype of an embedded application that could be fitted on a safari vehicle. When the vehicle is on a ride through the jungle and detects a sound, it takes a short sample and tries to classify it with a pre-trained prediction model. The machine learning algorithm used was random forest. We successfully trained the model on lion, tiger, peacock, wolf, elephant, and several other wild animals. The sound samples were obtained from several animal sound repositories. The model achieved an accuracy of around 87% on the test data, despite problems such as noise in the sound files and a lack of extensive training examples. We also used a Wi-Fi module so that information about the detected animal can be broadcast to the tourists' mobile devices.

audio-classification

Environmental sound classification using deep learning with extracted features

audio-parameter-extraction

Given a synthesis process and an audio signal, attempt to extract the synthesis parameters from the audio.

audio-signal-processing

MATLAB code for analysing and filtering audio signals

audio_signal_processing

Practice & Term Project in Audio Signal Processing class (Matlab)

audiofeatureextraction

crnn-audio-classification

UrbanSound classification using Convolutional Recurrent Networks in PyTorch

dcase-2020-baseline

Audio captioning baseline system for DCASE 2020 challenge.

dcase2020_task2_baseline

DCASE2020 Challenge Task 2 baseline system

demonstration-dft-ps-psd

This is a demonstration of how to calculate power spectra and power spectral densities in real time. We calculate power spectra directly using the DFT (or FFT). There are many conventions for the DFT. We use the convention in the paper “Analysis of Relationship between Continuous Time Fourier Transform (CTFT), Discrete Time Fourier Transform (DTFT), Fourier Series (FS), and Discrete Fourier Transform (DFT)”. We also calculate power spectra and power spectral densities using the MATLAB function periodogram. We could use pwelch in place of periodogram. The only difference between periodogram and pwelch is that pwelch supports segmentation and averaging, whereas periodogram does not. For the sake of simplicity, we only use periodogram in this demonstration. One will see that the power spectrum is equal to the square of the absolute value of the DFT. When manually calculating a power spectrum, the hard part is calculating the argument vector, or independent variable vector, which is a frequency vector in this case. The frequency vector depends on the representation of the power spectrum. In general, there are three ways to represent a power spectrum for a real-valued signal. The first is called “two-sided”. This is the default way to represent a power spectrum with the DFT. However, this representation is not intuitive. The frequency vector is calculated by f = (0:N-1)/T, where T is the time period (or duration) of the input signal. When using the MATLAB function periodogram, one can specify this representation using “twosided”. A more natural way is to use a centered representation. In this case, the frequency 0 is centered in the spectrum. If the number of spectral lines (equal to the number of input points) is odd, then we have a unique centered representation. If the number of spectral lines is even, then we have a problem. Let us assume that we use a zero-based index for spectral lines. The spectral line 0 is the DC component, and it is put at the f = 0 location.
However, the spectral line N/2 can be placed on the positive side or the negative side. Different conventions may use different placements. In order to obtain this representation, one has to shift the FFT result. One way is to use the MATLAB function fftshift. This MATLAB function always places the N/2 spectral line on the negative side. When using the MATLAB function periodogram, one can specify this representation using “centered”. It should be noted that the MATLAB function periodogram usually puts the N/2 spectral line on the positive side. The last way to represent a power spectrum is the one-sided representation. For this representation, we need to combine the negative frequency components with the positive frequency components, and we only show the positive half as well as the DC component. The combination process depends on whether the number of spectral lines is even or odd. If the number of spectral lines is odd, we can simply combine spectral lines 1 to (N-1)/2 with spectral lines (N+1)/2 to N-1. The spectral line 0 is left untouched. If the number of spectral lines is even, we need to combine spectral lines 1 to N/2-1 with lines N/2+1 to N-1. The spectral lines 0 and N/2 are left untouched. In order to obtain this representation, one has to carry out the combination process manually. When using the MATLAB function periodogram, one can specify this representation using “onesided”. In this demonstration, we only use the centered representation. Hence, there is no need to do the combination. One can see that the sum of all power spectral lines in a power spectrum is equal to the power of the input signal. One can alternatively calculate the PSD with the periodogram function by specifying “psd” instead of “power”. In fact, the PSD obtained by periodogram is an equivalent noise power spectral density (ENPSD). One can see that the ENPSD is related to the PS by a factor of 1/T.
It should be noted that a power spectrum is a discrete sequence, or a discrete continuous-argument function, whereas an ENPSD is a non-discrete continuous-argument function. To emphasize this, we use stem for power spectra and plot for ENPSDs. In this demonstration, we start with a sinusoidal signal with various parameters. We then proceed with an actual audio signal.
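The three representations described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the repository's MATLAB code: the function names are hypothetical, the DFT is assumed to be scaled by 1/N so that the squared magnitude is directly a power spectrum line, and the centered ordering follows the fftshift placement (line N/2 on the negative side for even N). The PSD scaling assumes that dividing by the line spacing 1/T is what relates the two quantities.

```python
import cmath
import math

def power_spectrum(x):
    """Two-sided power spectrum |X[k]|^2, with the DFT scaled by 1/N (assumed convention)."""
    n = len(x)
    spectrum = []
    for k in range(n):
        xk = sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)) / n
        spectrum.append(abs(xk) ** 2)
    return spectrum

def two_sided_freqs(n, duration):
    """Frequency vector f = (0:N-1)/T for the two-sided representation."""
    return [k / duration for k in range(n)]

def centered(ps, duration):
    """Reorder so that f = 0 is in the middle; for even N, line N/2 goes to the negative side."""
    n = len(ps)
    first_negative = (n + 1) // 2  # first line that is moved to the negative side
    order = list(range(first_negative, n)) + list(range(first_negative))
    freqs = [(k - n) / duration if k >= first_negative else k / duration for k in order]
    return freqs, [ps[k] for k in order]

def one_sided(ps):
    """Fold negative-frequency lines onto positive ones; DC (and N/2 for even N) stay untouched."""
    n = len(ps)
    folded = [ps[0]]
    for k in range(1, n // 2 + 1):
        if n % 2 == 0 and k == n // 2:
            folded.append(ps[k])          # line N/2 has no partner
        else:
            folded.append(ps[k] + ps[n - k])
    return folded

def psd_from_ps(ps, duration):
    """Divide each line by the line spacing 1/T, i.e. multiply by T."""
    return [p * duration for p in ps]

# Example: a 3 Hz cosine with amplitude 2, sampled at 16 Hz for 1 s.
fs = 16.0
n = 16
duration = n / fs
signal = [2.0 * math.cos(2.0 * math.pi * 3.0 * m / fs) for m in range(n)]

ps = power_spectrum(signal)
mean_power = sum(v * v for v in signal) / n
freqs_c, ps_c = centered(ps, duration)

print("sum of spectral lines:", round(sum(ps), 9))
print("mean signal power:    ", round(mean_power, 9))
print("centered range (Hz):  ", freqs_c[0], "to", freqs_c[-1])
```

The printed sum of spectral lines and the mean signal power should agree, which is the property stated in the description.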

digital_audio_processing

This is a homework project for the class 'Signals and Systems'. It implements A-law and μ-law quantization for digital audio processing.
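For readers unfamiliar with the two companding laws, the following is a minimal sketch of the standard continuous A-law and μ-law curves followed by a uniform quantizer. It is not taken from the repository: the function names, the parameter values (μ = 255 and A = 87.6, which are the commonly used ones), and the 8-bit mid-rise quantizer are all assumptions made for illustration. Samples are assumed to be normalized to [-1, 1].

```python
import math

MU = 255.0  # assumed: the commonly used mu-law parameter
A = 87.6    # assumed: the commonly used A-law parameter

def mu_law_compress(x, mu=MU):
    """mu-law compression: sgn(x) * ln(1 + mu|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def a_law_compress(x, a=A):
    """A-law compression: linear below |x| = 1/A, logarithmic above."""
    ax = abs(x)
    denom = 1.0 + math.log(a)
    if ax < 1.0 / a:
        y = a * ax / denom
    else:
        y = (1.0 + math.log(a * ax)) / denom
    return math.copysign(y, x)

def a_law_expand(y, a=A):
    """Inverse of a_law_compress."""
    ay = abs(y)
    denom = 1.0 + math.log(a)
    if ay < 1.0 / denom:
        x = ay * denom / a
    else:
        x = math.exp(ay * denom - 1.0) / a
    return math.copysign(x, y)

def quantize_uniform(y, bits=8):
    """Uniform mid-rise quantizer on [-1, 1] (hypothetical helper)."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(levels - 1, max(0, int(math.floor((y + 1.0) / step))))
    return -1.0 + (index + 0.5) * step

def compand_quantize(x, compress, expand, bits=8):
    """Compress, quantize uniformly, then expand back to the linear domain."""
    return expand(quantize_uniform(compress(x), bits))

# Example: a quiet sample is reproduced more accurately with companding.
quiet = 0.01
print("uniform only:", quantize_uniform(quiet))
print("mu-law:      ", compand_quantize(quiet, mu_law_compress, mu_law_expand))
print("A-law:       ", compand_quantize(quiet, a_law_compress, a_law_expand))
```

The point of compressing before quantizing is that small amplitudes get finer effective steps than large ones, which is why the quiet sample above comes back closer to its original value than with uniform quantization alone.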

digital_signal_processing

A collection of digital signal processing projects, including image and audio processing.

diploma

Wavelet Audio Digital Watermarking

dnn-speech-enhancement-demo-tool

Universal deep neural network (DNN) based speech enhancement demo and tools, with a pre-trained DNN model

emd_hht_tutorial

A tutorial for Time-Frequency estimation using Empirical Mode Decomposition and Hilbert-Huang Transform

esc-50

ESC-50: Dataset for Environmental Sound Classification

hilbert-huang-transform

Hilbert-Huang Transform MATLAB codes

mtf-crnn

Inspired by the convolutional recurrent neural network (CRNN) and Inception, we propose a multiscale time-frequency convolutional recurrent neural network (MTF-CRNN) for audio event detection. Our goal is to improve audio event detection performance and to recognize target audio events that have different lengths and occur against a complex audio background. We exploit multiple groups of parallel and serial convolutional kernels to learn high-level shift-invariant features from the time and frequency domains of acoustic samples. A two-layer bidirectional gated recurrent unit (GRU) based on the recurrent neural network is used to capture the temporal context from the extracted high-level features. The proposed method is evaluated on the DCASE2017 challenge dataset. Compared to other methods, the MTF-CRNN achieves one of the best test performances for a single model without pre-training and without using a multi-model ensemble approach.
