
noisereduce's Introduction


Noise reduction in python using spectral gating

Noisereduce is a noise reduction algorithm in python that reduces noise in time-domain signals like speech, bioacoustics, and physiological signals. It relies on a method called "spectral gating" which is a form of Noise Gate. It works by computing a spectrogram of a signal (and optionally a noise signal) and estimating a noise threshold (or gate) for each frequency band of that signal/noise. That threshold is used to compute a mask, which gates noise below the frequency-varying threshold.

The most recent version of noisereduce comprises two algorithms:

  1. Stationary Noise Reduction: Keeps the estimated noise threshold at the same level across the whole signal
  2. Non-stationary Noise Reduction: Continuously updates the estimated noise threshold over time

Version 3 Updates:

  • Includes a PyTorch-based implementation of Spectral Gating, an algorithm for denoising audio signals.
  • You can now create a noisereduce nn.Module object which allows it to be used either as a standalone module or as part of a larger neural network architecture.
  • The run time of the algorithm decreases substantially.

Version 2 Updates:

  • Added two forms of spectral gating noise reduction: stationary noise reduction, and non-stationary noise reduction.
  • Added multiprocessing so you can perform noise reduction on bigger data.
  • The new version breaks the API of the old version.
  • The previous version is still available via: from noisereduce.noisereducev1 import reduce_noise
  • You can now create a noisereduce object which allows you to reduce noise on subsets of longer recordings

Stationary Noise Reduction

  • The basic intuition is that statistics are calculated on each frequency channel to determine a noise gate. Then the gate is applied to the signal.
  • This algorithm is based on (but does not completely reproduce) the one outlined by Audacity for its noise reduction effect (Link to C++ code)
  • The algorithm takes two inputs:
    1. A noise clip containing prototypical noise of the clip (optional)
    2. A signal clip containing the signal and the noise intended to be removed

Steps of the Stationary Noise Reduction algorithm

  1. A spectrogram is calculated over the noise audio clip
  2. Statistics are calculated over the spectrogram of the noise (in frequency)
  3. A threshold is calculated based upon the statistics of the noise (and the desired sensitivity of the algorithm)
  4. A spectrogram is calculated over the signal
  5. A mask is determined by comparing the signal spectrogram to the threshold
  6. The mask is smoothed with a filter over frequency and time
  7. The mask is applied to the spectrogram of the signal, and the result is inverted back to a time-domain signal

If the noise signal is not provided, the algorithm will treat the signal as the noise clip, which tends to work pretty well.
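
The steps above can be sketched in a few lines of NumPy/SciPy. This is an illustrative approximation, not the noisereduce implementation: the function name `stationary_gate`, the dB statistics, the box-filter smoothing size, and the small epsilon are all assumptions made for the sketch.

```python
import numpy as np
from scipy import signal
from scipy.ndimage import uniform_filter


def stationary_gate(y, sr, y_noise=None, n_std=1.5, n_fft=1024, prop_decrease=1.0):
    """Illustrative stationary spectral gate (hypothetical helper, 1-D input)."""
    hop = n_fft // 4
    stft_kw = dict(fs=sr, nperseg=n_fft, noverlap=n_fft - hop)
    # If no noise clip is given, treat the signal itself as the noise clip
    noise = y if y_noise is None else y_noise

    # Steps 1-3: per-frequency statistics of the noise spectrogram -> threshold
    _, _, noise_stft = signal.stft(noise, **stft_kw)
    noise_db = 20 * np.log10(np.abs(noise_stft) + 1e-10)
    thresh = noise_db.mean(axis=1) + n_std * noise_db.std(axis=1)

    # Step 4: spectrogram of the signal
    _, _, sig_stft = signal.stft(y, **stft_kw)
    sig_db = 20 * np.log10(np.abs(sig_stft) + 1e-10)

    # Step 5: keep (1) bins above the per-frequency threshold, gate (0) the rest
    mask = (sig_db > thresh[:, None]).astype(float)

    # Step 6: smooth the mask over frequency (3 bins) and time (5 frames)
    mask = uniform_filter(mask, size=(3, 5), mode="nearest")

    # Step 7: blend by prop_decrease, apply the mask, invert the STFT
    mask = mask * prop_decrease + (1.0 - prop_decrease)
    _, y_out = signal.istft(sig_stft * mask, **stft_kw)
    return y_out[: len(y)]
```

With `prop_decrease=0.0` the mask becomes all ones, so the sketch returns the input unchanged (up to floating-point error), which is a handy sanity check when experimenting.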

Non-stationary Noise Reduction

  • The non-stationary noise reduction algorithm is an extension of the stationary noise reduction algorithm that allows the noise gate to change over time.
  • When you know the timescale that your signal occurs on (e.g. a bird call can be a few hundred milliseconds), you can set your noise threshold based on the assumption that events occurring on longer timescales are noise.
  • This algorithm was motivated by a recent method in bioacoustics called Per-Channel Energy Normalization.

Steps of the Non-stationary Noise Reduction algorithm

  1. A spectrogram is calculated over the signal
  2. A time-smoothed version of the spectrogram is computed using an IIR filter applied forward and backward on each frequency channel.
  3. A mask is computed based on that time-smoothed spectrogram
  4. The mask is smoothed with a filter over frequency and time
  5. The mask is applied to the spectrogram of the signal, and the result is inverted back to a time-domain signal
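
A rough sketch of these steps, assuming a one-pole IIR low-pass for the time smoothing and a sigmoid for the soft mask. The function name `nonstationary_gate`, the filter coefficient formula, and the smoothing sizes are assumptions chosen for illustration; the library's own implementation may differ in each of these details.

```python
import numpy as np
from scipy import signal
from scipy.ndimage import uniform_filter
from scipy.special import expit


def nonstationary_gate(y, sr, time_constant_s=2.0, thresh_mult=1.0,
                       sigmoid_slope=10.0, n_fft=1024, prop_decrease=1.0):
    """Illustrative non-stationary spectral gate (hypothetical helper, 1-D input)."""
    hop = n_fft // 4
    stft_kw = dict(fs=sr, nperseg=n_fft, noverlap=n_fft - hop)

    # Step 1: spectrogram of the signal
    _, _, sig_stft = signal.stft(y, **stft_kw)
    mag = np.abs(sig_stft)

    # Step 2: one-pole IIR low-pass, run forward and backward along time
    frames_per_tc = time_constant_s * sr / hop
    b = 1.0 - np.exp(-1.0 / frames_per_tc)
    smooth = signal.filtfilt([b], [1.0, b - 1.0], mag, axis=-1, padtype="even")
    smooth = np.maximum(smooth, 1e-10)  # avoid division by zero

    # Step 3: soft mask from how far each bin rises above the slow-moving floor
    excess = (mag - smooth) / smooth
    mask = expit((excess - thresh_mult) * sigmoid_slope)

    # Step 4: smooth the mask over frequency (3 bins) and time (5 frames)
    mask = uniform_filter(mask, size=(3, 5), mode="nearest")

    # Step 5: blend by prop_decrease, apply the mask, invert the STFT
    mask = mask * prop_decrease + (1.0 - prop_decrease)
    _, y_out = signal.istft(sig_stft * mask, **stft_kw)
    return y_out[: len(y)]
```

In this sketch, anything that stays at a constant level for much longer than `time_constant_s` sits close to the smoothed floor and is gated, while short events that rise well above the floor pass through.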

Installation

pip install noisereduce

Usage

See example notebook: Open In Colab
Parallel computing example: Open In Colab

reduce_noise

Simplest usage

from scipy.io import wavfile
import noisereduce as nr
# load data
rate, data = wavfile.read("mywav.wav")
# perform noise reduction
reduced_noise = nr.reduce_noise(y=data, sr=rate)
wavfile.write("mywav_reduced_noise.wav", rate, reduced_noise)

Arguments to reduce_noise

  y : np.ndarray [shape=(# frames,) or (# channels, # frames)], real-valued
      input signal
  sr : int
      sample rate of input signal / noise signal
  y_noise : np.ndarray [shape=(# frames,) or (# channels, # frames)], real-valued
      noise signal to compute statistics over (only for stationary noise reduction).
  stationary : bool, optional
      Whether to perform stationary, or non-stationary noise reduction, by default False
  prop_decrease : float, optional
      The proportion to reduce the noise by (1.0 = 100%), by default 1.0
  time_constant_s : float, optional
      The time constant, in seconds, to compute the noise floor in the non-stationary
      algorithm, by default 2.0
  freq_mask_smooth_hz : int, optional
      The frequency range to smooth the mask over in Hz, by default 500
  time_mask_smooth_ms : int, optional
      The time range to smooth the mask over in milliseconds, by default 50
  thresh_n_mult_nonstationary : int, optional
      Only used in nonstationary noise reduction, by default 1
  sigmoid_slope_nonstationary : int, optional
      Only used in nonstationary noise reduction, by default 10
  n_std_thresh_stationary : float, optional
      Number of standard deviations above mean to place the threshold between
      signal and noise, by default 1.5
  tmp_folder : str, optional
      Temp folder to write waveform to during parallel processing. Defaults to
      the default temp folder for python, by default None
  chunk_size : int, optional
      Size of signal chunks to reduce noise over. Larger sizes
      will take more space in memory, smaller sizes can take longer to compute,
      by default 60000
  padding : int, optional
      How much to pad each chunk of signal by. Larger pads are
      needed for larger time constants, by default 30000
  n_fft : int, optional
      length of the windowed signal after padding with zeros.
      The number of rows in the STFT matrix ``D`` is ``(1 + n_fft/2)``.
      For reference, ``n_fft=2048`` samples corresponds to a physical
      duration of 93 milliseconds at a sample rate of 22050 Hz, i.e. the
      default sample rate in librosa, a value well adapted for music
      signals. In speech processing, the recommended value is 512,
      corresponding to 23 milliseconds at a sample rate of 22050 Hz.
      In any case, we recommend setting ``n_fft`` to a power of two for
      optimizing the speed of the fast Fourier transform (FFT) algorithm,
      by default 1024
  win_length : int, optional
      Each frame of audio is windowed by ``window`` of length ``win_length``
      and then padded with zeros to match ``n_fft``.
      Smaller values improve the temporal resolution of the STFT (i.e. the
      ability to discriminate impulses that are closely spaced in time)
      at the expense of frequency resolution (i.e. the ability to discriminate
      pure tones that are closely spaced in frequency). This effect is known
      as the time-frequency localization trade-off and needs to be adjusted
      according to the properties of the input signal ``y``.
      If unspecified, defaults to ``win_length = n_fft``, by default None
  hop_length : int, optional
      number of audio samples between adjacent STFT columns.
      Smaller values increase the number of columns in ``D`` without
      affecting the frequency resolution of the STFT.
      If unspecified, defaults to ``win_length // 4``, by default None
  n_jobs : int, optional
      Number of parallel jobs to run. Set at -1 to use all CPU cores, by default 1
  torch_flag: bool, optional
      Whether to use the torch version of spectral gating, by default False
  device: str, optional
      A device to run the torch spectral gating on, by default "cuda"
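
The window durations quoted for n_fft follow directly from n_fft / sr. The helper names below are made up for this worked example; the arithmetic itself comes from the parameter descriptions above.

```python
def window_duration_ms(n_fft: int, sr: int) -> float:
    """Physical duration of one STFT window, in milliseconds."""
    return 1000.0 * n_fft / sr


def n_freq_bins(n_fft: int) -> int:
    """Number of rows in the STFT matrix: 1 + n_fft/2."""
    return 1 + n_fft // 2


def default_hop_length(win_length: int) -> int:
    """Hop used when hop_length is unspecified: win_length // 4."""
    return win_length // 4


# At librosa's default sample rate of 22050 Hz:
music_ms = window_duration_ms(2048, 22050)   # about 93 ms
speech_ms = window_duration_ms(512, 22050)   # about 23 ms
default_ms = window_duration_ms(1024, 22050)  # about 46 ms for the default n_fft
```

The same n_fft covers a shorter stretch of audio at higher sample rates, so a value tuned at 22050 Hz may need to be scaled when working with 44100 Hz or 48000 Hz recordings.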

Torch

See example notebook: Open In Colab

Simplest usage

import torch
from noisereduce.torchgate import TorchGate as TG
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

# Create TorchGating instance
tg = TG(sr=8000, nonstationary=True).to(device)

# Apply Spectral Gate to noisy speech signal
noisy_speech = torch.randn(3, 32000, device=device)
enhanced_speech = tg(noisy_speech)

Arguments

Parameter | Description
--- | ---
sr | Sample rate of the input signal.
n_fft | The size of the FFT.
hop_length | The number of samples between adjacent STFT columns.
win_length | The window size for the STFT. If None, defaults to n_fft.
freq_mask_smooth_hz | The frequency smoothing width in Hz for the masking filter. If None, no frequency masking is applied.
time_mask_smooth_ms | The time smoothing width in milliseconds for the masking filter. If None, no time masking is applied.
n_std_thresh_stationary | The number of standard deviations above the noise mean to consider as signal for stationary noise.
nonstationary | Whether to use non-stationary noise masking.
n_movemean_nonstationary | The number of frames to use for the moving average in the non-stationary noise mask.
n_thresh_nonstationary | The multiplier to apply to the sigmoid function in the non-stationary noise mask.
temp_coeff_nonstationary | The temperature coefficient to apply to the sigmoid function in the non-stationary noise mask.
prop_decrease | The proportion of decrease to apply to the mask.

Choosing between stationary and non-stationary noise reduction

I discuss stationary and non-stationary noise reduction in this paper.

Figure caption: Stationary and non-stationary spectral gating noise reduction. (A) An overview of each algorithm. Stationary noise reduction typically takes in an explicit noise signal to calculate statistics and performs noise reduction over the entire signal uniformly. Non-stationary noise reduction dynamically estimates and reduces noise concurrently. (B) Stationary and non-stationary spectral gating noise reduction using the noisereduce Python package (Sainburg, 2019) applied to a Common chiffchaff (Phylloscopus collybita) song (Stowell et al., 2019) with an airplane noise in the background. The bottom frame depicts the difference between the two algorithms.

Citation

If you use this code in your research, please cite it:


@article{sainburg2020finding,
  title={Finding, visualizing, and quantifying latent structure across diverse animal vocal repertoires},
  author={Sainburg, Tim and Thielk, Marvin and Gentner, Timothy Q},
  journal={PLoS computational biology},
  volume={16},
  number={10},
  pages={e1008228},
  year={2020},
  publisher={Public Library of Science}
}

@software{tim_sainburg_2019_3243139,
  author       = {Tim Sainburg},
  title        = {timsainb/noisereduce: v1.0},
  month        = jun,
  year         = 2019,
  publisher    = {Zenodo},
  version      = {db94fe2},
  doi          = {10.5281/zenodo.3243139},
  url          = {https://doi.org/10.5281/zenodo.3243139}
}



Project based on the cookiecutter data science project template. #cookiecutterdatascience

noisereduce's People

Contributors

fardage, inarighas, kareemamrr, lff5, nuniz, rjfarber, timsainb


noisereduce's Issues

noisereduce documentation typo

There is a typo on the noisereduce documentation page at https://pypi.org/project/noisereduce/

The mask is appled to the spectrogram of the signal, and is inverted If the noise signal is not provided, the algorithm will treat the signal as the noise clip, which tends to work pretty well

appled should say "applied"

Documentation of `y_noise` arg may be misleading

The docstring of the y_noise argument says:

y_noise : np.ndarray [shape=(# frames,) or (# channels, # frames)], real-valued noise signal to compute statistics over (only for non-stationary noise reduction).

But actually, an array of stationary noise can be passed to the function and is processed correctly (when stationary==True). Am I right?

MemoryError

I'm on Windows 10 in a Jupyter environment. The audio file is 30 minutes long, so I cut it into 10-second chunks; running noise reduction on the first chunk (chunk0) raised a MemoryError.

Is the file still too large? I followed this link:
https://colab.research.google.com/github/timsainb/noisereduce/blob/master/notebooks/1.0-test-noise-reduction.ipynb#scrollTo=E5UkLtmT3xy3, where the sample is only four seconds long.

Or would parameter tuning help?

from pydub import AudioSegment
from pydub.utils import make_chunks
import soundfile as sf
import noisereduce as nr

myaudio = AudioSegment.from_file("myaudio.wav", "wav")
chunk_length_ms = 10000  # pydub works in milliseconds
chunks = make_chunks(myaudio, chunk_length_ms)  # split into 10-second chunks

# Export all of the individual chunks as wav files
for i, chunk in enumerate(chunks):
    chunk_name = "chunk{0}.wav".format(i)
    print("exporting", chunk_name)
    chunk.export(chunk_name, format="wav")

data, rate = sf.read("chunk0.wav")

reduced_noise = nr.reduce_noise(y=data, sr=rate, n_std_thresh_stationary=1.5, stationary=True)

MemoryError: Unable to allocate 197. GiB for an array with shape (441000, 60002) and data type float64
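
The size in the error message can be checked by hand, and the shape hints at a possible cause. A first dimension of 441000 equals 10 s at 44100 Hz, which suggests (an assumption, not confirmed in this thread) that a stereo array laid out as (frames, channels), the layout soundfile returns, was interpreted as 441000 channels, since reduce_noise documents (# channels, # frames). The arrays below are stand-ins, not the reporter's data.

```python
import numpy as np

# Size of the array from the error message: rows * cols * 8 bytes per float64
shape = (441000, 60002)
n_bytes = shape[0] * shape[1] * np.dtype(np.float64).itemsize
gib = n_bytes / 2**30  # roughly 197 GiB, matching the message

# Stand-in for a 10-second stereo clip as returned by soundfile: (frames, channels)
data = np.zeros((441000, 2), dtype=np.float32)

# Reorder to the documented (channels, frames) layout before denoising
data_cf = data.T
print(f"{gib:.1f} GiB requested; transposed shape is {data_cf.shape}")
```

If the clip is in fact stereo, passing `data_cf` (or a mono downmix) instead of `data` would be the first thing to try.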

noise reduction produce echo in the output wav

Hi all,
I tried the reduce_noise function on a noisy clip, using a sample of the noisy part of the original clip as the noise. The output includes echo, and the voice sounds different from the one in the original clip.
Any help or fix for this issue?

thanks

use_tqdm does the opposite

Not very disturbing, but when I use the function nr.reduce_noise(), by default I see the progress bar, and if I set use_tqdm to True I don't see it.

How can I save the reduced noise in a wav file??

I listen and plot the reduced noise, but how can I save the reduced noise in a wav file??

I save like this:

librosa.output.write_wav('./prueba/audio_reduce1.wav', reduced_noise, sr)

But the wav file is noise.

Help me please!
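
One possible approach, sketched here as an assumption about the cause rather than a confirmed diagnosis: a file that plays back as noise often results from a mismatch between the sample dtype or range and what the writer expects. Note also that librosa.output.write_wav was removed in librosa 0.8. The helper name `save_wav_int16` is made up for this sketch; it uses scipy.io.wavfile.write, as in the "Simplest usage" example, and assumes floating-point samples in [-1, 1].

```python
import numpy as np
from scipy.io import wavfile


def save_wav_int16(path, y, sr):
    """Write float audio in [-1, 1] as 16-bit PCM (hypothetical helper)."""
    y = np.clip(np.asarray(y, dtype=np.float64), -1.0, 1.0)  # guard against overflow
    wavfile.write(path, int(sr), np.round(y * 32767).astype(np.int16))
```

Usage would look like `save_wav_int16("./prueba/audio_reduce1.wav", reduced_noise, sr)`. If the input to reduce_noise was int16, the output may already be on the int16 scale, in which case it should not be rescaled this way.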

Noise in real file

Hi, it's a great contribution, but I have a question: what if you only have one audio file and no separate file for the noise?
This is a typical problem when you want to remove noise from a real recording.

Query

I want to remove environmental sounds from multiple files but of different lengths. Does this process require noise.wav to be of the exact same length as that of the original .wav file?

GPU support?

Hello!,

I am trying to use noisereduce's reduce_noise function to clean some audio data. When I use a GPU (V100), some of the cleaned audio arrays come back as all NaN. This does not happen when I run the code on CPU. Any idea of how I might be able to fix this? Thanks!

Loss of audio volume & clarity with noise

I tested this on multiple audio files (16kHz telephony audio) with prop_decrease ranging from (0,0.1,0.5,0.75,1)

My 3 observations were:

  1. It does a decent job of removing consistent noise like windy audio.
  2. Not so good with noise bursts
  3. The final sample often has too low of a volume and isn't clear for additional processing (like transcription)

Could you help me identify if there's something wrong with my use of it or if there are any improvements that can be made.

(Also, I got floating point expected error so instead of using wavfile.read(), I used librosa.load(filepath,sr=None))

how to tune parameters for industrial audio data?

I am highly interested in this project and am currently stuck on my academic research. I am looking for a model to distinguish whether my specific machine is in good or bad condition, and the working sound of my machines is the important feature I need to use (it is easy to get). I am wondering, could I tune this model for my work? Thanks for your attention.

Applying noise reduction on pyaudio stream

Hello,
First of all, thank you for your great module.

I'm trying to apply real-time noise reduction on incoming audio stream

Settings on stream opening:

  • Mono , sampling_rate = 16kHz, frames_per_buffer= 16000

Settings on stream reading:

  • stream.read(16000, exception_on_overflow = False)

The problem that I'm facing is that a periodical "fan spinning" sound appears, after actively applying noise reduction (while loop), but this sound does not appear on a normal 5 second recording with noise reduction on the np.int16 array afterwards.

What is different is that in the first case (active-ish noise reduction), I append the sound data for each iteration,after noise reduction, whereas in the second case I record for 5 seconds and THEN apply the noise reduction on the whole set of data.

I'm uploading example wavs to give you a better perspective:

normal_case_wavs.zip
active_case_wavs.zip

P.S I noticed that by changing the number of frames I read from the stream buffer, the frequency of this sound changes too. Could this be some kind of edge case where this sound indicates the change of "Sound CHUNK" I am processing (appending to the list for future .wav write) ?

Crucial part of code is here:

stream= p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=16000)

for i in range(0, int(16000 / 16000 * 5)):
    data = stream.read(16000)    
    sound_data_npint16 = np.hstack(np.fromstring(data, dtype=np.int16))
    noisy_frames.append(sound_data_npint16)

    sound_data_float = np.ndarray.astype(sound_data_npint16,float)/32768
    reduced_noise_float = nr.reduce_noise(audio_clip=sound_data_float, noise_clip=noise, verbose=False, n_fft=4096, n_std_thresh=1, pad_clipping=True) #Tried both pad_clipping=True/False
    reduced_noise_npint16 = np.ndarray.astype(np.iinfo(np.int16).max*reduced_noise_float,dtype=np.int16)

    denoised_active_frames.append(reduced_noise_npint16)

total_noisy_frames = np.hstack(noisy_frames) # noisy frames gathered
total_noisy_frames_float = np.ndarray.astype(total_noisy_frames,float)/32768
reduced_total_noisy_frames_float = nr.reduce_noise(audio_clip=total_noisy_frames_float, noise_clip=noise, verbose=False, n_fft=4096, n_std_thresh=1, pad_clipping=True)
reduced_total_noisy_frames_npint16 = np.ndarray.astype(np.iinfo(np.int16).max*reduced_total_noisy_frames_float,dtype=np.int16)

# noise wav comes from  "noisy_frames"
# actively denoised wav comes from "denoised_active_frames"
# denoised wav after 5seconds recording comes from "reduced_total_noisy_frames_npint16"

Fix Import Errors

After installing this library for Python 2.7, I found the following issues with file imports.

  1. In the __init__.py file, the first line, i.e. from noisereduce.noisereduce, just needs to be from noisereduce import reduce_noise
  2. In the noisereduce.py file, the 4th line, i.e. the from noisereduce.plotting import, simply needs to be from plotting import plot_reduction_steps.

I cannot create a new branch and push the code to fix this issue. If you get the time, can you please fix these 2 import issues?

Thanks.

MemoryError Traceback (most recent call last)

MemoryError Traceback (most recent call last)
MemoryError: Unable to allocate 85.3 GiB for an array with shape (190747, 60002) and data type float64

I tried the simple usage exactly as in README.md, but I get this error.
My audio is only 4.3 seconds long.

Thank you

hop_length not working

If I add the hop_length parameter to the reduce_noise function, it gives an error:

import numpy as np
import noisereduce as nr

# This works
nfft = 512
fs = 8000
signal = np.loadtxt('signal.csv')
nr.reduce_noise(y=signal, sr=fs, n_fft=nfft, win_length=nfft)

# This does not work
nfft = 512
fs = 8000
signal = np.loadtxt('signal.csv')
nr.reduce_noise(y=signal, sr=fs, n_fft=nfft, win_length=nfft, hop_length=100)

I tried with different values and it always gives the same error:

AttributeError: 'SpectralGateNonStationary' object has no attribute '_hop_length'

signal.csv

Feature Request or Advice

Do you have a method for creating a common background noise file? I have a bunch of wav files that I would like to extract the common background noise from them instead of guessing. If not have you thought about adding this into your package?

Thanks for any help. I'm not experienced at GitHub posting so I hope this is an ok way to send ideas or requests.

Option to return denoised spectrum and mask

For the nonstationary algorithm, options to return sig_stft_denoised and sig_mask could be super useful for a few use cases

  1. If noisereduce is being incorporated into an audio processing pipeline, returning sig_stft_denoised could be used for further spectral methods, rather than risking loss of info converting from spectrum to signal to spectrum again.
  2. When trying to get the parameters right for challenging cases, getting back the sig_mask would allow for visual inspection of the effect of each parameter change.

bug: Masked signal isn't calculated correctly which causes distortion

If prop_decrease=0, the recovered_signal returned by the reduce_noise function should be the same as the input signal. This doesn't happen.
The reason is a bug in the mask_signal function. In line 113, sig_stft_db_masked is calculated and later treated as if it contained only the masked real part of the STFT. This is wrong, since sig_stft_db contains the magnitude of the STFT (real and imaginary), which means that sig_stft_db_masked cannot contain only the real part.
This bug causes the real part of sig_stft_amp to be too large, which causes some sort of distortion.
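
The expected behaviour can be illustrated with a small sketch: if the mask is blended toward 1 by prop_decrease and then multiplied into the complex STFT, the phase is preserved and prop_decrease=0 is an exact identity. The function name `apply_gate` is made up for this illustration; it is one way to satisfy the property described above, not necessarily how the library does it.

```python
import numpy as np


def apply_gate(sig_stft, mask, prop_decrease):
    """Scale a complex STFT by a real mask blended toward 1 (hypothetical helper)."""
    # prop_decrease=1 -> full mask; prop_decrease=0 -> all ones (no change)
    blended = mask * prop_decrease + (1.0 - prop_decrease)
    # Multiplying the complex values scales magnitude and leaves phase untouched
    return sig_stft * blended


rng = np.random.default_rng(0)
sig_stft = rng.standard_normal((5, 8)) + 1j * rng.standard_normal((5, 8))
mask = rng.uniform(0.1, 1.0, size=(5, 8))

unchanged = apply_gate(sig_stft, mask, prop_decrease=0.0)
gated = apply_gate(sig_stft, mask, prop_decrease=1.0)
```

Because real and imaginary parts are scaled by the same factor, there is no need to treat the real part separately.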

RAM getting crashed for a 4 second audio

Hi, I have been trying to reduce the stationary background noise in a 4-second audio signal, but the RAM crashes on Google Colab.
It gives the user warning "n_fft=1024 is too small for input signal of length=2".
The sampling frequency of the audio signal is 44100 Hz, and the file is a 16-bit wav file.
I have attached the audio signal here, converted to mp4 format because GitHub wasn't allowing me to upload a .wav file.
Thanks

audio_cutter.mp4

Frames are being removed

I would like to congratulate on your great work, the library works as promised.
But there is a small issue which i found.
When I load the file with librosa.load and give that array as input to reduce_noise, the length of the array changes. What is the reason behind this? For a certain project of mine I am trying to use this library on small clips of barely 1 second, and if such frames keep vanishing I will lose a lot of important data. What is the workaround for this, and why is the number of frames being reduced?

Capture

Frequency shift after noiseReduction

Hi, I've been using the noise reduction algorithm to standardize the noise before feeding the signal into a deep learning model. The issue I found while doing the noise reduction is a slight frequency shift to lower values in the signal.

Original:
image

Reduced noise:
image

The shift becomes evident when looking at the colors near the 4096 frequencies on the mel spectrogram.

My Question is, How can I avoid this shift to lower frequencies? (It happens when reducing Stationary or Non-stationary noise)

Thank you

use_tqdm parameter uses tqdm when False

Hi, it seems use_tqdm=False leads to the use of tqdm, while use_tqdm=True leads to no use of tqdm. In the source code:
tqdm(pos_list, disable=self.use_tqdm)
So use_tqdm=False means disable=False so enable is True...

GPU based error /

Hi. It seems that the new module is GPU-only?
Or am I making a mistake? It gives this error on a CPU-only machine.
Please see the error below and help.

----> 1 import noisereduce as nr

~/anaconda3/envs/server/lib/python3.6/site-packages/noisereduce/__init__.py in <module>
----> 1 from noisereduce.noisereduce import reduce_noise

~/anaconda3/envs/server/lib/python3.6/site-packages/noisereduce/noisereduce.py in <module>
     10
     11 print(
---> 12     "GPUs available: {}".format(tf.config.experimental.list_physical_devices("GPU"))
     13 )
     14 if int(tf.__version__[0]) < 2:

AttributeError: module 'tensorflow' has no attribute 'config'

check noise

Can we check whether or not there is noise in a wav file?

code fails whilst performing STFT on signal "Killed: 9"

Code:

import noisereduce as nr
import scipy.io.wavfile as RE

# load data
rate, data = RE.read("sample.wav")
data = data/1.0

# select section of data that is noise
noisy_part = data[0:37526917]
noisy_part = noisy_part.sum(axis=1) / 2

# perform noise reduction
reduced_noise = nr.reduce_noise(audio_clip=data, noise_clip=noisy_part, verbose=True)
RE.write('clean_audio.wav', rate, reduced_noise)

Code fails during STFT and prints "Killed: 9"

Never completes.

Hi -

Even when using the "Simplest usage" code, it never returns, until it throws a signal 10+ minutes later, on a twenty-second wav file.

Does this require linux? (I'm on a mac), or some particular version of python? (I'm using 3.10).

Currently, it does not work for me, at all.

How do we compute the residual signal

Hi @timsainb thanks for making the repo public.
I was trying to separate a given audio recording into speech and residual noise in the non-stationary case. I couldn't find a clean way to obtain the residual noise. Any suggestions?
According to the code the denoised version is obtained as sig_stft_denoised = sig_stft * sig_mask. I tried to obtain the residual as sig_stft_residual = sig_stft * (1-sig_mask) but this doesn't seem to behave as expected since sig_mask values are not always bound to [0, 1]
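
One way to get a residual that always adds back to the original, regardless of whether the mask stays within [0, 1], is to subtract the denoised spectrogram from the input instead of multiplying by (1 - sig_mask) after any clipping. The variable names mirror those quoted above, but the helper `split_denoised_residual` and the random test data are purely illustrative.

```python
import numpy as np


def split_denoised_residual(sig_stft, sig_mask):
    """Return (denoised, residual) with denoised + residual == sig_stft."""
    sig_stft_denoised = sig_stft * sig_mask
    # Defining the residual by subtraction keeps the decomposition exact
    sig_stft_residual = sig_stft - sig_stft_denoised
    return sig_stft_denoised, sig_stft_residual


rng = np.random.default_rng(1)
sig_stft = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
sig_mask = rng.uniform(-0.2, 1.3, size=(4, 6))  # deliberately not bounded to [0, 1]

denoised, residual = split_denoised_residual(sig_stft, sig_mask)
```

The same idea works in the time domain without touching internals: subtracting the output of reduce_noise from the input signal gives the removed component, provided both arrays have the same length and scale.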

Noise Reduce VST

It would be very interesting if we could have this project in a VST plugin, because in this case it would be possible to filter in real time, not just for a specific file. Have you guys thought about it?

Amplifying noise

First thanks @timsainb for porting this into python!

Unfortunately I have issues on some files for which the noise is actually amplified.

With the default parameters, this is an example of result that I get https://drive.google.com/open?id=1By4_l1kM9s6K013j6iRwQNsVx3Rvu6Nv

I have tried to play with the parameters but it didn't help. Even setting the noise to zero or prop_decrease to zero produces similar additional noise.

Please let me know if it is expected, otherwise I will dig into the code. Thanks!


For reference, this is my code:

import librosa
from noisereduce import reduce_noise


file_path = "raw.wav"

s, e = 33.99000000000053, 35.520000000000586

a, sr = librosa.load(file_path, sr=None)

i_s = int(sr*s)
i_e = int(sr*e)
a_noise = a[i_s:i_e]
a_clean = reduce_noise(a, a_noise, prop_decrease=0)


librosa.output.write_wav("noise.wav", a_noise, sr)
librosa.output.write_wav("denoised.wav", a_clean, sr)

I have a problem Please help me

Hello! I will tell you my problem and please help me.
I am using Bluetooth headphones which I wear inside of a helmet. There is usually noise when recording any wav file inside of python. I thought about your library as I was thinking about removing the noise from the recorded wave file. The noise is caused by Bluetooth connection and the noise created by the car. I did the following, I recorded a 2-second wav file without any speech. This file mainly used to record the noise in the environment I am sitting in. Then, I recorded another file with me talking in the same environment for 2 seconds.

rate, data = read("noise.wav")  # the 2 second noise without any speaking
rate2, data2 = read("testing3.wav")  # the 2 second noise + me talking
noisy_part = data[:]
reduced_noise = nr.reduce_noise(audio_clip=data, noise_clip=noisy_part, verbose=False)
write("lili.wav", frequency, reduced_noise)  # saving the data into a new wav file

I keep getting this error. I have a deadline tomorrow and I hope you could help me find out the problem in this error. I was wondering if you might have any ideas whether the method I am doing to remove the noise from the 2-second wave file will work or not. If you have any better methods I would be very thankful. I tried using high pass low pass bandwidth mean filter separately but the results are not that promising. It would great if you could give me your advice.
This is a link to the code and the files
https://drive.google.com/open?id=1J9abNuhtJpI01pm-LIKFg8zvS2JBR3rK
The setting of the wav file is
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = 2

File "test.py", line 101, in <module>
    reduced_noise = nr.reduce_noise(audio_clip=data, noise_clip=noisy_part, verbose=False)
File "/home/mostafahaggag/.local/lib/python3.5/site-packages/noisereduce/noisereduce.py", line 123, in reduce_noise
    noise_stft = _stft(noise_clip, n_fft, hop_length, win_length)
File "/home/mostafahaggag/.local/lib/python3.5/site-packages/noisereduce/noisereduce.py", line 9, in _stft
    return librosa.stft(y=y, n_fft=n_fft, hop_length=hop_length, win_length=win_length)
File "/home/mostafahaggag/.local/lib/python3.5/site-packages/librosa/core/spectrum.py", line 161, in stft
    util.valid_audio(y)
File "/home/mostafahaggag/.local/lib/python3.5/site-packages/librosa/util/utils.py", line 159, in valid_audio
    raise ParameterError('data must be floating-point')
librosa.util.exceptions.ParameterError: data must be floating-point
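
The "data must be floating-point" error arises because a 16-bit wav file (FORMAT = pyaudio.paInt16 above) is read as integer samples, while the STFT call in the traceback expects floats. A minimal conversion sketch follows; the helper name `to_float32` is made up for illustration, and scaling to [-1, 1) is a common convention rather than a requirement stated in this thread.

```python
import numpy as np


def to_float32(x):
    """Convert signed-integer PCM samples to float32 in [-1, 1) (hypothetical helper)."""
    x = np.asarray(x)
    if np.issubdtype(x.dtype, np.floating):
        return x.astype(np.float32)
    if np.issubdtype(x.dtype, np.signedinteger):
        # int16 -> divide by 32768, int32 -> divide by 2**31, and so on
        return x.astype(np.float32) / float(np.iinfo(x.dtype).max + 1)
    raise TypeError(f"unsupported sample dtype: {x.dtype}")


pcm = np.array([-32768, 0, 16384], dtype=np.int16)
as_float = to_float32(pcm)
```

Both the audio clip and the noise clip would need the same conversion before being passed to reduce_noise, and the result should be scaled back to int16 if the output file is meant to be 16-bit PCM.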

Disabled TQDM is still on

Hello, I have suspicion that the TQDM disabling does not work properly when another tqdm progress bar is already in use

image

and the used function is:

import torch
import torchaudio
import noisereduce as nr

def create_spectrogram(fname, reduce_noise: bool = False):
    waveform, sample_rate = torchaudio.load(fname, normalize=True)
    waveform = waveform[0]
    if reduce_noise:
        waveform = torch.tensor(nr.reduce_noise(y=waveform, sr=sample_rate, win_length=256, use_tqdm=False, n_jobs=-1))
    transform = torchaudio.transforms.Spectrogram(n_fft=3600, win_length=256)
    spectrogram = transform(waveform)
    return torch.log(spectrogram).numpy()

ParameterError

import pydub
from pydub import AudioSegment as am
import noisereduce as nr
from scipy.io import wavfile
import numpy as np

rate, sound1 = wavfile.read("test with noise.wav")
sound1 = sound1.astype(np.float32)

noisy_part = sound1[10000:15000]

reduced_noise = nr.reduce_noise(audio_clip=sound1, noise_clip=noisy_part, verbose=True)

Above is the simple code I used. When I run it, I get the following error:
ParameterError: Invalid shape for monophonic audio: ndim=2, shape=(5000, 2)

I changed sound1 to a float because I encountered an error that asked me to change the variable to float.
The size of sound1 is (660672, 2)
The size of noisy_part is (5000, 2)
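
The shapes reported above, (660672, 2) and (5000, 2), are (frames, channels), which is how scipy.io.wavfile.read returns stereo audio, whereas the error asks for monophonic input. Two common ways to reshape such data are sketched below with a zero-filled stand-in array; which one is appropriate depends on the noisereduce version in use, so treat this as an assumption to verify.

```python
import numpy as np

# Stand-in for the stereo array returned by wavfile.read: (frames, channels)
sound1 = np.zeros((660672, 2), dtype=np.float32)

# Option 1: downmix to mono by averaging the two channels
mono = sound1.mean(axis=1)
noisy_part = mono[10000:15000]

# Option 2: keep both channels but reorder to (channels, frames),
# the layout documented for the `y` argument of reduce_noise
channels_first = sound1.T
noisy_part_cf = channels_first[:, 10000:15000]

print(mono.shape, noisy_part.shape, channels_first.shape, noisy_part_cf.shape)
```

Note that the noise slice has to be taken along the frame axis in either layout; slicing `sound1[10000:15000]` only works as intended while the array is still (frames, channels).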
