
lcav / pyroomacoustics

1.4K stars · 44 watchers · 421 forks · 93.83 MB

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Home Page: https://pyroomacoustics.readthedocs.io

License: MIT License

Languages: Python 88.84%, C++ 10.84%, Cython 0.25%, Dockerfile 0.07%
Topics: acoustics, room-impulse-response, image-source-model, beamforming, doa, adaptive-filtering, stft, audio

pyroomacoustics's People

Contributors: 4bian, 645775992, cyrilcadoux, duembgen, ebezzam, fakufaku, grantgasser, hrosseel, hutauf, ibebrett, jazcarretao, maldil, marek-obuchowicz, mattpitkin, mori97, nimo139, noahmfoster, oleg-alexandrov, orchidas, satokiogiso, satvik-dixit, sekiguchi92, taishi-n, womac


pyroomacoustics's Issues

Speed of sound doesn't seem to be constant

Hi,

I am currently using pyroomacoustics to create delayed and attenuated mixtures for blind source separation. It is an awesome tool. However, I have a small problem: the speed of sound doesn't seem to be constant in pyroomacoustics.

How to reproduce:

CODE:

fs, s1 = wavfile.read('/home/felix/Simulation salle/Test reberv = 0,2 et 5 images/ref_1200Hz_pure.wav')

# room dimension
room_dim = [5, 5]

# Create the shoebox
shoebox = pra.ShoeBox(
    room_dim,
    absorption=0.999,
    fs=fs,
    max_order=0,
    )

# source and mic locations
shoebox.add_source([2, 3], signal=s1)

R = np.array([[2,2],[2,1]])
bf = pra.MicrophoneArray(R,shoebox.fs)
shoebox.add_microphone_array(bf)

# run ism
shoebox.simulate()

shoebox.plot()
plt.grid()
plt.show()

audio_reverb = shoebox.mic_array.to_wav('/home/felix/test_reverb.wav', norm=True, bitdepth=np.int16)

ILLUSTRATIONS:

[Screenshots: the .wav file used, and the signals recorded by pyroomacoustics]

Although the distance between the source and the first mic is d1 = 1 m and to the second mic is d2 = 2 m, the delay I found didn't double (see the screenshot from Audacity). Why is that? It seems wrong to me. (Note that in my example there is no reverberation involved, only delays.)

Thank you!

EDIT: One possibility would be that pyroomacoustics somehow introduces a slight delay before my source starts emitting. That would make more sense.
EDIT 2: That's actually it. I looked at, e.g., the delay of arrival of a peak between the two mics, and it is exactly the time needed to propagate through 1 m. This issue helped too (#23). Sorry for posting too fast... Thanks again, your tool does help me a lot!

Different sampling rates for audio sources

Currently the sampling rate is defined by the room - what is the recommended way of dealing with different sampling rates in my audio sources? Resample them beforehand?
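For what it's worth, resampling beforehand with SciPy is straightforward; a minimal sketch, assuming a hypothetical 44.1 kHz source file and a 16 kHz room:

import numpy as np
from math import gcd
from scipy.io import wavfile
from scipy.signal import resample_poly

room_fs = 16000                             # sampling rate defined by the room
fs_src, s = wavfile.read("source_44k.wav")  # hypothetical 44.1 kHz recording

if fs_src != room_fs:
    g = gcd(room_fs, fs_src)
    # polyphase resampling to the room rate (includes anti-alias filtering)
    s = resample_poly(s.astype(np.float64), room_fs // g, fs_src // g)

# s can now be passed to room.add_source(position, signal=s)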

DOA Azimuth and Colatitude angles in reference to

When I run any of the DOA algorithms, I get a reconstructed azimuth and colatitude angle. What exactly is the reference for these angles? Is it the center of my microphone array?

Let's give an example.

Say I have two microphones 0.05 m apart. I run the simulation, perform DOA, and get an azimuth of -78 degrees and a colatitude of 89 degrees. With respect to what are these angles measured? Is it the center of the two microphones?
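For a sanity check, a minimal sketch of how the ground-truth angles can be computed with NumPy, assuming the reference is the geometric center of the array (the angles are far-field directions, so translating the array does not change them):

import numpy as np

R = np.array([[0.025, -0.025],   # mic x coordinates (0.05 m apart)
              [0.000,  0.000],   # mic y coordinates
              [0.000,  0.000]])  # mic z coordinates
source = np.array([1.0, 2.0, 1.5])  # hypothetical source position

center = R.mean(axis=1)
v = source - center
azimuth = np.arctan2(v[1], v[0])                  # angle in the x-y plane
colatitude = np.arccos(v[2] / np.linalg.norm(v))  # angle from the z axis
print(np.degrees(azimuth), np.degrees(colatitude))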

Make matplotlib imports optional throughout the library

When matplotlib is used on a system without a display (e.g. servers or compute nodes), using any backend other than 'Agg' will cause a crash.

The solution is to limit matplotlib imports to the calls that actually plot something.
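A minimal sketch of the deferred-import pattern (a hypothetical method, not the library's actual code):

def plot(self):
    # import only when a plot is actually requested, so headless machines
    # never trigger a backend error at package import time
    try:
        import matplotlib.pyplot as plt
    except ImportError:
        import warnings
        warnings.warn("matplotlib is required for plotting")
        return
    fig, ax = plt.subplots()
    # ... draw the room, sources, and microphones here ...
    return fig, ax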

Frequency-dependent absorption coefficients

In the Pyroomacoustics paper, an assumption is made on the absorption coefficients of the walls:

"The model is accurate only as long as the wavelength of the sound is small relative to the size of the reflectors, which it assumes to be uniformly absorbing across frequencies."

  • On what do you base the assumption that absorption is uniform across frequencies? Tables of absorption coefficients for different construction materials give frequency-specific values.

  • Down to what wavelength-to-reflector ratio is the model accurate? Our goal is to use ultrasonic signals in indoor environments.

  • How is this uniform absorption implemented in the simulation environment?

  • How difficult would it be to implement frequency-specific absorption, as used in the MATLAB Roomsim environment? (See the hedged sketch after this list.)
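For later readers: recent versions of pyroomacoustics added multi-band wall materials through pra.Material. A hedged sketch, with illustrative (not measured) octave-band coefficients:

import pyroomacoustics as pra

# absorption specified per octave band; the values below are made up
material = pra.Material(
    energy_absorption={
        "coeffs": [0.10, 0.15, 0.25, 0.40, 0.60, 0.70],
        "center_freqs": [125, 250, 500, 1000, 2000, 4000],
    }
)
room = pra.ShoeBox([5, 4, 3], fs=16000, materials=material, max_order=10)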

Separation of Multiple Sources

Hi,
first of all, I want to thank you for writing such a nice library. I hope I can contribute to pyroomacoustics in the coming months.

I started writing my master's thesis on multi-source localization. I wrote a script, derived from your example, to see how the different algorithms perform. The scenario is to separate two white noise sources with additive uncorrelated white noise (SNR = 0 dB). I was very excited to see how FRIDA performs because in your paper it separated best. However, in my simulation FRIDA had the highest error (see plot). Maybe I didn't configure the frequency bins correctly. Do you have any suggestions on how to improve the performance? Is the setup perhaps wrong?

[Figure: DOA results for the four algorithms]

Output:

SRP
Recovered azimuth: [ 72. 109.] degrees
MUSIC
Recovered azimuth: [ 70. 110.] degrees
FRIDA
Recovered azimuth: [ 35.62516827 90.06240039] degrees
TOPS
Recovered azimuth: [ 68. 112.] degrees

Here is the script:

example.py:

import numpy as np
import matplotlib.pyplot as plt
import pyroomacoustics as pra
from scipy.signal import fftconvolve

#######################
# algorithms parameters
SNR = 0.    # signal-to-noise ratio
c = 343.    # speed of sound
fs = 16000  # sampling frequency
nfft = 256  # FFT size
azimuth_list_degree = [70, 110]
# used frequency bins, dependent on the method
freq_bins = {'SRP': np.arange(30, 129),
             'MUSIC': np.arange(30, 129),
             'FRIDA': np.arange(1, 129),
             'TOPS': np.arange(1, 129)}

###########################################
# We use a UCA-C with radius 4.2 cm and 7 microphones.
# This setup is used by the Amazon Echo
R = pra.circular_2D_array([0, 0], 6, 0., 0.042)
R = np.append(R, [[0], [0]], axis=1)

######
# location of original source
num_src = len(azimuth_list_degree)
azimuth_list = np.array(azimuth_list_degree) / 180. * np.pi

x_length = (nfft // 2 + 1) * nfft
azimuth_length = azimuth_list.shape[0]
mic_signals_part = np.zeros([7, x_length, azimuth_length])
for i, azimuth in enumerate(azimuth_list):
    # propagation filter bank
    propagation_vector = -np.array([np.cos(azimuth), np.sin(azimuth)])
    delays = np.dot(R.T, propagation_vector) / c * fs  # in fractional samples
    filter_bank = pra.fractional_delay_filter_bank(delays)

    # we use a white noise signal for the source
    x = np.random.randn(x_length) / azimuth_length

    # convolve the source signal with the fractional delay filters
    # to get the microphone input signals
    mic_signals_part[:, :, i] = np.array([fftconvolve(x, filter, mode='same') for filter in filter_bank])

# Sum all signal parts to one signal
mic_signals = np.sum(mic_signals_part, axis=2)

# now add the microphone noise
for signal in mic_signals:
    signal += np.random.randn(*signal.shape) * 10**(- SNR / 20)

################################
# compute the STFT frames needed
X = np.array([
    pra.stft(signal, nfft, nfft // 2, transform=np.fft.rfft).T
    for signal in mic_signals])

##############################################
# now we can test all the algorithms available
algo_names = ['SRP', 'MUSIC', 'FRIDA', 'TOPS']
fig = plt.figure(1, figsize=(5*2, 4*2), dpi=90)
i_sps = 1
for i, algo_name in enumerate(algo_names):
    # construct the new DOA object
    # the max_four parameter is necessary for FRIDA only
    doa = pra.doa.algorithms[algo_name](R, fs, nfft, c=c, max_four=4, num_src=num_src)

    # this call here perform localization on the frames in X
    doa.locate_sources(X, freq_bins=freq_bins[algo_name])
    plt.figure(1)
    ax = fig.add_subplot(2, 2, i+1, projection='polar')
    doa.polar_plt_dirac(azimuth_ref=azimuth_list)
    plt.title(algo_name, y=1.08)

    # doa.azimuth_recon contains the reconstructed location of the source
    print(algo_name)
    print('  Recovered azimuth:', np.sort(doa.azimuth_recon) / np.pi * 180., 'degrees')
plt.subplots_adjust(top=0.9,
                    bottom=0.075,
                    left=0.05,
                    right=0.95,
                    hspace=0.4,
                    wspace=0.4)
plt.show()

in doa.py, the polar_plt_dirac() function has to be adjusted as follows to produce this subplot output:

        ....
        #fig = plt.figure(figsize=(5, 4), dpi=90)
        #ax = fig.add_subplot(111, projection='polar')
        ax = plt.gca()
        ....

Thank you for your help! :)

Adding datatype parameter to STFT class

When using the new STFT object contained in pyroomacoustics.stft, I ran into a precision issue that I could solve by manually casting the output data type of the analysis and synthesis methods (from complex64 to complex128).

Currently, pyroomacoustics.stft uses a 32-bit datatype (float32/complex64). I propose adding another input parameter to the constructor to specify the bit width (32 or 64 bits). The default value would be 32. This is the same idea as the bits input already used in the dft.py constructor.
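In the meantime, a minimal sketch of the manual cast described above, assuming the block-wise STFT class is importable as pra.transform.STFT and accepts signals whose length is a multiple of the hop size (adjust the import path to the installed version):

import numpy as np
import pyroomacoustics as pra

hop = 256
stft = pra.transform.STFT(512, hop=hop)
x = np.random.randn(4 * hop).astype(np.float32)  # length is a multiple of hop
X = stft.analysis(x).astype(np.complex128)       # manual cast from complex64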

When fixing my particular issue, I started implementing this feature in this branch:
https://github.com/jazcarretao/pyroomacoustics/tree/juan-dev-stft

The main changes in stft.py can be found in this commit:
jazcarretao@7ba2dd2

Thank you for your attention.

Singular matrix error

Hi, while running with a linear array, I often get a singular matrix error in almost all of the algorithms. What care should we take while recording?

Advice on measuring beamforming performance

I'm simulating different rooms and mic array shapes, and I want to generate a metric that shows the improvement or degradation of the beamformed signal. What would be the best way of achieving this?

Many thanks

Tao

Instantiating ShoeBox with two methods gives different results

Hi authors,

Given the parameters:

room_dim = [15, 14, 16]
absorption = 0.2
source_position = [2.0, 3.1, 2.0]
mic_position = [2.0, 1.5, 2.0]
fs = 16000
max_order = 15

It seems that the two instantiations behave differently:

Scenario A

source = SoundSource(position=source_position)
mics = MicrophoneArray(np.array([mic_position]).T, fs)
shoebox = ShoeBox(room_dim, absorption=absorption, fs=fs, max_order=max_order, sources=[source], mics=mics)

Scenario B

shoebox = ShoeBox(room_dim, absorption=absorption, fs=fs, max_order=max_order)
shoebox.add_source(source_position)
mics = MicrophoneArray(np.array([mic_position]).T, fs)
shoebox.add_microphone_array(mics)

Then run and plot the image source model as

shoebox.image_source_model()
shoebox.plot(img_order=2)

My experiments show that the instantiation using SoundSource as an argument fails, and only the add_source method works. Please fix the bug if one exists.

[next_gen_simulator] error with a multi-microphone array and this type of room

Dear developer,
I am running the following script:

import numpy as np
import pyroomacoustics as pra
import matplotlib.pyplot as plt

from stl import mesh
from mpl_toolkits import mplot3d

path_to_musis_stl_file = './data/raw/MUSIS_3D_no_mics_simple.stl'
haru_fs = 16e3
haru_bar = [-6.5, 8.5, 2+0.1]
src_pos  = [-6, 4, 3]

def pyroom_walls_from_stl(path_to_stl_file):
    # with numpy-stl
    the_mesh = mesh.Mesh.from_file(path_to_stl_file)
    ntriang, nvec, npts = the_mesh.vectors.shape
    size_reduc_factor = 500.  # to get a realistic room size (not 3km)

    material = pra.Material.make_freq_flat(0.2)

    # create one wall per triangle
    walls = []
    for w in range(ntriang):
            walls.append(
                pra.wall_factory(
                    the_mesh.vectors[w].T / size_reduc_factor,
                    material.absorption["coeffs"],
                    material.scattering["coeffs"],
                )
            )

    return walls

def get_haru_mics(baricenter):
    haru_bar = np.array(baricenter)
    haru = pra.circular_2D_array(
        center=haru_bar[:2], M=8, phi0=-45/180*np.pi, radius=0.095)[:, :-1]
    haru = np.concatenate((haru, np.ones([1, 7])*haru_bar[2]), axis=0)
    haru = haru[:, ::-1]
    return haru


## Load room into pyroomacoustics
walls = pyroom_walls_from_stl(path_to_musis_stl_file)
room = pra.Room(
    walls,
    fs=16000,
    max_order=3,
)

room.add_source(src_pos)
room.add_microphone_array(
    pra.MicrophoneArray(get_haru_mics(haru_bar), haru_fs)
)

# compute the rir
room.image_source_model()
# room.ray_tracing()
# room.compute_rir()
# room.plot_rir()

# show the room
room.plot(img_order=1)
plt.show()

to load the following model of a 3D non-convex room (there is no ceiling) and compute its RIR:
MUSIS_3D_no_mics_simple.stl.zip

With the following (cool) result of room.plot(img_order=1):

[Figure: 3D plot of the room with image sources]

However, I am encountering the following problems:

  1. room.compute_rir() returns an error about sound sources outside the room:
../pyroomacoustics/pyroomacoustics/room.py", line 1136, in compute_rir
    n_bins = np.nonzero(self.rt_histograms[m][s][0].sum(axis=0))[0][-1] + 1
AttributeError: 'Room' object has no attribute 'rt_histograms'

UPDATE: This error happens when multiple microphones are used.

  2. the image source with respect to the floor is inside the room (it seems that the image source model runs on the convex hull; thus, if there is a discontinuity, like the 'plinth' I am using, the floor is considered to be at the level of the plinth). EDIT: this is actually right, my mistake. It is the correct image source with respect to the plinth.

Room attributes

Room contains attributes that perhaps don't belong, e.g. normals and absorption (wall attributes), or fs, max_order, and sigma2_awgn (these are for processing, not inherent attributes?).

Future warning when fftSize and hopSize are defined in beamforming techniques

Hi all,

I was trying to increase fftSize in the Microphone.Beamformer object to compute far_field_weights and delay-and-sum weights. However, I got a warning related to the last line of Microphone.Beamformer.filters_from_weights():

/home/xxxxxxxxx/anaconda2/envs/xxxxxxxxx/lib/python3.5/site-packages/pyroomacoustics/beamforming.py:455: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions. To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
  self.filters[i] = np.real(np.linalg.lstsq(F, w)[0])

Thanks in advance,
XL

Trying to find the distance between the source and the microphones

Hi,

I am trying to find the distance between the center of my microphone array and the sound source in a 3-D environment. I have looked through the source code, and I see it can support this. However, I am unsure of two things. First, can the DOA methods (SRP, MUSIC, TOPS, FRIDA) themselves support source localization in 3-D? And second, regardless of whether they can, I am trying to use the algorithms to find the distance between the source and the center of my microphone array. Is this possible?

NOTE: I have put a safety check in the comments for comparison's sake. I would like to remove it once I figure out the problems listed above.

See my code below:

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import fftconvolve
import IPython
import pyroomacoustics as pra

#Build 3-D Representation of the room
room = pra.ShoeBox([13.75,8.66,9.5], fs=16000, absorption=0.35, max_order=17)

#Add Source
fs, signal = wavfile.read("/Users/akhilvasvani/Downloads/cmu_us_awb_arctic/wav/arctic_b0528.wav")

room.add_source([(13.75/2),(8.66/2),(9.5/2)], signal=signal)

#Add 12-microphone array
R = np.c_[       # [x, y, z]
    [2.85, 0, 0.49], # mic 4
    [4.82, 0, 0.49], # mic 8
    [5.80, 0, 0.49], # mic 12
    [2.85, 0, 2.65], # mic 3
    [4.82, 0, 2.65], # mic 7
    [5.80, 0, 2.65], # mic 11
    [2.85, 0, 4.81], # mic 2
    [4.82, 0, 4.81], # mic 6
    [5.80, 0, 4.81], # mic 10
    [2.85, 0, 6.97], # mic 1
    [4.82, 0, 6.97], # mic 5
    [5.80, 0, 6.97]  # mic 9
    ]

room.add_microphone_array(pra.MicrophoneArray(R, room.fs))

fig, ax = room.plot()
ax.set_xlim([0, 14])
ax.set_ylim([0, 9]);
ax.set_zlim([0, 10]); # for 3-d

room.simulate()

from pyroomacoustics.doa import great_circ_dist

#Safety-Check
center = [4.325, 0, 3.730]
source = [(13.75/2),(8.66/2), (9.5/2)] # I have specified the source location. 
temp = np.arctan([0,(source[2]-center[2])/(source[0]-center[0])]) #for some dumb reason, we need to have an array with[0, (desired value)]
radius = np.sqrt(((source[2]-center[2])**2)+((source[0]-center[0])**2)) #

azimuth = temp[1] # in radians
c = 343.    # speed of sound
fs = 16000  # sampling frequency
nfft = 256  # FFT size
freq_range = [2900, 3500]

X = np.array([pra.stft(signal, nfft, nfft // 2, transform=np.fft.rfft).T for signal in room.mic_array.signals])

# Now we can test all the algorithms available
algo_names = ['SRP', 'MUSIC', 'FRIDA', 'TOPS'] 
spatial_resp = dict()

for algo_name in algo_names:
    # Construct the new DOA object
    # the max_four parameter is necessary for FRIDA only
    doa = pra.doa.algorithms[algo_name](L=R, fs=fs, nfft=nfft, c=c, num_src=1, max_four=4)

    doa.sph_plot_diracs()
    plt.title(algo_name)

    # doa.azimuth_recon contains the reconstructed location of the source
    print(algo_name)
    print('  Recovered azimuth:', doa.azimuth_recon / np.pi * 180., 'degrees')
    print('  Error:', great_circ_dist(azimuth, doa.azimuth_recon) / np.pi * 180., 'degrees')

    plt.show()
    print((doa.azimuth_recon))

Amplitude normalization

After defining a room and feeding the input signal, we get the output via:

room.simulate(recompute_rir=True)
output = room.mic_array.signals

Do we need to normalize the output to get the proper amplitude, and how do we do that?
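A minimal sketch of peak normalization; this only rescales for saving or playback, it does not restore an absolute physical level:

import numpy as np

output = room.mic_array.signals
output = output / np.abs(output).max()  # peak-normalize to [-1, 1]

Note that mic_array.to_wav(..., norm=True) applies the same kind of normalization when writing to a file.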

Thanks

Error: Unknown symbol: \bm (at char 0), (line:1, col:1)

I get an error when I want to execute the polar_plt_dirac() of the pyroomacoustics.doa.doa module.

ValueError:
\bm{\varphi}
^
Unknown symbol: \bm (at char 0), (line:1, col:1)

The issue is in doa.py at line 486:
ax.set_xlabel(r'azimuth $\bm{\varphi}$', fontsize=11)

I changed it to
ax.set_xlabel(r'azimuth ${\varphi}$', fontsize=11)
and it works.

When I googled the error I found this Stack Overflow page.

They suggest loading a package into the TeX preamble. But they use the '\boldsymbol' command.

Add speed of sound as a parameter

It would be interesting to be able to set the speed of sound as a parameter of the room. This value seems to be hardcoded to 343 m/s, but it changes with the temperature in the room, and one may want to play with this parameter.
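For later readers: newer versions of pyroomacoustics can derive the speed of sound from the temperature and humidity passed to the room constructor. A hedged sketch (availability depends on the installed version):

import pyroomacoustics as pra

room = pra.ShoeBox([5, 4, 3], fs=16000, temperature=25.0, humidity=40.0)
print(room.c)  # speed of sound used by the simulation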

Thanks!

Zero-padding

Are you actually using zero-padding when computing the beamformer output? I saw that the defaults are set to 0; won't this lead to aliasing from the circular convolution that corresponds to the frequency-domain filter multiplication?

DOA using real data

Hello,

I am attempting to use the DOA algorithms on real-world data sets from a mic array, and I am curious what changes I would need to make, or what I need to keep in mind, when doing this.

How to set cardioid receivers with multiple directions?

I have a question about the microphone setup: multiple-direction cardioid receivers, for example, 5 directions (0, 45, 90, 135, 180 degrees) at the same position.

When I place an 8-mic circular array with center=[20,20,1], mic=8, align=0, radius=0.1, the positions are fine, but there is no cardioid microphone directivity.

Any tip on how to implement an option to adjust the amplitude for each configuration?

Question about the BSS eval in IVA script

Hi,
Thanks for this tool. Very impressive.
I saw that in "bss_iva.py" you use bss_eval_images to measure the performance of the separation in terms of SIR, SDR, etc.
However, it seems that you take into consideration only the first reference channel (the source after the RIR, as received at the first mic).
Could this degrade the comparison? After all, the estimated source is actually a convolutive combination of the signals from all mics, while the reference source comes from a single mic.

Thanks

Merge stft module with STFT class of realtime sub-package

Currently there are two concurrent implementations of STFT.

  1. in the stft.py module, for computing the STFT of a full signal at once
  2. in the realtime sub-package, as the STFT class, for block-wise processing only

Ideally, the realtime.STFT class should be the one that remains, and it should be able to process multiple frames passed at once in an efficient way.

Absorption of Sound in Air versus Humidity and Temperature

First of all, this is an awesome project, thank you for putting in the time and effort!

I was wondering how you deal with the absorption of sound. Do you consider this aspect at all or ignore it completely? If you do consider it, would it be possible to have parameters for humidity and temperature? This paper, for example, describes the various ways different frequencies are absorbed: http://gradbena.fizika.si/HarrisJASA66.pdf

Also, if I generate an impulse response (IR), it should contain all the necessary information about reverb and distance, correct? So convolving an arbitrary sound signal with the computed IR should give me a realistic approximation including not only the reverb effect but also the absorption effect.
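The convolution itself is straightforward with SciPy; a minimal sketch, assuming a room that is already set up and a dry source signal x:

from scipy.signal import fftconvolve

room.compute_rir()
h = room.rir[0][0]       # RIR from source 0 to microphone 0
wet = fftconvolve(x, h)  # reverberant signal at the microphone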

Again, thank you very much for this project,
Sylvus

Phase inversion upon sound reflection

Hi,
Does this implementation consider the phase inversion of sound at each reflection from the room boundaries? If not, would it be difficult to implement this?

Separate microphones and beamformers objects

Currently, beamformers are a subclass of the microphone array object. Ideally, beamformers should have a structure more similar to the DOA or adaptive filter objects. They could be a subclass of the STFT processing too.
It would be good if all the beamforming algorithms were also defined as subclasses of a parent abstract class. This way, new algorithms could be easily added.

readme document example code

I just wanted to mention that the pyroomacoustics example code in the readme document has a typo that is a bit confusing:

# Create a linear array beamformer with 4 microphones
# with angle 0 degrees and inter mic distance 10 cm
R = pra.linear_2D_array([2, 1.5], 4, 0, 0.04)  <--------- should the 0.04 be changed to 0.1 for 10 cm ?

Thanks

Dimensionality Error with Room Simulation

Following the example code for room simulation found in the docs (https://pyroomacoustics.readthedocs.io/en/pypi-release/pyroomacoustics.room.html), I am running into an error.

My code block:

room = pra.ShoeBox([9, 7.5, 3.5], fs=fs, absorption=0.35, max_order=17)

my_source = pra.SoundSource([2.5, 3.73, 1.76], signal=signal, delay=1.3)

# place the source in the room
room.add_source([2.5, 3.73, 1.76], signal)

R = np.c_[
    [6.3, 4.87, 1.2],  # mic 1
    [6.3, 4.93, 1.2],  # mic 2
    ]

# the fs of the microphones is the same as the room
mic_array = pra.MicrophoneArray(R, room.fs)

# finally place the array in the room
room.add_microphone_array(mic_array)

room.compute_rir()

room.simulate()

Note that fs and signal were already loaded. I am getting an error from the FFT convolution:

ValueError                                Traceback (most recent call last)
<ipython-input-5-e77621760d1a> in <module>()
----> 1 room.simulate()

~/anaconda3/lib/python3.6/site-packages/pyroomacoustics/room.py in simulate(self, snr, reference_mic, callback_mix, callback_mix_kwargs, return_premix, recompute_rir)
   1078                 d = int(np.floor(self.sources[s].delay * self.fs))
   1079                 h = self.rir[m][s]
-> 1080                 premix_signals[s,m,d:d + len(sig) + len(h) - 1] += fftconvolve(h, sig)
   1081 
   1082         if callback_mix is not None:

~/anaconda3/lib/python3.6/site-packages/scipy/signal/signaltools.py in fftconvolve(in1, in2, mode, axes)
    372         return in1 * in2
    373     elif in1.ndim != in2.ndim:
--> 374         raise ValueError("in1 and in2 should have the same dimensionality")
    375     elif in1.size == 0 or in2.size == 0:  # empty arrays
    376         return array([])

ValueError: in1 and in2 should have the same dimensionality

Cannot plot in 3-d

Hi,

I have tried plotting the microphones and source in 3-D, but every time I do that I get the error "IndexError: tuple index out of range". I looked at the room.py source file and I know that this capability exists, but maybe I am doing something wrong? Any help would be greatly appreciated.

Here's my code:

#Build 3-D Representation of the room
corners = np.array([[0,0],[15,0], [15,10], [0,10]]).T # This is our block
room = pra.Room.from_corners(corners)
room.extrude(10.) # For 3-d

#Add Source
fs, signal = wavfile.read("/Downloads/cmu_us_awb_arctic/wav/arctic_b0528.wav")
my_source = pra.SoundSource([1, 1, 0], signal=signal)
room.add_source(my_source)

#room = pra.Room.from_corners(corners, fs=fs)
#room.add_source([1.,1.,1.], signal=signal)

#Add 12-microphone array
R = np.c_[           # [x, y, z]
    [2.85, 0.49, 0], # mic 1
    [4.82, 0.49, 0], # mic 2
    [5.80, 0.49, 0], # mic 3
    [2.85, 2.65, 0], # mic 4
    [4.82, 2.65, 0], # mic 5
    [5.80, 2.65, 0], # mic 6
]

room.add_microphone_array(pra.MicrophoneArray(R, room.fs))

fig, ax = room.plot()
ax.set_xlim([-1, 15])
ax.set_ylim([-1, 10]);
ax.set_zlim([0, 15]); # for 3-d

High-resolution sampling

I was reading the paper https://arxiv.org/pdf/1712.03439.pdf, which mentions that they sample the IR at 1024 kHz, then resample to 128 kHz with low-pass filtering at 80 kHz, then resample to the audio rate of 16 kHz.

I have a few questions:

  • from their paper, it seems they use 1024 kHz sampling because of the distance between the microphones; would we still expect better results in the single-microphone case if we do a higher-resolution computation followed by resampling?

  • when I do the computation at 1024 kHz and resample to 16 kHz, the signal is not scaled in the same way; it seems that if I apply a scaling of 1024/16 I get comparable values for the max, although the resulting IRs do not look the same (they seem slightly "shifted" in time). Is there a better way to do the resampling?

  • their paper shows how they do efficient OLA computations; is this what is used in pyroomacoustics, and would it require a lot of work to add if needed?

Moving Sound Sources

Is it possible to use this package to simulate a moving sound source/moving microphone arrays?

Perhaps by setting a dense linear array of microphones and combining their shifted simulated recordings to simulate a change of the distance/angle between a single microphone and an audio source.
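A hedged sketch of that block-wise idea: freeze the source at one position per block, simulate each block with the appropriate delay, and sum the outputs. This ignores motion within a block, so there is no Doppler effect:

import numpy as np
import pyroomacoustics as pra

fs = 16000
block = 2048
x = np.random.randn(10 * block)                        # dry source signal
path = [[1.0 + 0.1 * k, 2.0, 1.5] for k in range(10)]  # source trajectory

out = None
for k, pos in enumerate(path):
    room = pra.ShoeBox([6, 5, 3], fs=fs, max_order=10)
    # each block of the signal is emitted from one point of the trajectory
    room.add_source(pos, signal=x[k * block:(k + 1) * block], delay=k * block / fs)
    room.add_microphone_array(pra.MicrophoneArray(np.array([[3.0], [2.5], [1.5]]), fs))
    room.simulate()
    sig = room.mic_array.signals
    if out is None:
        out = sig
    else:
        # pad the shorter signal and sum (overlap-add across blocks)
        n = max(out.shape[1], sig.shape[1])
        out = np.pad(out, ((0, 0), (0, n - out.shape[1])))
        sig = np.pad(sig, ((0, 0), (0, n - sig.shape[1])))
        out = out + sig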

max_order changes output shape of RIR generator

When I change the maximum reflection order, the output shape changes in a strange way.

import numpy as np
import pyroomacoustics as pra

room_dimensions = np.asarray([4, 6, 3])
source_positions = np.asarray([[2.5, 4.5, 1.75], [1.4, 1.3, 1.8]])
sensor_positions = np.asarray([[2.1, 3.1, 1.1]])

for order in range(1, 20):
    room = pra.ShoeBox(room_dimensions, max_order=order)

    for source_position in source_positions:
        room.add_source(source_position)

    microphone_array = pra.MicrophoneArray(sensor_positions.T, 8000)
    room.add_microphone_array(microphone_array)
    room.compute_rir()
    print(f'order={order}: shape={np.asarray(room.rir).shape}')

Yields:

order=1: shape=(1, 2, 271)
order=2: shape=(1, 2)
order=3: shape=(1, 2, 564)
order=4: shape=(1, 2)
order=5: shape=(1, 2, 858)
order=6: shape=(1, 2)
order=7: shape=(1, 2)
order=8: shape=(1, 2)
order=9: shape=(1, 2, 1445)
order=10: shape=(1, 2)
order=11: shape=(1, 2, 1739)
order=12: shape=(1, 2)
order=13: shape=(1, 2, 2033)
order=14: shape=(1, 2)
order=15: shape=(1, 2, 2327)
order=16: shape=(1, 2)
order=17: shape=(1, 2, 2621)
order=18: shape=(1, 2)
order=19: shape=(1, 2, 2915)
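What happens here: the RIRs of the two sources have different lengths, and np.asarray only builds a dense (1, 2, N) array when the lengths happen to coincide; otherwise it silently falls back to an object array of shape (1, 2). A minimal sketch of a workaround that pads every RIR to the longest one before stacking:

import numpy as np

max_len = max(len(h) for mic in room.rir for h in mic)
rir = np.array([
    [np.pad(h, (0, max_len - len(h))) for h in mic]
    for mic in room.rir
])
print(rir.shape)  # (n_mics, n_srcs, max_len)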

trinicon test

Hi,
Have you ever tested trinicon.py? I tried to run Trinicon on my own data, but I got very strange output.

Thanks

Python error: deallocating None

Hi,

First, let me thank you for sharing this great tool with the community. I'm using the room impulse response tool to generate a large number of RIRs to perform data augmentation. Each time I run my script, I get this error at the 19208th call of the function compute_rir():

Current thread 0x00007f0e73bf3700 (most recent call first):
  File "[...]/.local/lib/python3.5/site-packages/numpy/core/_internal.py", line 477 in Stream
  File "[...]/.local/lib/python3.5/site-packages/numpy/core/_internal.py", line 449 in _dtype_from_pep3118
  File "[...]/.local/lib/python3.5/site-packages/numpy/ctypeslib.py", line 436 in as_array
  File "/usr/local/lib/python3.5/dist-packages/pyroomacoustics/room.py", line 652 in image_source_model
  File "/usr/local/lib/python3.5/dist-packages/pyroomacoustics/room.py", line 683 in compute_rir
  File "/usr/local/lib/python3.5/dist-packages/pydsp/simulator.py", line 33 in rir
  File "../batches/rooms_simultator.py", line 67 in <module>
  Fatal Python error: deallocating None

From my understanding, this error might be due to bad memory deallocation in the C/C++ routine (I know pyroomacoustics uses C-optimized code to generate RIRs faster). Have you ever encountered this issue before?

Thank you,

Generating wav

Hi, I'm interested in creating a geometry and generating wavs at the microphone location with different sources. As I'm new to room acoustics, I'd like to know if you could give me a hint:

  • is there an example where I can create a room with a microphone and multiple sources, "play" wavs through the sources, and record at the microphone, or do we have to compute an RIR for each source and create the resulting .wav ourselves by convolving? (See the sketch after this list.)
  • should we use 2D or 3D rooms when we want to do simple room acoustics and audio creation to simulate real room conditions?
  • how do the methods in this module compare to this kind of code: http://www.eric-lehmann.com/ism_bg.html
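On the first bullet, the simulator does the convolution and mixing for you; a minimal sketch, assuming hypothetical signals s1 and s2 already loaded at the room sampling rate:

import numpy as np
import pyroomacoustics as pra

room = pra.ShoeBox([6, 5, 3], fs=16000, max_order=10)
room.add_source([2.0, 3.0, 1.5], signal=s1)
room.add_source([4.0, 1.5, 1.5], signal=s2)
room.add_microphone_array(pra.MicrophoneArray(np.array([[3.0], [2.5], [1.2]]), room.fs))

room.simulate()  # computes all RIRs, convolves, and mixes internally
room.mic_array.to_wav("mix.wav", norm=True, bitdepth=np.int16)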

thank you

Open room not supported

The simulator requires the room to be completely closed. This is due to a visibility check for the image sources: microphones outside the room are not visible, and the inside/outside test implicitly requires the room to be completely closed to make sense.

I think this could potentially be fixed by a better visibility check for the image sources.

[next_gen_simulator, SOLVED] Error in setup

Dear developer,
I am trying to install pyroomacoustics/next_gen_sim.
Unfortunately I am encountering the following problem:

 ERROR: Complete output from command /home/ddicarlo/Documents/Code/InProgress/2019@dataset_aegean/venv/bin/python3 -c 'import setuptools, tokenize;__file__='"'"'/home/ddicarlo/Documents/Code/InProgress/2019@dataset_aegean/pyroomacoustics/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps:
    ERROR: running develop
    running egg_info
    writing pyroomacoustics.egg-info/PKG-INFO
    writing dependency_links to pyroomacoustics.egg-info/dependency_links.txt
    writing requirements to pyroomacoustics.egg-info/requires.txt
    writing top-level names to pyroomacoustics.egg-info/top_level.txt
    reading manifest file 'pyroomacoustics.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    warning: no files found matching 'pyroomacoustics/libroom_src/ext/eigen/COPYING.*'
    warning: no directories found matching 'pyroomacoustics/libroom_src/ext/eigen/Eigen'
    warning: no previously-included files matching '*.py[co]' found anywhere in distribution
    warning: no previously-included files matching '__pycache__' found anywhere in distribution
    writing manifest file 'pyroomacoustics.egg-info/SOURCES.txt'
    running build_ext
    gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.6m -c /tmp/tmp8atwvmaz.cpp -o tmp/tmp8atwvmaz.o -std=c++14
    gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.6m -c /tmp/tmpcxwjafcr.cpp -o tmp/tmpcxwjafcr.o -fvisibility=hidden
    building 'pyroomacoustics.libroom' extension
    gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I. -Ipyroomacoustics/libroom_src -I/home/ddicarlo/Documents/Code/InProgress/2019@dataset_aegean/venv/include/site/python3.6 -I/home/ddicarlo/Documents/Code/InProgress/2019@dataset_aegean/venv/include/site/python3.6 -Ipyroomacoustics/libroom_src/ext/eigen -I/usr/include/python3.6m -c pyroomacoustics/libroom_src/libroom.cpp -o build/temp.linux-x86_64-3.6/pyroomacoustics/libroom_src/libroom.o -DEIGEN_MPL2_ONLY -Wall -O3 -DVERSION_INFO="0.0.0" -std=c++14 -fvisibility=hidden
    In file included from pyroomacoustics/libroom_src/libroom.cpp:30:
    /home/ddicarlo/Documents/Code/InProgress/2019@dataset_aegean/venv/include/site/python3.6/pybind11/eigen.h:31:10: fatal error: Eigen/Core: No such file or directory
     #include <Eigen/Core>
              ^~~~~~~~~~~~
    compilation terminated.
    error: command 'gcc' failed with exit status 1

For reproducibility:

$ git clone https://github.com/LCAV/pyroomacoustics
$ cd pyroomacoustics/
$ git checkout next_gen_simulator
# in the virtualenv
(venv) $ pip install -e .  

EDIT
solved with the following

$ git clone https://github.com/LCAV/pyroomacoustics
$ cd pyroomacoustics/
$ git checkout next_gen_simulator
$ git submodule update --init --recursive
# in the virtualenv
(venv) $ pip install -e .  

Error in running test_doa.py

Hello, I hope you can assist me.

I am running test_doa.py and noticed that the number of points M is set to 12 rather than 6 (as stated in the comment line). When I replace the 12 with 6, an error is raised from utilities.py at the line:
bank_flat[indices.ravel()] = windows * np.sinc(sinc_times.ravel())

circular microphone array, 6 mics, radius 15 cm

R = pra.circular_2D_array([0, 0], 12, 0., 0.15)

Thank you!

Regarding source separation

Hi, I am a newbie in the field of mic arrays and beamforming. What I am looking for: I have a circular array of 8 mics and I know their positions. I recorded a sample and got DOA results with several algorithms. Now that we know the DOAs of the sources, can we separate those sources from the recorded wav file?
And what exactly does the beamforming code do?

How to use plot_beamformer_from_point

Hi all,

I was checking the Beamformer class; can you explain how the plot_beamformer_from_point function works, or provide a simple example? Thanks in advance!

[next_gen_simulator] std::exception when creating room with non convex walls

Dear all,
I am trying to simulate the RIR of a 3D non-convex room.
I was able to create the room and simulate the RIR from a 3D mesh (.stl format)
(refer to this issue).
However, that mesh consists of a collection of triangles.
For future use, I would prefer to merge all the triangles into their corresponding planes; the result of this computation can be found in this shared folder.

Unfortunately, when a wall is a non-convex polygon, pra.wall_factory() raises the following error:

Traceback (most recent call last):
  File "src/small_example_for_musis_with_non_cvx_wall.py", line 60, in <module>
    material.scattering["coeffs"],
  File "./pyroomacoustics/pyroomacoustics/room.py", line 315, in wall_factory
    name,
RuntimeError: std::exception

With the following code you can reproduce the issues:

import glob
import numpy as np
import pyroomacoustics as pra
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
import matplotlib.colors as colors


def surface_normal_newell(poly):
    n = np.array([0.0, 0.0, 0.0])
    for i, v_curr in enumerate(poly):
        v_next = poly[(i+1) % len(poly), :]
        n[0] += (v_curr[1] - v_next[1]) * (v_curr[2] + v_next[2])
        n[1] += (v_curr[2] - v_next[2]) * (v_curr[0] + v_next[0])
        n[2] += (v_curr[0] - v_next[0]) * (v_curr[1] + v_next[1])
    norm = np.linalg.norm(n)
    if norm == 0:
        raise ValueError('zero norm')
    else:
        normalised = n/norm
    return normalised

def plot_poligons(plane_list):
    figure = plt.figure()
    axes = mplot3d.Axes3D(figure)
    for plane_coords in plane_list:
        b = np.mean(plane_coords, 1)
        n = surface_normal_newell(plane_coords.T)*1000
        poly = mplot3d.art3d.Poly3DCollection(
            [list(zip(plane_coords[0, :], plane_coords[1, :], plane_coords[2, :]))],
            alpha=0.2)
        poly.set_color(colors.rgb2hex(np.random.random(3)))
        axes.quiver(b[0], b[1], b[2], n[0], n[1], n[2], color='red')
        axes.add_collection3d(poly, zs='z')
    return axes

# list the wall files
plane_files = glob.glob('./path/to/wall*.csv')
planes = []
for pfile in plane_files:
    plane_coords = np.loadtxt(pfile, delimiter=',')
    planes.append(plane_coords)

# plot planes
plot_poligons(planes)
plt.show()

# generate corresponding pra's walls
material = pra.Material.make_freq_flat(0.2)
walls = []
for p, plane in enumerate(planes):
    walls.append(
        pra.wall_factory(
            plane,
            material.absorption["coeffs"],
            material.scattering["coeffs"],
        )
    )

What am I doing wrong? Is it because of the non-convexity of the wall polygon?
Thank you very much

Create Anechoic / Free field environment

This would be useful to create anechoic simulations with properly simulated delays between microphones.
There isn't much work required to get there. It is already possible to achieve this by creating a room (a shoebox, preferably) with max_order=0. The only constraint then is that the sources must be placed inside the room, which is not really required if we think of this in terms of a free-field environment.
Maybe instead it would be possible to create a room without any walls.
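A minimal sketch of the max_order=0 workaround described above; only the direct path is kept, which yields free-field propagation delays and 1/r attenuation:

import numpy as np
import pyroomacoustics as pra

room = pra.ShoeBox([10, 10, 4], fs=16000, max_order=0)
room.add_source([2.0, 3.0, 1.5], signal=np.random.randn(16000))
room.add_microphone_array(pra.MicrophoneArray(np.array([[5.0], [5.0], [1.5]]), room.fs))
room.simulate()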
