vipchengrui / masg

microphone array speech generator (MASG) in room acoustic

License: MIT License

MASG

MASG is a microphone array speech generator for room acoustic environments.

Abstract

MASG simulates the speech data received by microphone arrays of various shapes in a room acoustic environment. It generates clean speech (clean), reverberant speech (clean rever), noisy speech (clean noise), noisy reverberant speech (clean rever noise), and the corresponding noise signal (noise).

Method

MASG is built on two tools: Pyroomacoustics [1] and add_noise_for_multichannel, an improved multichannel version of add noise [2]. The schematic diagram of MASG is shown in Fig. 1.

Fig. 1 The schematic diagram of MASG.

Using Pyroomacoustics, the clean microphone array speech is obtained by setting the absorption to 1.0, and the reverberant microphone array speech is obtained by setting the absorption to less than 1.0. From the clean microphone array speech, the noise signal, and the expected signal-to-noise ratio (SNR), we obtain the corresponding microphone array noise signal. Adding this noise signal to the clean and the reverberant microphone array speech gives the noisy and the noisy reverberant microphone array speech, respectively.

In this way, we obtain simulated data for any microphone array used in an indoor acoustic environment.
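The noise-mixing step can be sketched with NumPy as follows. This is a minimal illustration, not the code in add_noise_for_multichannel.py: the function name `mix_at_snr` is hypothetical, the level is measured as the plain mean square of channel 0 rather than the P.56 active speech level [2], and one gain is applied to all channels so that level differences between microphones are preserved. The noise is assumed to be at least as long as the speech.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db` (in dB).

    speech, noise: arrays of shape (n_mics, n_samples).
    Returns (noisy_speech, scaled_noise), both shaped like `speech`.
    """
    speech = np.asarray(speech, dtype=float)
    # Trim the noise to the speech length (assumes the noise is long enough).
    noise = np.asarray(noise, dtype=float)[:, : speech.shape[1]]
    # Simplification: plain mean-square power on the reference channel (channel 0).
    p_speech = np.mean(speech[0] ** 2)
    p_noise = np.mean(noise[0] ** 2)
    # One common gain for every channel.
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise


# Synthetic 4-channel signals stand in for real recordings.
rng = np.random.default_rng(0)
speech = rng.standard_normal((4, 16000))
noise = 0.1 * rng.standard_normal((4, 20000))
noisy, scaled_noise = mix_at_snr(speech, noise, snr_db=5.0)
```

The returned `scaled_noise` plays the role of the microphone array noise signal; under the same assumptions it could be added to the reverberant speech as well, after matching lengths.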

Simulation Environment

To verify the effect of MASG, we set up a room acoustic environment that is common in everyday life: a meeting room. This scenario is shown in Fig. 2.

Fig. 2 Meeting room acoustic environment.

The scene simulates a meeting room that is 4 m long, 3 m wide, and 3 m high. The room contains a 2.2 m × 1.1 m × 0.75 m conference table, 19 chairs that mark possible target sound source positions, and an audible screen. Their coordinates and details are shown in Fig. 2.

From this meeting room, we abstract the room, the microphone array, the target source, and the other information needed to build the dataset, which gives the simulation environment shown in Fig. 3.

Fig. 3 The simulation environment.

Program List

MASG is implemented in Python. The packages it depends on and the scripts it provides are as follows.

Packages

[numpy] https://numpy.org/ https://pypi.org/project/numpy/

[matplotlib] https://matplotlib.org/ https://pypi.org/project/matplotlib/

[scipy] https://www.scipy.org/ https://pypi.org/project/scipy/

[pyroomacoustics] https://github.com/LCAV/pyroomacoustics https://pypi.org/project/pyroomacoustics/

Functions

[add_noise_for_multichannel.py] Adds noise to the clean and the reverberant microphone array speech at the expected SNR.

[microphone_array_speech_generator_for_test_dataset.py] Generates a microphone array speech test dataset for a room acoustic environment.

[microphone_array_speech_generator_for_train_dataset.py] Generates a microphone array speech training dataset for a room acoustic environment.

[speech_connection.py] Implements speech connection.

References

[1] R. Scheibler, E. Bezzam and I. Dokmanić, "Pyroomacoustics: A Python Package for Audio Room Simulation and Array Processing Algorithms," 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, 2018, pp. 351-355.

[2] ITU-T, "Objective measurement of active speech level," ITU-T Recommendation P.56, 1993.

masg's Issues

zeros in ASL_P56

Hello! Your implementation of def asl_P56(x, fs, nbits) in MASG/microphone_array_speech_generator/add_noise_for_multichannel.py always returns (0,0,0) for any 16-bit, 44100 Hz wav file. Do you have the same problem? Am I doing something wrong?

Best regards!
