
Combine sound source separation with SRP-PHAT to achieve multi-source localization.

License: MIT License



3D Multiple Sound Sources Localization (SSL)

The Steered Response Power Phase Transform (SRP-PHAT) is an important and robust algorithm for localizing acoustic sound sources. However, the algorithm yields only a single location estimate. To extend it to multiple sources, we propose using the Degenerate Unmixing Estimation Technique (DUET) to separate each source and then passing each separated signal to SRP-PHAT, which enables multi-source tracking.
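As a rough illustration of the PHAT weighting that SRP-PHAT builds on, the sketch below estimates the time delay between two microphone channels with GCC-PHAT using NumPy. It is not the code used in this repository; the function name `gcc_phat` and its parameters are hypothetical, and a full SRP-PHAT would additionally sum such correlations over all microphone pairs and candidate directions.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Hypothetical helper for illustration only, not part of this repo.
    A positive result means `sig` arrives later than `ref`.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = sig.shape[0] + ref.shape[0]
    spec_sig = np.fft.rfft(sig, n=n)
    spec_ref = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting: keep phase, drop magnitude.
    cross = spec_sig * np.conj(spec_ref)
    cross /= np.abs(cross) + 1e-12

    # Back to the lag domain.
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if requested.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[: max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / float(fs)


if __name__ == "__main__":
    fs = 16000
    rng = np.random.default_rng(0)
    ref = rng.standard_normal(1000)
    # Delay the reference by 5 samples to simulate a second microphone.
    sig = np.concatenate((np.zeros(5), ref[:-5]))
    print(gcc_phat(sig, ref, fs) * fs)  # estimated delay in samples
```

In the multi-source setting described above, each signal separated by DUET would be localized independently, so a routine like this would run once per separated source rather than once on the mixture.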

Prepare an Environment

git clone https://github.com/BrownsugarZeer/Multi_SSL.git
cd Multi_SSL
python -m venv venv
venv\Scripts\activate.bat
pip install -r requirements.txt

PyAudio can be tricky to install on Windows. If the installation fails, installing an unofficial prebuilt wheel may be a workable solution.

Hardware

The board is a far-field microphone array device capable of detecting voices up to 5m away, even in the presence of background noise.

Running an Experiment

  1. Using a microphone stream (online)
(venv) > python srp_phat_online.py  -s=1
Find 1 available sources.
azi:  184.4, ele:   46.4
===================================================
Find 1 available sources.
azi:  184.4, ele:   46.4
===================================================
Find 1 available sources.
azi:  276.1, ele:   39.2
===================================================
...
  2. Using an audio file (offline)
# Automatically determine the number of sources
(venv) > python srp_phat_offline.py -s=1 -c=4 -i=None --wave=data/a0e20/50cm/a0e19_3_1b6ede00.wav
Find 1 available sources.
azi:    0.3, ele:   22.7

(venv) > python srp_phat_offline.py -s=2 -c=4 -i=None --wave=data/a0e20_a45e35/150cm/a0e19_a44e34_3_1c91d780.wav
Find 2 available sources.
azi:   50.8, ele:   43.2
azi:    2.7, ele:   26.2

Visualization

To show what is going on, we use plotly to plot the DOA on a sphere whose diameter is 1 meter. The center of the sphere is the microphone array, placed at p(x=0, y=0, z=0). The dark blue dots are the Directions of Arrival (DOA), and the lighter dots are their projections onto each plane.
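The mapping from an (azimuth, elevation) pair to a point on that sphere can be sketched as follows. This is a minimal example under assumed conventions, not the repository's plotting code: azimuth is taken to be measured in the x-y plane from the +x axis, elevation upward from the x-y plane, and the projections are taken to be orthogonal projections onto the three coordinate planes. The function names are hypothetical.

```python
import math


def doa_to_xyz(azi_deg, ele_deg, radius=0.5):
    """Convert azimuth/elevation in degrees to Cartesian coordinates.

    Assumed convention (not confirmed by the source): azimuth is measured
    in the x-y plane from +x, elevation is measured up from the x-y plane.
    The default radius of 0.5 m matches a sphere with a 1 m diameter.
    """
    azi = math.radians(azi_deg)
    ele = math.radians(ele_deg)
    x = radius * math.cos(ele) * math.cos(azi)
    y = radius * math.cos(ele) * math.sin(azi)
    z = radius * math.sin(ele)
    return x, y, z


def xyz_to_doa(x, y, z):
    """Inverse mapping: Cartesian point to (azimuth, elevation) in degrees."""
    azi = math.degrees(math.atan2(y, x)) % 360.0
    ele = math.degrees(math.atan2(z, math.hypot(x, y)))
    return azi, ele


def plane_projections(x, y, z):
    """Orthogonal projections of a point onto the xy, xz, and yz planes."""
    return {"xy": (x, y, 0.0), "xz": (x, 0.0, z), "yz": (0.0, y, z)}


if __name__ == "__main__":
    point = doa_to_xyz(184.4, 46.4)
    print(point)
    print(plane_projections(*point))
    print(xyz_to_doa(*point))
```

If the array uses a different angle convention (for example, azimuth measured clockwise, or elevation measured from the z axis), only the trigonometric expressions in `doa_to_xyz` and `xyz_to_doa` would need to change.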

(venv) > python srp_visualizer.py -s=1 --wav=data/a0e20/50cm.csv

[Figures: DOA plots for the 50cm, 150cm, and 250cm data sets]

Issue

  1. The algorithm has high computational complexity, which makes it unsuitable for real-time applications. Estimating one source takes at least 0.3 seconds, so estimating N sources takes at least (0.3*N) seconds.
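The linear cost above can be turned into a quick feasibility check. The sketch below only restates the 0.3 seconds-per-source lower bound from this section; the function names and the `frame_s` parameter (the duration of audio processed per update) are hypothetical and not taken from the repository.

```python
def min_processing_time_s(n_sources, per_source_s=0.3):
    """Lower bound on processing time, assuming cost grows linearly
    with the number of sources (0.3 s per source, as stated above)."""
    return per_source_s * n_sources


def can_keep_up(n_sources, frame_s, per_source_s=0.3):
    """True if the lower bound fits within one frame of `frame_s` seconds.

    `frame_s` is a hypothetical parameter: the length of audio consumed
    per localization update.
    """
    return min_processing_time_s(n_sources, per_source_s) <= frame_s


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, min_processing_time_s(n), can_keep_up(n, frame_s=0.5))
```

Because the 0.3 second figure is a lower bound, passing this check is necessary but not sufficient for real-time operation.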

References

  1. S. Rickard, "The DUET blind source separation algorithm." Blind Speech Separation, pp. 217-241, 2007.

  2. Dey, Ajoy Kumar, and Susmita Saha. "Acoustic Beamforming: Design and Development of Steered Response Power With Phase Transformation (SRP-PHAT)." (2011).

  3. Ravanelli, Mirco, et al. "SpeechBrain: A General-Purpose Speech Toolkit." arXiv preprint arXiv:2106.04624 (2021).


