

WHISPER_SET_1

This repo contains metadata and scripts for using WHISPER SET #1, a dataset for speech enhancement and source separation recorded with a Wireless Acoustic Sensor Network (WASN) called WHISPER (Kiselev2018). The dataset contains samples with up to 4 concurrent speakers, as well as speech in noise. It was recorded with 16 microphones in a room with low reverberation (T_60 = 0.2 s). In general, each track starts with a calibration phase in which each speaker is active alone for 15 seconds, one after the other, followed by 15 seconds of all the speakers together (plus noise in some cases).

===

The raw data can be found here. Each sample is a 16-channel wav file whose channels are ordered as follows:

0 - module 5 mic 1
1 - module 5 mic 2
2 - module 5 mic 3
3 - module 5 mic 4
4 - module 6 mic 1
5 - module 6 mic 2
6 - module 6 mic 3
7 - module 6 mic 4
8 - module 7 mic 1
9 - module 7 mic 2
10 - module 7 mic 3
11 - module 7 mic 4
12 - module 8 mic 1
13 - module 8 mic 2
14 - module 8 mic 3
15 - module 8 mic 4
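The channel order above is regular (four consecutive channels per module, modules 5 to 8), so it can be computed rather than looked up. The following is a minimal sketch, not part of the repo's scripts; the function names are hypothetical, and the two-digit label assumes the floor plan numbering where the first digit is the module and the second is the mic index.

```python
def channel_to_mic(channel: int) -> tuple[int, int, int]:
    """Map a 0-based wav channel index to (module, mic, label)."""
    if not 0 <= channel < 16:
        raise ValueError("channel must be in 0..15")
    module = 5 + channel // 4  # modules 5..8, four channels each
    mic = 1 + channel % 4      # mics 1..4 within a module
    label = module * 10 + mic  # assumed floor plan label, e.g. 54
    return module, mic, label


def mic_to_channel(label: int) -> int:
    """Inverse mapping: two-digit mic label (e.g. 54) to channel index."""
    module, mic = divmod(label, 10)
    if module not in range(5, 9) or mic not in range(1, 5):
        raise ValueError(f"unknown microphone label: {label}")
    return (module - 5) * 4 + (mic - 1)
```

For example, `channel_to_mic(3)` gives module 5, mic 4, which matches the line `3 - module 5 mic 4` in the list.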

Refer to the floor plan for a visual illustration of the microphone arrangement.

Samples are organized in folders. Each folder contains a 15-second, 16-channel mixture.wav file, plus 10-second, 16-channel calibX.wav files, one for each speaker or noise source in the mixture. That is, a mixture with 4 speakers has 4 calibration files, and a mixture of one speaker plus noise has 2 calibration files: one for speech (1) and one for noise (2).
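A quick sanity check of a sample folder could look like the sketch below. It is only an illustration under stated assumptions: that X in calibX.wav runs from 1 to the number of sources, and that the files are PCM-encoded so Python's standard `wave` module can read them. The function names are hypothetical.

```python
import wave
from pathlib import Path

EXPECTED_CHANNELS = 16


def expected_files(n_sources: int) -> list[str]:
    """File names expected in a folder with n_sources speakers/noise sources."""
    # Assumption: calibration files are numbered calib1.wav, calib2.wav, ...
    return ["mixture.wav"] + [f"calib{i}.wav" for i in range(1, n_sources + 1)]


def check_sample_folder(folder, n_sources: int) -> dict[str, float]:
    """Return {file name: duration in seconds}; raise if a file is off-spec."""
    durations = {}
    for name in expected_files(n_sources):
        # wave.open raises FileNotFoundError if the file is missing
        with wave.open(str(Path(folder) / name), "rb") as wav:
            if wav.getnchannels() != EXPECTED_CHANNELS:
                raise ValueError(f"{name}: expected {EXPECTED_CHANNELS} channels")
            durations[name] = wav.getnframes() / wav.getframerate()
    return durations
```

The returned durations can then be compared against the nominal lengths given above for the mixture and calibration files.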

===

Info

Annotated floor plan

Floor plan annotated

The microphones are numbered so that the first digit represents the module and the second digit the microphone index within that module. For example, 54 is microphone 4 of module 5.

Recordings text

Recordings output

The recordings csv contains the following columns:

  • timestamp: refers to the file name track_.wav
  • spk1: first speaker position (order in which it appears during calibration time)
  • spk2: second speaker position (order in which it appears during calibration time)
  • spk3: third speaker position (order in which it appears during calibration time)
  • spk4: fourth speaker position (order in which it appears during calibration time)
  • nspk: number of speakers active
  • snr: signal-to-noise ratio
  • noise: whether babble noise was present
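The columns above can be read with Python's standard `csv` module. This is a hedged sketch: it assumes the file has a header row with exactly these column names, and it only converts `nspk` to an integer because the encoding of the other fields (for instance how `noise` is written) is not documented here. The function names are hypothetical.

```python
import csv

COLUMNS = ["timestamp", "spk1", "spk2", "spk3", "spk4", "nspk", "snr", "noise"]


def read_recordings(handle) -> list[dict]:
    """Parse the recordings csv from an open text file handle."""
    reader = csv.DictReader(handle)
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        row["nspk"] = int(row["nspk"])  # number of active speakers
        rows.append(row)                # other fields are kept as raw strings
    return rows


def with_n_speakers(rows: list[dict], n: int) -> list[dict]:
    """Keep only the tracks with exactly n active speakers."""
    return [row for row in rows if row["nspk"] == n]
```

A typical call would open the csv with `open(path, newline="")` and pass the handle to `read_recordings`, where `path` is wherever the recordings csv is stored locally.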

For separation, each track has 15s of spk1, then 15s of spk2, ..., followed by 15s of concurrent speech.

For enhancement, each track has 15s of spk1, 15s of noise, then 15s of noisy speech. The noise is diffuse.
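The track layout for both cases can be turned into segment boundaries as in the sketch below. It is an illustration only: the function name and labels are hypothetical, and it assumes every segment lasts exactly the nominal 15 seconds, which real recordings may not match to the sample.

```python
def track_segments(n_speakers: int, noise: bool = False, segment_s: float = 15.0):
    """Return (label, start_s, end_s) for each segment of a full track.

    Separation: spk1 .. spkN, then the concurrent mixture.
    Enhancement: spk1, noise, then the noisy mixture (noise=True).
    """
    labels = [f"spk{i}" for i in range(1, n_speakers + 1)]
    if noise:
        labels.append("noise")
    labels.append("mixture")
    # Segments are laid back to back, each segment_s seconds long
    return [(label, i * segment_s, (i + 1) * segment_s)
            for i, label in enumerate(labels)]
```

Multiplying the start and end times by the file's sample rate gives the frame ranges to slice out of the wav data.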

Microphones position

Overall, the microphones cover an area that is roughly an oval with a major axis of 2.3m and a minor axis of 1.5m. Within this area:

- The minimum distance between 2 mics is 0.1m (mics 54 and 64)
- The maximum distance between 2 mics is 2.3m (mics 73 and 82)
- The average distance is roughly 0.5m

Speakers position

There are 4 speakers. 2 of them are placed close together to simulate a conversation.

- The minimum distance between 2 speakers is 0.75m
- The maximum distance between 2 speakers is 1.90m
- The average distance between any 2 speakers is 1.5m

Speakers to mics

For each speaker, we report the distance to the closest and to the furthest microphone:

1 - min 0.55m max 1.80m
2 - min 0.50m max 1.90m
3 - min 0.30m max 2.40m
4 - min 0.70m max 2.30m
