deep-clustering's Introduction

A TensorFlow implementation of deep clustering for speech separation

This is a TensorFlow implementation of the deep clustering paper: https://arxiv.org/abs/1508.04306. A few examples from the test set can be found in visualization_samples/ and speech_samples/.
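For orientation, the training objective described in the paper is the squared Frobenius distance between the embedding affinity matrix VV^T and the ideal affinity matrix YY^T. The sketch below shows that objective in plain numpy, using the low-rank expansion so the large N x N matrices are never built. It is an illustration only: the function name `deep_clustering_loss` and the array shapes are assumptions, and it is not taken from model.py.

```python
import numpy as np


def deep_clustering_loss(V, Y):
    """Squared Frobenius norm of (V V^T - Y Y^T).

    V: (N, D) array of embeddings, one row per time-frequency bin.
    Y: (N, C) one-hot array assigning each bin to a speaker.
    Both the name and the shapes are assumptions made for this sketch.
    """
    # Expand ||VV^T - YY^T||_F^2 into three small (D or C sized) terms.
    VtV = V.T.dot(V)  # (D, D)
    VtY = V.T.dot(Y)  # (D, C)
    YtY = Y.T.dot(Y)  # (C, C)
    return np.sum(VtV ** 2) - 2.0 * np.sum(VtY ** 2) + np.sum(YtY ** 2)
```

The expansion costs O(N D^2) rather than O(N^2 D), which matters because N (the number of time-frequency bins) is much larger than the embedding dimension D.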

Requirements

Python 2 and the following packages:

  • tensorflow r0.11
  • numpy
  • scikit-learn
  • matplotlib
  • librosa

File documentation

  • GlobalConstant.py: Global constants.
  • datagenerator.py: Transform separate speech files in a directory into a .pkl data set.
  • datagenerator2.py: A class that reads the .pkl data set and generates batches of data for training the net.
  • model.py: A class defining the net structure.
  • train_net.py: Train the DC model.
  • mix_samples.py: Mix two speech signals to create test samples.
  • AudioSampleReader.py: Transform a .wav file into chunks of frames to be fed to the model during testing.
  • visualization_of_samples.py: Visualize the active embedding points using PCA.
  • audio_test.py: Take a two-speaker mixture and separate the speakers.
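As a rough idea of what mixing two speech signals for testing involves, here is a minimal numpy sketch. It assumes both signals are 1-D arrays at the same sample rate with samples in the range [-1, 1]; the function name `mix_two` and the peak-normalization step are assumptions for illustration and may differ from what mix_samples.py actually does.

```python
import numpy as np


def mix_two(signal_a, signal_b):
    """Sum two mono signals sample by sample (hypothetical helper).

    The longer signal is truncated to the length of the shorter one, and
    the mixture is rescaled only if its peak would exceed 1.0.
    """
    n = min(len(signal_a), len(signal_b))
    if n == 0:
        return np.zeros(0, dtype=np.float64)
    a = np.asarray(signal_a[:n], dtype=np.float64)
    b = np.asarray(signal_b[:n], dtype=np.float64)
    mixture = a + b
    peak = np.max(np.abs(mixture))
    if peak > 1.0:
        mixture = mixture / peak  # avoid clipping when written to a .wav file
    return mixture
```

The signals themselves could be loaded with librosa, which is already listed in the requirements.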

Training procedure

  1. Organize your speech data files in the following layout: root_dir/speaker_id/speech_files.wav
  2. Update the directory settings in datagenerator.py and run it. You may want to rename the resulting .pkl file.
  3. Create directories for summaries and checkpoints, and update the corresponding paths in train_net.py. Also update the lists of .pkl files used for training and validation.
  4. Train the model with train_net.py.
  5. Generate some mixtures using mix_samples.py, and update the checkpoint path in audio_test.py.
  6. Enjoy yourself!
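Before running the data generator, it can help to check that the data directory really follows the root_dir/speaker_id/speech_files.wav layout from the first step. The sketch below is a standalone check using only the standard library; `list_speech_files` is a hypothetical helper and is not part of this repository.

```python
import os


def list_speech_files(root_dir):
    """Map each speaker_id directory under root_dir to its sorted .wav paths.

    Hypothetical helper: files placed directly in root_dir and speaker
    directories without any .wav file are skipped.
    """
    speakers = {}
    for speaker_id in sorted(os.listdir(root_dir)):
        speaker_dir = os.path.join(root_dir, speaker_id)
        if not os.path.isdir(speaker_dir):
            continue
        wavs = sorted(
            name for name in os.listdir(speaker_dir)
            if name.lower().endswith(".wav")
        )
        if wavs:
            speakers[speaker_id] = [os.path.join(speaker_dir, name) for name in wavs]
    return speakers
```

Calling `list_speech_files("root_dir")` and printing the number of files per speaker gives a quick view of how balanced the data set is.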

Some other things

The optimizer differs from the one used in the original paper, and no three-speaker mixture generator is provided. We are moving on to the next stage of our work and do not plan to add one. If you are interested and implement it, we will be glad to merge your branch.

References

https://arxiv.org/abs/1508.04306

Contributors

  • zhr1201