9143-HPML-Project

  • Project title: Voice Separation and Optimization in Recurrent Neural Networks
  • Team members: Leo Hu, Junda Ai

Introduction

In this project we looked into two state-of-the-art voice separation model implementations and analyzed the effects of different optimization algorithms and techniques on them.

Spleeter

A modification of Spleeter (by Deezer) for experimenting with different optimizers.

Installation

Environment

# Install media/file dependencies using conda
conda install -c conda-forge ffmpeg libsndfile
# Clone spleeter repository
git clone https://github.com/Deezer/spleeter && cd spleeter
# Install poetry
pip install poetry
# Install spleeter dependencies
poetry install
# Run unit test suite
poetry run pytest tests/

To enable GPU

# Uninstall CPU tensorflow
pip uninstall tensorflow
# Install GPU tensorflow
pip install tensorflow-gpu==2.5.0

Modification

# Replace spleeter/model/__init__.py with __init__.py in spleeter_mod
# Replace spleeter/__main__.py with __main__.py in spleeter_mod
# Place spleeter_entry.py in base directory

Modification Base/
├─ spleeter (forked)/
│  ├─ model/
│  │  ├─ __init__.py (modified)
│  ├─ __main__.py (modified)
├─ spleeter_entry.py

Preparation

# prepare the dataset: MUSDB18-HQ is used
# prepare the csv: based on dataset used
# modify the config: modify the links in the config file
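The CSV step above could be scripted roughly as follows. This is only a sketch under assumptions: the column names (`mix_path`, `vocals_path`, `drums_path`, `bass_path`, `other_path`, `duration`) follow the layout of the MUSDB CSV files shipped with Spleeter as far as we recall, the dataset is assumed to hold one folder per track with `mixture.wav` and one WAV per stem, and the helper names (`build_rows`, `write_csv`) are hypothetical. Check the header against the CSV files your config actually references.

```python
import csv
import wave
from pathlib import Path

# Assumed stem file names: one WAV per stem plus the mixture in each track folder.
STEMS = ("mixture", "vocals", "drums", "bass", "other")


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def build_rows(dataset_root, split="train"):
    """Collect one row per track folder that contains every expected stem."""
    root = Path(dataset_root)
    rows = []
    for track_dir in sorted((root / split).iterdir()):
        files = {stem: track_dir / f"{stem}.wav" for stem in STEMS}
        if not track_dir.is_dir() or not all(p.is_file() for p in files.values()):
            continue  # skip incomplete or unrelated entries
        row = {}
        for stem in STEMS:
            column = "mix_path" if stem == "mixture" else f"{stem}_path"
            # Paths are written relative to the dataset root (an assumption).
            row[column] = files[stem].relative_to(root).as_posix()
        row["duration"] = round(wav_duration_seconds(files["mixture"]), 3)
        rows.append(row)
    return rows


def write_csv(rows, csv_path):
    """Write the rows with a header row; the column names are an assumption."""
    fieldnames = ["mix_path", "vocals_path", "drums_path",
                  "bass_path", "other_path", "duration"]
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Calling `write_csv(build_rows("path/to/datasets"), "train.csv")` would then produce one CSV per split; the paths here are placeholders.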

Training

  • To test different optimizers, modify the optimizer item in the config file
  • Logging frequency can be specified in __main__.py

When everything is ready:

python spleeter_entry.py train -p path/to/spleeter_config.json -d path/to/datasets --verbose
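To compare several optimizers without editing the config by hand each time, one config file per optimizer can be generated and passed to the command above through `-p`. The sketch below assumes the config is plain JSON with a top-level `optimizer` key (the "optimizer item" mentioned above) and, optionally, a `model_dir` key; the function name and the optimizer names in the usage note are hypothetical, so use whichever values the modified `spleeter/model/__init__.py` accepts.

```python
import json
from pathlib import Path


def write_optimizer_variants(base_config_path, optimizer_names, out_dir):
    """Write one copy of the base config per optimizer name and return the paths."""
    base = json.loads(Path(base_config_path).read_text())
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for name in optimizer_names:
        config = dict(base)
        config["optimizer"] = name  # assumed key name for the optimizer item
        # Keep the checkpoints of each run apart (assumes a "model_dir" key).
        if "model_dir" in config:
            config["model_dir"] = f"{base['model_dir']}_{name}"
        target = out_dir / f"spleeter_config_{name}.json"
        target.write_text(json.dumps(config, indent=4))
        written.append(target)
    return written
```

For example, `write_optimizer_variants("path/to/spleeter_config.json", ["Adam", "SGD"], "configs")` would write two files that can be trained one after another.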

Result

Results for the different optimizers are saved in the spleeter-results directory. spleeter_parse_data.ipynb provides functions for parsing and graphing the data.

Training loss over the first 8 steps using different optimizers:

Losses

More graphs can be found in the img directory.


SVoice

SVoice: Speaker Voice Separation using Neural Nets

  • Hydra to manage training configurations
  • Custom implementation of SI-SNR loss function
  • Adam optimizer
  • Gradient clipping
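For readers unfamiliar with SI-SNR (scale-invariant signal-to-noise ratio), the sketch below shows the common definition on plain Python lists. It is not the SVoice implementation, which operates on batched tensors and has to handle speaker permutations; the function names and the `eps` constant are illustrative assumptions.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    n = len(target)

    # Remove the mean so that a constant offset does not affect the score.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    est = [x - mean_e for x in estimate]
    tgt = [x - mean_t for x in target]

    # Project the estimate onto the target to get the "signal" part.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    projection = [scale * t for t in tgt]

    # Whatever is left after the projection counts as noise.
    noise = [e - p for e, p in zip(est, projection)]
    proj_energy = sum(p * p for p in projection)
    noise_energy = sum(x * x for x in noise)
    return 10 * math.log10((proj_energy + eps) / (noise_energy + eps))


def si_snr_loss(estimate, target):
    """Negative SI-SNR, so that minimizing the loss maximizes SI-SNR."""
    return -si_snr(estimate, target)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is the property that makes the metric suitable for separation models whose output gain is arbitrary.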

Getting started

The SVoice experiment was carried out on Google Colab.

Jupyter notebooks for plotting validation results:

Results

All training outputs can be found in svoice-results.

  • Adam yields the best results among the gradient descent optimizers tested
  • With Adam, a plateau learning rate scheduler (decay when a metric stops improving) works better than StepLR (step-wise decay)
  • Gradient clipping further improves performance

Gradient descent optimizers:

Optimizers

LR schedulers:

LR schedulers
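The difference between the two scheduling rules can be sketched in plain Python. This is a simplified illustration, not the scheduler code used in the experiments: `step_lr` and `PlateauScheduler` are hypothetical names, and the default `factor`, `patience`, and `gamma` values are arbitrary assumptions.

```python
def step_lr(base_lr, epoch, step_size, gamma=0.1):
    """Step-wise decay: multiply the rate by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


class PlateauScheduler:
    """Decay the rate only when the monitored metric stops improving."""

    def __init__(self, lr, factor=0.5, patience=2):
        self.lr = lr
        self.factor = factor      # multiplier applied on each decay
        self.patience = patience  # non-improving epochs tolerated before decay
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, metric):
        """Record one validation metric (lower is better) and return the rate."""
        if metric < self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr
```

The step rule decays on a fixed calendar regardless of progress, while the plateau rule keeps the rate high for as long as the validation metric keeps improving.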

Gradient clipping:

Gradient clipping
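As a reminder of what gradient clipping does, here is a minimal sketch of clipping by global norm on plain Python lists. It is an illustration of the general technique under simplifying assumptions, not the code used in the SVoice runs; the function name and the nested-list gradient layout are hypothetical.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients so their joint L2 norm does not exceed max_norm.

    grads is a list of per-parameter gradient lists. Returns the (possibly
    rescaled) gradients and the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for tensor in grads for g in tensor))
    if total_norm <= max_norm or total_norm == 0.0:
        return [list(tensor) for tensor in grads], total_norm
    scale = max_norm / total_norm
    return [[g * scale for g in tensor] for tensor in grads], total_norm
```

All gradients share one scale factor, so the update direction is preserved and only its length is capped, which limits the effect of occasional very large gradients in recurrent networks.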
