KWS_pytorch

Keyword spotting, speech wake-up, PyTorch, DNN, CNN, TDNN, DFSMN, LSTM

Introduction

  • The project is based on the ICASSP 2014 paper "Small-footprint keyword spotting using deep neural networks".

  • We implement the idea with various deep neural network architectures, e.g., DNN, CNN, TDNN, DFSMN, LSTM.

  • The project can be applied to several tasks, such as keyword spotting and speech wake-up.
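As a rough illustration of the idea in the cited paper, the sketch below smooths per-frame label posteriors over a short window and then combines the per-label maxima over a longer window into a single confidence score. It is written from the paper's description as I understand it, not taken from this repository: the function names, the list-of-lists input format, and the convention that label 0 is the filler (non-keyword) class are all assumptions, and the window sizes are only defaults.

```python
def smooth_posteriors(posteriors, w_smooth=30):
    """Average each label's posterior over the last `w_smooth` frames.

    `posteriors` is a list of frames; each frame is a list of
    per-label probabilities (assumed format, label 0 = filler).
    """
    smoothed = []
    for j in range(len(posteriors)):
        start = max(0, j - w_smooth + 1)
        window = posteriors[start:j + 1]
        n_labels = len(posteriors[j])
        smoothed.append(
            [sum(frame[i] for frame in window) / len(window)
             for i in range(n_labels)]
        )
    return smoothed


def confidence(smoothed, j, w_max=100):
    """Geometric mean, over the non-filler labels, of each label's
    maximum smoothed posterior in the last `w_max` frames ending at j."""
    start = max(0, j - w_max + 1)
    window = smoothed[start:j + 1]
    n_labels = len(smoothed[j])
    product = 1.0
    for i in range(1, n_labels):  # skip label 0 (assumed filler class)
        product *= max(frame[i] for frame in window)
    return product ** (1.0 / (n_labels - 1))
```

A detection would then be declared whenever the confidence at some frame exceeds a threshold chosen on held-out data; the threshold is not specified here.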

Documents

  • command_loader.py: CommandLoader is defined for data extraction. The data is structured as follows:

    • path/keyword/audio file (.wav)
  • model.py: Implementation of several backbones: DNN, CNN, TDNN, DFSMN, LSTM.

  • train.py: Definition of training & testing process.

  • run.py: Main program for training & testing. Possible parameters are explained below:
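The directory layout described for command_loader.py (one sub-directory per keyword, each holding .wav files) can be indexed with a few lines of standard-library Python. This is only a sketch of that layout, not the repository's CommandLoader: the function name `list_samples` and the choice to assign integer labels in alphabetical order of keyword are assumptions.

```python
from pathlib import Path


def list_samples(root):
    """Index a `root/keyword/*.wav` tree.

    Returns (keywords, samples), where `keywords` is the sorted list of
    sub-directory names and `samples` is a list of (wav_path, label_index)
    pairs. Non-.wav files are ignored.
    """
    root = Path(root)
    keywords = sorted(d.name for d in root.iterdir() if d.is_dir())
    label_of = {name: idx for idx, name in enumerate(keywords)}
    samples = []
    for name in keywords:
        for wav in sorted((root / name).glob("*.wav")):
            samples.append((wav, label_of[name]))
    return keywords, samples
```

The resulting list of pairs could back a dataset class that loads each file and extracts features on demand; how the actual loader does this is not shown here.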

Datasets

Speech wake-up:

  • MobvoiHotwords: A corpus of wake-up words collected from a commercial smart speaker of Mobvoi.
    • Contains audio of "Hi xiaowen" and "Nihao Wenwen", as well as noise speech.
    • Homepage

Keyword spotting:

  • Synthetic Speech Commands Dataset.
    • Consists of keyword audio in thirty categories, e.g., "bed", "bird", "cat", "dog", "eight", "five", "stop", "wow", "zero".
    • Download link

Visualization of Results

Keyword spotting:

  • Batchnums-Accuracy curve with STFT: [Figure: acc]

  • Batchnums-Accuracy curve with Deep KWS: [Figure: acc_KWS]

  • Accuracy:
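
| Module   | Epoch 1 | Epoch 2 | Epoch 3 | Epoch 4 | Epoch 5 | Test   |
|----------|---------|---------|---------|---------|---------|--------|
| DNN      | 38.57%  | 52.85%  | 58.81%  | 67.48%  | 71.00%  | 62.59% |
| CNN      | 95.30%  | 96.12%  | 96.30%  | 97.20%  | 96.75%  | 95.17% |
| TDNN     | 70.10%  | 69.02%  | 74.35%  | 77.87%  | 80.76%  | 76.50% |
| LSTM     | 57.36%  | 74.35%  | 75.16%  | 79.31%  | 81.39%  | 78.75% |
| DFSMN    | 91.15%  | 92.14%  | 94.94%  | 93.86%  | 94.04%  | 90.34% |
| DNN(KWS) | 87.97%  | 90.37%  | 91.04%  | 91.18%  | 90.44%  | 89.67% |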

Speech wake-up:

  • Accuracy with Deep KWS: [Figure: acc]

  • Loss with Deep KWS: [Figure: loss]

Contributors

  • hongfeixue
