verojulianaschmalz / e2e-sentence-classification-on-fluent-speech-commands


Intent Classification using the Fluent Speech Commands Dataset

Python 100.00%
end2endslu fluentspeechcommands spokenlanguageunderstanding sentence-classification intent-detection slu

e2e-sentence-classification-on-fluent-speech-commands's Introduction

End-to-End Sentence-Classification-on-FSCs-Schmalz

End-to-End Sentence Classification based on the Fluent Speech Commands Dataset

Using the open-source Fluent Speech Commands dataset (available at https://fluent.ai/fluent-speech-commands-a-dataset-for-spoken-language-understanding-research), we consider an end-to-end phrase classification task. The dataset contains 30,043 utterances and 248 possible sentences. The utterances are spoken by both native and non-native English speakers and include phrases like "turn off the lights in the kitchen" or "heat up in the living room". The possible sentences define the action, object and location of each utterance, for example "deactivate, lights, kitchen" or "increase, heat, living room" for the two phrases above.

To address the task, we adopt a neural framework that classifies each utterance into one of the possible sentences. The model used in the experiment is a Temporal Convolutional Network (TCN, available at https://github.com/asteroid-team/asteroid, from https://github.com/popcornell/OSDC). The network receives as input fixed-length sequences of 40 Mel filter-bank features. The signal length is limited to 4 seconds, or 64000 samples, which corresponds to a 16 kHz sampling rate. Filter banks are computed on 20 ms windows with a 10 ms hop size, resulting in 400 frames.
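The figures in the paragraph above can be checked with a few lines of arithmetic. This is a minimal sketch, not code from the repository: the helper names fix_length and frame_count are hypothetical, the 16 kHz sampling rate is inferred from 64000 samples over 4 seconds, and the frame count assumes one frame per hop (other framing conventions, such as centered frames, can give one frame more).

```python
SAMPLE_RATE = 16_000   # assumption: inferred from 64000 samples / 4 s
MAX_SECONDS = 4
WIN_MS, HOP_MS = 20, 10


def fix_length(samples, target_len):
    """Truncate or zero-pad a sequence of samples to exactly target_len."""
    clipped = list(samples[:target_len])
    return clipped + [0.0] * (target_len - len(clipped))


def frame_count(num_samples, hop_len):
    """Number of frames when one frame starts every hop_len samples."""
    return num_samples // hop_len


target_len = SAMPLE_RATE * MAX_SECONDS    # samples kept per utterance
win_len = SAMPLE_RATE * WIN_MS // 1000    # samples per analysis window
hop_len = SAMPLE_RATE * HOP_MS // 1000    # samples between frame starts

short_utterance = [0.1] * 10_000          # toy signal shorter than 4 s
padded = fix_length(short_utterance, target_len)
print(len(padded), win_len, hop_len, frame_count(len(padded), hop_len))
```

With 40 Mel bands per frame, each utterance then becomes a 40 x 400 feature matrix regardless of its original duration.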

Pre-processing

  • cutfiles.py: trims leading and trailing silence from each recording:
fcut, index = librosa.effects.trim(f, frame_length=2098, hop_length=562)

The data are expected in the folder ./fluent_speech_commands_dataset/.
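As a rough illustration of what the trimming step does, the sketch below re-implements a simplified energy-based trim in plain Python. It is not librosa's implementation: the function name trim_silence, the 60 dB threshold, and the absence of frame centering are assumptions made for the example; only frame_length=2098 and hop_length=562 come from the cutfiles.py line above.

```python
import math


def trim_silence(samples, frame_length=2098, hop_length=562, top_db=60.0):
    """Return (trimmed_samples, (start, end)) without leading/trailing quiet frames.

    A frame counts as non-silent when its RMS is within top_db decibels
    of the loudest frame. Simplified: frames are not centered or padded.
    """
    n = len(samples)
    starts = list(range(0, max(n - frame_length, 0) + 1, hop_length))
    rms = [
        math.sqrt(sum(x * x for x in samples[s:s + frame_length]) / frame_length)
        for s in starts
    ]
    peak = max(rms)
    if peak == 0.0:
        return [], (0, 0)  # the whole signal is silent
    threshold = peak * 10 ** (-top_db / 20)
    keep = [i for i, r in enumerate(rms) if r > threshold]
    start = starts[keep[0]]
    end = min(starts[keep[-1]] + frame_length, n)
    return samples[start:end], (start, end)


# Toy signal: silence, a constant-amplitude burst, then silence again.
signal = [0.0] * 5000 + [0.5] * 5000 + [0.0] * 5000
trimmed, (start, end) = trim_silence(signal)
print(start, end, len(trimmed))
```

The returned interval always covers the non-silent burst, at the granularity of the hop size, so a smaller hop_length gives a tighter cut at a higher computational cost.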

Training

  • main.py :
python3.6 main.py -n TCN -m models/tcn_b5r2.pkl -b 5 -r 2 -lr 0.001 -e 100
  • -n: type of net
  • -m: model
  • -b: number of blocks of the TCN network
  • -r: number of repeats of the TCN network
  • -lr: learning rate
  • -e: number of epochs
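The flags listed above could be parsed along the following lines. This is a hypothetical reconstruction, not the parser actually used in main.py: the function name build_parser, the dest names, the types, and the help strings are assumptions; only the flag names and their meanings come from the list.

```python
import argparse


def build_parser():
    # Hypothetical parser: dest names and types are assumptions.
    parser = argparse.ArgumentParser(
        description="Train a TCN on Fluent Speech Commands"
    )
    parser.add_argument("-n", dest="net", type=str, help="type of net, e.g. TCN")
    parser.add_argument("-m", dest="model", type=str, help="path of the model file")
    parser.add_argument("-b", dest="blocks", type=int, help="number of TCN blocks")
    parser.add_argument("-r", dest="repeats", type=int, help="number of TCN repeats")
    parser.add_argument("-lr", dest="learning_rate", type=float, help="learning rate")
    parser.add_argument("-e", dest="epochs", type=int, help="number of epochs")
    return parser


# Parse the exact command line shown in the training example.
args = build_parser().parse_args(
    "-n TCN -m models/tcn_b5r2.pkl -b 5 -r 2 -lr 0.001 -e 100".split()
)
print(args.net, args.model, args.blocks, args.repeats, args.learning_rate, args.epochs)
```

Declaring types on the numeric flags means a malformed value such as "-b five" is rejected at startup rather than failing later during model construction.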

Evaluation

  • evaluation.py:
python3.6 evaluation.py -n TCN -m models/tcn_b5r2.pkl -b 5 -r 2 

With the training parameters suggested above, the expected accuracy is 0.816870 on the validation set and 0.934880 on the evaluation set.

e2e-sentence-classification-on-fluent-speech-commands's People

Contributors

abrutti, verojulianaschmalz


e2e-sentence-classification-on-fluent-speech-commands's Issues

Dataset path

Add the dataset path as a parameter (it is currently "fluent_speech_command_dataset" and is hard-coded), in both main.py and evaluation.py.

dataset

As the official link does not work, could the dataset be shared?

Number of workers

- Use the same number of workers for all dataloaders (in both main and evaluation).
- Add the option to select it as an argument (6 workers may be too many for some machines).

Remove the other models

Remove the commented-out lines in main.py, the if/then/else block in evaluation.py, and the -n parameter (that one can be left commented out).
