
speech_commands_distillation_torch_lightling

1. What we are doing

1.1. The problem:

  • Keyword spotting in audio.
  • Dataset: Speech Commands

1.2. What we care about

  • High accuracy on the test set
  • Very small model size for edge deployment

1.3. What we are doing here

  • We will use model distillation to transfer knowledge from a big model to a small one
  • We will use Optuna for hyperparameter search
  • We will use PyTorch Lightning as the training boilerplate for this project
  • We will use Weights & Biases as the monitoring tool
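For reference, the usual distillation objective mixes a soft-target term (the student matching the teacher's temperature-softened output distribution) with the ordinary hard-label loss. A minimal sketch in PyTorch, in the classic Hinton style; the temperature `T` and weight `alpha` below are illustrative defaults, not values taken from this repository:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Hinton-style knowledge-distillation loss (illustrative sketch)."""
    # Soft-target term: KL divergence between temperature-softened
    # student and teacher distributions, scaled by T^2 to keep gradient
    # magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy on ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

With `alpha=1.0` and identical student/teacher logits the loss is zero, which is a quick sanity check for the implementation.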

2. Shared development environment via VS Code Remote Containers

Please read through the concepts here

This will spin up the development environment with minimal setup.

  1. Install and configure a Git credential manager - this shares your Git configuration with the container

  2. Run the "Remote-Containers: Reopen in Container" command

3. Steps to train the model

  1. Train the simple convolution model
    python train.py
  2. Train the BC ResNet model
    python train.py --model bc_resnet

4. Steps to test the model

  1. Test the simple convolution model
    python test.py --pretrain path_to_pretrain
  2. Test the BC ResNet model
    python test.py --model bc_resnet --pretrain path_to_pretrain

5. Results

a. Models included in this work - no hyperparameter search

| Model | Description | Params | Test accuracy |
| --- | --- | --- | --- |
| Simple Convolution | A straightforward 1D convolution | 26,900 | 94.2% |
| BC ResNet | Experiment logging | 10,600 | 95.6% |

b. Models optimized with Optuna

| Model | Description | Params | Test accuracy |
| --- | --- | --- | --- |
| Simple Convolution | A straightforward 1D convolution | 35,000 | 95.1% |
| BC ResNet | Experiment logging | 22,000 | 98.3% (best) |

c. Models trained with distillation loss

| Model | Description | Params | Test accuracy |
| --- | --- | --- | --- |
| Simple Convolution | A straightforward 1D convolution | 28,600 | 90.3% |
| BC ResNet | Experiment logging | - | - |

d. Highlights

  • The best model has 22k parameters and 98.3% accuracy on the test set (Optuna-optimized)
  • This is close to the state of the art (98.5%)
  • The model is smaller than other state-of-the-art models by orders of magnitude
  • The distillation process was not successful; it made the model perform worse than training without distillation

6. Other development setup

  1. Install dependencies
    pip install poetry
    poetry install
  2. Add a new dependency
    poetry add package_name

7. Link to trained model + resource
