gedebabin / dbt-net


The audio demos with respect to the paper "DBT-Net: Dual-branch federative magnitude and phase estimation with attention-in-attention transformer for monaural speech enhancement" are provided (submitted to TASLP). The code will also be released soon.


dbt-net's Introduction

DBT-Net

Audio demos for the paper "DBT-Net: Dual-branch federative magnitude and phase estimation with attention-in-attention transformer for monaural speech enhancement" are provided (accepted by IEEE TASLP). The code and the pretrained models are also released.

Overall architecture:

Code:

You can use dual_aia_trans_merge_crm() in aia_trans.py for dual-branch SE, while aia_complex_trans_mag() and aia_complex_trans_ri() are single-branch approaches. The trained weights on the VB dataset, the 30h WSJ0-SI84 dataset and the 300h 2020 DNS-Challenge dataset are also provided. You can directly perform inference or fine-tune the model by using vb_aia_merge_new.pth.tar.
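A minimal sketch of preparing a checkpoint for loading. It assumes (this README does not confirm it) that the .pth.tar file is either a raw state dict or a dict that wraps one under a key such as state_dict, and that it may have been saved from an nn.DataParallel model, which prefixes parameter names with "module.". The helper name extract_state_dict and the candidate keys are hypothetical; check aia_trans.py for the actual constructor arguments of dual_aia_trans_merge_crm().

```python
from collections import OrderedDict

# Hypothetical wrapper keys; inspect the real checkpoint to see which applies.
CANDIDATE_KEYS = ("state_dict", "model_state_dict")


def extract_state_dict(checkpoint):
    """Return a flat parameter dict with any 'module.' prefix removed."""
    state = checkpoint
    if isinstance(checkpoint, dict):
        for key in CANDIDATE_KEYS:
            if key in checkpoint and isinstance(checkpoint[key], dict):
                state = checkpoint[key]
                break
    cleaned = OrderedDict()
    for name, value in state.items():
        # nn.DataParallel stores parameters as 'module.<name>'.
        if name.startswith("module."):
            name = name[len("module."):]
        cleaned[name] = value
    return cleaned


# Intended use with PyTorch (not executed here):
#   import torch
#   from aia_trans import dual_aia_trans_merge_crm
#   model = dual_aia_trans_merge_crm()  # assumes default arguments suffice
#   ckpt = torch.load("vb_aia_merge_new.pth.tar", map_location="cpu")
#   model.load_state_dict(extract_state_dict(ckpt))
#   model.eval()
```

If load_state_dict still reports missing or unexpected keys, printing a few keys from the checkpoint usually shows which of the assumptions above does not hold.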

requirements:

CUDA 10.1
torch == 1.8.0
pesq == 0.0.1
librosa == 0.7.2
SoundFile == 0.10.3
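
A small sketch for checking an existing environment against the pins above. It uses importlib.metadata from the standard library, so it assumes Python 3.8 or newer; the helper name check_pins is made up for this example. CUDA is not a pip package and is not checked.

```python
from importlib import metadata

# Pins copied from the requirements list above.
PINS = {
    "torch": "1.8.0",
    "pesq": "0.0.1",
    "librosa": "0.7.2",
    "SoundFile": "0.10.3",
}


def check_pins(pins):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Treat local build tags ('1.8.0+cu101') and post releases
        # ('0.10.3.post1') as matching the pinned version.
        matches = installed is not None and (
            installed == wanted
            or installed.startswith(wanted + "+")
            or installed.startswith(wanted + ".post")
        )
        report[name] = (installed, matches)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_pins(PINS).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: wanted {PINS[name]}, found {installed} [{status}]")
```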

How to train

Step1

Prepare your data. Run json_extract.py to generate JSON files that record the utterance file names for both the training and validation sets.

# Run json_extract.py
python json_extract.py
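
For orientation, this step can be sketched as follows. This is not the repository's json_extract.py: the function name write_file_list, the flat-list JSON layout and the example paths are assumptions made for illustration, so check the actual script for the format the training code expects.

```python
import json
import os


def write_file_list(wav_dir, json_path, ext=".wav"):
    """Write the sorted file names in wav_dir that end with ext to json_path."""
    names = sorted(f for f in os.listdir(wav_dir) if f.lower().endswith(ext))
    with open(json_path, "w") as fh:
        json.dump(names, fh, indent=2)
    return names


# Hypothetical usage, one JSON file per split:
#   write_file_list("data/train/clean", "json/train_files.json")
#   write_file_list("data/val/clean", "json/val_files.json")
```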

Step2

Change the parameter settings according to your directory layout (in config_vb.py or config_dns.py).

Step3

Network training (you can also use the aia_complex_trans_mag() and aia_complex_trans_ri() networks in aia_trans.py for single-branch SE).

# Run main_vb.py or main_dns.py to begin network training
# solver_merge.py and train_merge.py contain the detailed training process
python main_vb.py

Inference:

The trained weights are provided in BEST_MODEL.

# Run enhance_vb.py or enhance_wsj.py to enhance the noisy speech samples
python enhance_vb.py

Experimental Results

WSJ0-SI84 Dataset

DNSMOS

Voice-Bank + Demand dataset

Spectrogram Visualization
