ECAPA-TDNN-for-Depression

Code for the paper "ECAPA-TDNN Based Depression Detection from Clinical Speech".

Environments

The experimental environment is listed below. Using different package versions may cause problems.

python==3.7
torch==1.9.0
torchaudio==0.9.0
pandas==1.3.0
numpy==1.20.3
matplotlib==3.4.2
librosa==0.8.1
scikit-learn==1.0.2

DataSets

Our corpus was recorded during HAMD (Hamilton Depression Rating Scale) interviews and contains audio only. The corpus is described in detail in our article.

The corpus used in this paper is not publicly available at the moment. This may change in the future as we continue to collect and expand the existing dataset.

Features

MFCCs are used as the input features to the neural network. Before feature extraction, the raw speech data is processed in the following steps:

  • First, the sound channels are separated. During recording, each speaker was saved to a separate channel: the doctor's voice in the left channel and the subject's voice in the right channel.
  • Next, speech is separated from noise using Voice Activity Detection (VAD). The simplest double-threshold method is used.
  • These steps yield subject speech of varying lengths. The speech is cut into three-second segments with 50% overlap, and MFCC features are then extracted from each segment.
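
The steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code used in the paper. All function names, the non-overlapping framing, and the `low`/`high` thresholds are assumptions made for the example. The VAD shown is an energy-only variant of the double-threshold idea (some formulations also use the zero-crossing rate), and trailing audio shorter than one window is simply dropped.

```python
import array
import sys
import wave


def split_stereo(path):
    """Read a 16-bit stereo WAV file and return (sample_rate, left, right)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        sample_rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    # Stereo PCM is interleaved: L, R, L, R, ...
    return sample_rate, list(samples[0::2]), list(samples[1::2])


def frame_energies(signal, frame_len):
    """Mean squared amplitude of consecutive, non-overlapping frames."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def double_threshold_vad(energies, low, high):
    """Return speech regions as (start_frame, end_frame), end exclusive.

    A run of frames with energy >= low is kept only if at least one of
    its frames also reaches the stricter threshold `high`.
    """
    regions, i, n = [], 0, len(energies)
    while i < n:
        if energies[i] < low:
            i += 1
            continue
        start, has_peak = i, False
        while i < n and energies[i] >= low:
            has_peak = has_peak or energies[i] >= high
            i += 1
        if has_peak:
            regions.append((start, i))
    return regions


def keep_speech(signal, regions, frame_len):
    """Concatenate the samples covered by the detected frame regions."""
    speech = []
    for start, end in regions:
        speech.extend(signal[start * frame_len:end * frame_len])
    return speech


def segment(signal, sample_rate, seconds=3.0, overlap=0.5):
    """Cut a signal into fixed-length windows with fractional overlap."""
    win = int(seconds * sample_rate)
    hop = max(1, int(win * (1.0 - overlap)))
    return [signal[i:i + win] for i in range(0, len(signal) - win + 1, hop)]
```

In this sketch, the right channel returned by `split_stereo` would be passed through `frame_energies`, `double_threshold_vad`, and `keep_speech`, and the result cut with `segment`. Each segment could then be converted to MFCCs, for example with `librosa.feature.mfcc`; the MFCC settings are not given in this README, so they are left out here.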

Models

We use ECAPA-TDNN for depression classification in this paper because of its excellent performance in various tasks.

ECAPA-TDNN was adapted from speaker recognition to speech classification; in a sense, it is a simplified version.

For more details about the model, please refer to ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification.

In the subsequent experiments, the model was found to be significantly better than DepAudioNet, ResNet, X-Vector, and other models, as well as some traditional hand-crafted feature methods.

ECAPA-TDNN has been validated to perform well on classification tasks, and we simply use it for classification. However, there are also some interesting findings. In our experiments, we also extracted embeddings from the model and used t-SNE to visualize the relationship between embeddings and speakers. We found that the speech embeddings of a single speaker always formed a cluster, and that those from depressed speech were more dispersed. This remains to be verified further.
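
One possible way to put a number on "more dispersed" is sketched below. This is not an analysis from the paper: the function name `speaker_dispersion` and the choice of metric (mean Euclidean distance of each speaker's embeddings to that speaker's centroid, computed in the original embedding space rather than in the t-SNE projection) are assumptions for illustration only.

```python
import math
from collections import defaultdict


def speaker_dispersion(embeddings, speaker_ids):
    """Mean Euclidean distance of each speaker's embeddings to their centroid."""
    groups = defaultdict(list)
    for vec, spk in zip(embeddings, speaker_ids):
        groups[spk].append(vec)

    dispersion = {}
    for spk, vecs in groups.items():
        # Centroid: per-dimension mean over this speaker's embeddings.
        centroid = [sum(col) / len(vecs) for col in zip(*vecs)]
        distances = [
            math.sqrt(sum((a - b) ** 2 for a, b in zip(vec, centroid)))
            for vec in vecs
        ]
        dispersion[spk] = sum(distances) / len(distances)
    return dispersion
```

Comparing the resulting values between depressed and non-depressed speakers would be one simple check of the observation, under the assumption that distance to the centroid is a reasonable proxy for how spread out a cluster is.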

Citation

@inproceedings{inproceedings,
  author = {Wang, Dong and Ding, Yanhui and Zhao, Qing and Yang, Peilin and Tan, Shuping and Li, Ya},
  year = {2022},
  month = {09},
  pages = {3333-3337},
  title = {ECAPA-TDNN Based Depression Detection from Clinical Speech},
  doi = {10.21437/Interspeech.2022-10051}
}


