
mendaxfz / synthetic-voice-detection


Google-TTS-Wavenet/Amazon-Polly/Microsoft-TTS/DeepVoice3

License: GNU Affero General Public License v3.0

Python 3.95% Jupyter Notebook 96.05%
machine-learning digital-signal-processing google-tts-wavenet deepfake-detection

synthetic-voice-detection's Introduction

Synthetic-Voice-Detection

Dataset

Dataset - http://bil.eecs.yorku.ca/datasets/

  • Total of 17,870 utterances.
  • Training Set - Contains 77.3% of the dataset and is used to train the ML model. Gender and class balanced.
  • Validation Set - Contains 15.58% of the dataset and is used to validate the accuracy of the ML model. Gender and class balanced.
  • Generalization Testing - Contains 6.68% of the dataset, consisting only of voices from one unseen algorithm (Google TTS Wavenet) and unseen real voices. Gender and class balanced. Used to check whether the model has learnt to generalize.
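As a quick sanity check on the split sizes, the stated percentages can be turned into approximate utterance counts. This is only a sketch: rounding to the nearest utterance is an assumption, and the variable and function names are illustrative. The three percentages add up to 99.56%, so the rounded counts will not sum to the full 17,870; the README does not say where the remainder goes.

```python
# Approximate utterance counts per split, derived from the README's percentages.
TOTAL_UTTERANCES = 17_870

SPLIT_PERCENT = {
    "training": 77.3,
    "validation": 15.58,
    "generalization_testing": 6.68,
}


def approximate_counts(total, percents):
    """Return a rounded utterance count for each split (rounding is an assumption)."""
    return {name: round(total * pct / 100) for name, pct in percents.items()}


counts = approximate_counts(TOTAL_UTTERANCES, SPLIT_PERCENT)
covered_percent = round(sum(SPLIT_PERCENT.values()), 2)

for name, count in counts.items():
    print(f"{name}: ~{count} utterances")
print(f"share of the dataset covered by the three splits: {covered_percent}%")
```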

Feature Extraction

Extracting MFCCs (mel-frequency cepstral coefficients) essentially summarises the frequency distribution within each analysis window, which makes it possible to analyse both the frequency and the temporal characteristics of a sound wave. I used the Librosa library, alongside a custom script, to extract the MFCCs of all the audio files in the dataset and saved the processed dataset as an HDF5 file. This reduced the size of the dataset by a factor of about 5.
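The extraction step described above could look roughly like the sketch below. It is not the repository's custom script. The sample rate, number of coefficients, fixed frame count, HDF5 dataset names and output file name are all assumptions chosen for illustration. Librosa and h5py are imported inside the functions that need them, so the padding helper can be used on its own.

```python
import numpy as np


def fix_length(mfcc, n_frames):
    """Zero-pad or truncate along the time axis so every clip has n_frames columns."""
    if mfcc.shape[1] >= n_frames:
        return mfcc[:, :n_frames]
    pad = n_frames - mfcc.shape[1]
    return np.pad(mfcc, ((0, 0), (0, pad)), mode="constant")


def extract_mfcc(path, sr=16000, n_mfcc=40, n_frames=64):
    """Load one audio file and return a fixed-size MFCC matrix (n_mfcc x n_frames).

    sr, n_mfcc and n_frames are assumed values, not taken from the README.
    """
    import librosa  # third-party; imported lazily

    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return fix_length(mfcc, n_frames).astype(np.float32)


def save_features(paths, labels, out_path="features.h5"):
    """Extract MFCCs for every file and store them with their labels in one HDF5 file.

    The dataset names "mfcc" and "label" and the output path are hypothetical.
    """
    import h5py  # third-party; imported lazily

    features = np.stack([extract_mfcc(p) for p in paths])
    with h5py.File(out_path, "w") as f:
        f.create_dataset("mfcc", data=features, compression="gzip")
        f.create_dataset("label", data=np.asarray(labels, dtype=np.int8))
```

Storing fixed-size float32 matrices instead of raw waveforms is one plausible reason for a size reduction like the one reported, although the exact factor depends on the parameters chosen.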

Network Architecture

Resources - https://arxiv.org/pdf/1812.00149.pdf

Future Tasks:

  • Data Augmentation - more data will help the model generalise with higher accuracy. Augmentations such as noise injection, time shifting, pitch change and speed change could be implemented with NumPy and Librosa.
  • Exploring the for-rerec dataset and a possible merge with the for-2seconds dataset. This would help in situations where an attacker plays the synthetic speech on one device and records it with another, making the model more resilient.
  • Exploring the Constant-Q transform (CQT) instead of MFCCs for feature extraction. This could be implemented with Librosa.
  • Transfer Learning with model architectures such as Inception and MobileNet with ImageNet weights - this might improve the classification accuracy. However, these are relatively large models and might sacrifice speed in a real-time classification scenario.
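The augmentation and CQT ideas in the list above can be sketched as follows. None of this is implemented in the repository. The function names and default values (noise factor, pitch steps, speed rate) are illustrative assumptions. Noise injection and time shifting need only NumPy, while pitch change, speed change and the CQT rely on Librosa, which is imported inside those functions.

```python
import numpy as np


def add_noise(y, noise_factor=0.005, rng=None):
    """Noise injection: add scaled Gaussian noise to a floating-point waveform."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(len(y))
    return (y + noise_factor * noise).astype(y.dtype)


def time_shift(y, shift_samples):
    """Shift right (positive) or left (negative), filling the gap with silence."""
    shifted = np.zeros_like(y)
    if shift_samples > 0:
        shifted[shift_samples:] = y[:-shift_samples]
    elif shift_samples < 0:
        shifted[:shift_samples] = y[-shift_samples:]
    else:
        shifted[:] = y
    return shifted


def change_pitch(y, sr, n_steps=2.0):
    """Pitch change by n_steps semitones without changing the duration."""
    import librosa  # third-party; imported lazily

    return librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)


def change_speed(y, rate=1.1):
    """Speed change: rate > 1 shortens the clip, rate < 1 lengthens it."""
    import librosa  # third-party; imported lazily

    return librosa.effects.time_stretch(y, rate=rate)


def cqt_features(y, sr):
    """Constant-Q transform magnitudes in decibels, as a possible alternative to MFCCs."""
    import librosa  # third-party; imported lazily

    magnitude = np.abs(librosa.cqt(y, sr=sr))
    return librosa.amplitude_to_db(magnitude, ref=np.max)
```

Each augmentation returns a new array and leaves the input untouched, so several augmented copies can be generated from one original clip.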


synthetic-voice-detection's Issues

Labelisation

Hey!

Thanks for sharing the project. My question is about the data labelling. I read the dataset documentation and your README, but I could not find anywhere whether '0' or '1' means Speech.

Thanks
