This project is forked from cmu-mlsp/reconstructing_faces_from_voices

An example of the paper "reconstructing faces from voices"

License: GNU General Public License v3.0

Reconstructing faces from voices

Implementation of the paper "Reconstructing faces from voices"

Yandong Wen, Rita Singh, and Bhiksha Raj

Machine Learning for Signal Processing Group

Carnegie Mellon University

Requirements

This implementation is based on Python 3.7 and PyTorch 1.1.

We recommend using conda to install the dependencies. All requirements are listed in requirements.txt. Run the following command to create a new conda environment with all the dependencies.

$ ./install.sh

After running the script above, activate the environment in which the packages were installed. The environment is called voice2face and can be activated with:

$ source activate voice2face

NOTE: If you get an error saying that "webrtcvad" cannot be found, make sure the pip in your PATH is the one inside your environment. This can happen if you have multiple installations of pip (inside and outside the environment).
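One way to diagnose the pip mismatch described in the note is to ask the active Python interpreter which environment it belongs to and which modules it can see. The following is a minimal sketch that uses only the standard library. The helper name missing_modules is hypothetical and not part of this repository, and the module list is an assumption based on the packages mentioned in this README.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # sys.executable and sys.prefix show which interpreter and environment are active.
    print("interpreter:", sys.executable)
    print("environment:", sys.prefix)
    # Assumed module names: webrtcvad (from the note above) and torch (PyTorch).
    print("missing:", missing_modules(["webrtcvad", "torch"]))
```

If the interpreter path is not inside the voice2face environment, or webrtcvad is reported as missing, installing with "python -m pip install webrtcvad" from the activated environment ties pip to that interpreter rather than to whichever pip comes first in PATH.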

Processed data

The processed training data used for this paper are listed below. Please feel free to download them.

Voice data (log mel-spectrograms): Google Drive

Face data (aligned face images): Google Drive

Once downloaded, update the variables voice_dir and face_dir with the corresponding paths.
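Before training, it can help to confirm that the two paths point at real, non-empty folders. The sketch below is an illustration only: check_data_dirs is a hypothetical helper, not something this repository provides, and it makes no assumption about the file layout inside the folders beyond their existence.

```python
from pathlib import Path


def check_data_dirs(voice_dir, face_dir):
    """Return a list of problems found with the two data directories."""
    problems = []
    for label, raw in (("voice_dir", voice_dir), ("face_dir", face_dir)):
        path = Path(raw).expanduser()
        if not path.is_dir():
            problems.append(f"{label}: {path} is not a directory")
        elif not any(path.iterdir()):
            problems.append(f"{label}: {path} is empty")
    return problems
```

An empty list means both folders exist and contain at least one entry. Any other result names the variable that needs fixing.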

Configurations

See config.py for how to change the configuration.

Train

We provide pretrained models, including a voice embedding network and a trained generator, in pretrained_models/. Alternatively, you can train your own generator by running the training script:

$ python gan_train.py

The trained model is saved as models/generator.pth.

Test

We provide some examples of generated faces (in data/example_data/) produced with the model in pretrained_models/. To generate faces for your own voice recordings using the trained model, set the test_data variable (the folder containing the voice recordings) and the model_path variable (the path to the generator) in config.py, then run:

$ python gan_test.py

Results are written to the test_data folder. For each voice recording named <filename>.wav, a face image named <filename>.png is generated.
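The naming rule above can be sketched with pathlib. This is an illustration of the <filename>.wav to <filename>.png convention only, not code taken from gan_test.py. Both function names are hypothetical.

```python
from pathlib import Path


def expected_face_path(voice_path):
    """Map <filename>.wav to <filename>.png in the same folder."""
    return Path(voice_path).with_suffix(".png")


def pair_recordings(test_data):
    """List (recording, expected image) pairs for every .wav file in a folder."""
    folder = Path(test_data)
    return [(wav, expected_face_path(wav)) for wav in sorted(folder.glob("*.wav"))]
```

After running the test script, checking that each expected image exists is a quick way to spot recordings that were skipped.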

Note: Currently, only single-channel voice recordings at a 16 kHz sample rate are supported. Voice and face files whose names start with A-E belong to the validation or test set, while those starting with F-Z belong to the training set.
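Both constraints in the note can be checked before running the test script. The sketch below uses the standard-library wave module, which reads uncompressed PCM WAV files and raises wave.Error for other encodings. The function names are hypothetical, and treating lowercase first letters the same as uppercase ones is an assumption, since the note does not say how case is handled.

```python
import wave
from pathlib import Path


def is_supported_wav(path):
    """True if the file is a PCM WAV with one channel at 16000 Hz."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnchannels() == 1 and wav.getframerate() == 16000


def split_for(filename):
    """Return the split implied by the first letter of the file name."""
    # Assumption: case is ignored when comparing the first letter.
    first = Path(filename).name[:1].upper()
    if "A" <= first <= "E":
        return "validation/test"
    if "F" <= first <= "Z":
        return "train"
    return "unknown"
```

Recordings that fail the first check would need to be converted to mono at 16 kHz with an audio tool of your choice before use.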

Citation

@article{wen2019reconstructing,
  title={Reconstructing faces from voices},
  author={Wen, Yandong and Singh, Rita and Raj, Bhiksha},
  journal={arXiv preprint arXiv:1905.10604},
  year={2019}
}

Contribution

We welcome contributions from everyone and are always working to make this project better. Please open a pull request or raise an issue, and we will be happy to help.

License

This repository is licensed under GNU GPL-3.0. Please refer to LICENSE.md.
