garyzhao / frgan

The PyTorch implementation of "Learning to Forecast and Refine Residual Motion for Image-to-Video Generation" (ECCV 2018).

Home Page: https://garyzhao.github.io/FRGAN

License: Apache License 2.0

Python 94.57% C 5.43%
video-generation motion-forecasting residual-learning

frgan's Introduction

Learning to Forecast and Refine Residual Motion for Image-to-Video Generation (ECCV 2018)

This repository holds the PyTorch implementation of "Learning to Forecast and Refine Residual Motion for Image-to-Video Generation" by Long Zhao, Xi Peng, Yu Tian, Mubbasir Kapadia and Dimitris Metaxas. If you find our code useful in your research, please consider citing:

@inproceedings{zhaoECCV18learning,
  author    = {Zhao, Long and Peng, Xi and Tian, Yu and Kapadia, Mubbasir and Metaxas, Dimitris},
  title     = {Learning to forecast and refine residual motion for image-to-video generation},
  booktitle = {European Conference on Computer Vision (ECCV)},
  pages     = {387--403},
  year      = {2018}
}

Introduction

We propose a two-stage generative framework in which videos are first forecast from structures and then refined by temporal signals. In the forecasting stage, to model motion more efficiently, we train networks to learn the residual motion between the current and future frames, which avoids learning motion-irrelevant details. In the refining stage, to ensure temporal consistency, we build networks upon spatiotemporal 3D convolutions. This repository provides the code for training and testing our approach for facial expression retargeting on the MUG dataset.
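
As a rough illustration of the residual idea only (this is not the repository's network code), the toy sketch below treats a frame as a flat list of pixel intensities in [0, 1]. The helper names `residual_target` and `apply_residual` are hypothetical. It shows that pixels which do not move contribute a zero residual, so a model trained on residuals has nothing to learn for them.

```python
# Toy illustration only: frames are flat lists of intensities in [0, 1].
# The helper names are hypothetical and do not come from the FRGAN code base.

def residual_target(current, future):
    """Per-pixel difference a forecasting network could be trained to predict."""
    return [f - c for c, f in zip(current, future)]


def apply_residual(current, residual):
    """Reconstruct a future frame by adding a residual to the current frame."""
    return [min(1.0, max(0.0, c + r)) for c, r in zip(current, residual)]


current = [0.25, 0.5, 0.75, 0.5]   # four "pixels" of the current frame
future = [0.25, 0.5, 0.5, 0.75]    # only the last two pixels change

residual = residual_target(current, future)
print(residual)                            # static pixels give a zero residual
print(apply_residual(current, residual))   # adding it back recovers `future`
```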

Note that the current version only contains the code for the forecasting stage. We are working on an improved version of the refining stage, which will come very soon.

We use 3DDFA to compute 3DMM (3D Morphable Model) parameters for all face images in the dataset. Please refer to the corresponding part of our paper and to the 3DDFA repository for more details.

Quick start

This repository is built upon Python v2.7 and PyTorch v1.0.1. The code may also work with PyTorch v0.4.1, but this has not been tested.

Installation

  1. Clone this repository. From Google Drive, download param_mesh.mat and put it into the configs directory. Then download phase1_wpdc_vdc_v2.pth.tar and shape_predictor_68_face_landmarks.dat, and put them into the models directory.

    >> git clone git@github.com:garyzhao/FRGAN.git
    >> cd FRGAN
    
  2. We recommend installing Python v2.7 from Anaconda and installing PyTorch (>= 1.0.1) by following the official instructions for your specific CUDA version. In addition, install the dependencies below.

    >> pip install -r requirements.txt
    
  3. Build the C++ extension for computing normal maps from the 3D face model.

    >> cd dfa
    >> python setup.py build_ext --inplace
    >> cd -
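
Before moving on, it can help to confirm that the three downloaded files ended up where step 1 says they should. The sketch below is a hypothetical helper (it is not part of the repository) that checks only the file names and directories listed above, relative to the repository root.

```python
import os

# Paths are relative to the repository root and taken from step 1 above.
REQUIRED_FILES = [
    os.path.join("configs", "param_mesh.mat"),
    os.path.join("models", "phase1_wpdc_vdc_v2.pth.tar"),
    os.path.join("models", "shape_predictor_68_face_landmarks.dat"),
]


def missing_files(repo_root="."):
    """Return the required files that are not present under repo_root."""
    return [
        path for path in REQUIRED_FILES
        if not os.path.isfile(os.path.join(repo_root, path))
    ]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All required files are in place.")
```

Run it from the FRGAN directory; an empty result means all three files were found.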
    

Data preparation

Download the MUG dataset and organize the data as follows:

MUG
|-- 001
    |-- anger
        |-- take000
            |-- img_0000.jpg
            |-- img_0001.jpg
            |-- img_0002.jpg
            |-- ...
        |-- take001
            |-- img_0000.jpg
            |-- img_0001.jpg
            |-- img_0002.jpg
            |-- ...
        |-- ...
    |-- disgust
        |-- take000
            |-- img_0000.jpg
            |-- img_0001.jpg
            |-- img_0002.jpg
            |-- ...
        |-- ...
    |-- ...
|-- 002
    |-- ...
|-- ...
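
To sanity-check a local copy against the layout above before preprocessing, a small walker like the following may be useful. It is a sketch, not part of the repository: the function name `list_takes` is hypothetical, and it assumes only the subject/expression/take nesting and the `img_*.jpg` naming shown in the tree.

```python
import os
import sys


def list_takes(mug_root):
    """Yield (subject, expression, take, num_frames) for each take directory."""
    for subject in sorted(os.listdir(mug_root)):
        subject_dir = os.path.join(mug_root, subject)
        if not os.path.isdir(subject_dir):
            continue
        for expression in sorted(os.listdir(subject_dir)):
            expression_dir = os.path.join(subject_dir, expression)
            if not os.path.isdir(expression_dir):
                continue
            for take in sorted(os.listdir(expression_dir)):
                take_dir = os.path.join(expression_dir, take)
                if not os.path.isdir(take_dir):
                    continue
                # Count only files that follow the img_XXXX.jpg pattern.
                frames = [
                    name for name in os.listdir(take_dir)
                    if name.startswith("img_") and name.endswith(".jpg")
                ]
                yield subject, expression, take, len(frames)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: pass the MUG root directory as the first argument.
    for subject, expression, take, count in list_takes(sys.argv[1]):
        print("%s/%s/%s: %d frames" % (subject, expression, take, count))
```

Takes that report zero frames usually point to a misplaced directory level or a different file naming scheme.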

Then run the following script in the project directory to preprocess the data:

>> python datasets/mug_process_dataset.py --inp_path $MUG_ROOT_PATH$ --out_path datasets/mug --out_size 64

Replace $MUG_ROOT_PATH$ with the path to the downloaded MUG directory. The preprocessed data will be saved in the datasets/mug/mug64 directory. You can obtain results at higher resolutions by changing the out_size parameter (e.g., 64, 96 or 128).

Training

To train the network, try the following script in the project directory:

>> export PYTHONPATH=".:$PYTHONPATH"
>> export CUDA_VISIBLE_DEVICES=0,1
>> python mug_train_forecast.py --img_dir_path datasets/mug/mug64/ --batch_size 64 --num_epochs 100 --snapshot 2

Please modify CUDA_VISIBLE_DEVICES and batch_size according to your GPU setup. You may also change the values of img_size (64 by default) and h_dim (128 by default) if you train on images with higher resolutions. Please refer to mug_train_forecast.py for more details.
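
If you prefer to launch training from Python rather than typing the exports by hand, the commands above can be wrapped roughly as follows. This is a sketch under stated assumptions: the helper names are hypothetical, only the four flags shown in the command above are used, and `sys.executable` stands in for `python`.

```python
import os
import subprocess
import sys


def build_train_command(img_dir_path, batch_size=64, num_epochs=100, snapshot=2):
    """Assemble the training command using only the flags shown above."""
    return [
        sys.executable, "mug_train_forecast.py",
        "--img_dir_path", img_dir_path,
        "--batch_size", str(batch_size),
        "--num_epochs", str(num_epochs),
        "--snapshot", str(snapshot),
    ]


def build_train_env(gpu_ids, base_env=None):
    """Copy the environment and set CUDA_VISIBLE_DEVICES and PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    # Prepend the project directory, mirroring the `export PYTHONPATH` line.
    old_path = env.get("PYTHONPATH", "")
    env["PYTHONPATH"] = "." + (os.pathsep + old_path if old_path else "")
    return env


def launch_training(img_dir_path, gpu_ids, batch_size):
    """Run training from the project directory and return the exit code."""
    command = build_train_command(img_dir_path, batch_size=batch_size)
    return subprocess.call(command, env=build_train_env(gpu_ids))
```

For example, `launch_training("datasets/mug/mug64/", [0], 32)` would run on a single GPU; the batch size of 32 is only an illustrative value, not a recommendation from the authors.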

Testing

To test the network, try:

>> python mug_test.py --img_dir_path datasets/mug/mug64/ --out_dir_path examples --checkpoint $CHECKPOINT_PATH$

Replace $CHECKPOINT_PATH$ with the path to the checkpoint saved during training.
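
This README does not state where checkpoints are written or how they are named, so the sketch below makes no assumption about either: you pass it whichever directory your training run saved into, and it picks the most recently modified file there. Both helper names are hypothetical, and the test command uses only the flags shown above.

```python
import os


def newest_file(directory):
    """Return the most recently modified regular file in directory, or None."""
    candidates = [
        os.path.join(directory, name) for name in os.listdir(directory)
    ]
    files = [path for path in candidates if os.path.isfile(path)]
    if not files:
        return None
    return max(files, key=os.path.getmtime)


def build_test_command(checkpoint, img_dir_path="datasets/mug/mug64/",
                       out_dir_path="examples"):
    """Assemble the test command using only the flags shown above."""
    return [
        "python", "mug_test.py",
        "--img_dir_path", img_dir_path,
        "--out_dir_path", out_dir_path,
        "--checkpoint", checkpoint,
    ]
```

The returned list can be passed to `subprocess.call` from the project directory. Check that the selected file really is a checkpoint before running, since the newest file in a directory could also be a log or an image.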

frgan's Issues

About MUG dataset

Hi!
My application for access to the dataset has not been answered for a long time. Could you help me?
Thank you so much!
Best Regards!
