jkvt2 / latent-pose-descriptors

This project forked from vincent-thevenin/realistic-neural-talking-head-models

My implementation of Neural Head Reenactment with Latent Pose Descriptors (Egor Burkov et al.).

License: GNU General Public License v3.0

Python 100.00%

latent-pose-descriptors's Introduction

Realistic-Neural-Talking-Head-Models

My implementation of Neural Head Reenactment with Latent Pose Descriptors (Egor Burkov et al.): https://arxiv.org/pdf/2004.12000. Forked from https://github.com/vincent-thevenin/Realistic-Neural-Talking-Head-Models.

Source video:

Driving and output video:

Steps:

0. modify params/params.py as required

  • get the pretrained VGGFace model as described in the Prerequisites section below
  • get the pretrained VGG19 model from the default torchvision models
  • for the demo, you can use a video: set path_to_pose_video to its path and the images will be cropped from it automatically

1. run train.py to get model_weights.tar (the meta-learning model)

2. run embedder_inference.py (requires model_weights.tar) to get e_hat_images.tar (the embedding of the identity images)

3. run finetuning_training.py (requires model_weights.tar, e_hat_images.tar) to get finetuned_model.tar (the finetuned model)

4. run demo.py (requires finetuned_model.tar, e_hat_images.tar) to get gif.gif in vis/
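The dependencies between steps 1 to 4 can be written down as a small table. The sketch below is only an illustration: the helper name `missing_inputs` is hypothetical, and it assumes the `.tar` files are referred to by the bare names used in the steps (the actual locations are set in params/params.py).

```python
# Each entry: (script, files it requires, file it produces), taken from the steps above.
PIPELINE = [
    ("train.py", [], "model_weights.tar"),
    ("embedder_inference.py", ["model_weights.tar"], "e_hat_images.tar"),
    ("finetuning_training.py", ["model_weights.tar", "e_hat_images.tar"], "finetuned_model.tar"),
    ("demo.py", ["finetuned_model.tar", "e_hat_images.tar"], "vis/gif.gif"),
]


def missing_inputs(script, existing_files):
    """Return the required files for `script` that are not in `existing_files`.

    Hypothetical helper: it only compares names and does not run anything.
    """
    existing = set(existing_files)
    for name, requires, _produces in PIPELINE:
        if name == script:
            return [f for f in requires if f not in existing]
    raise KeyError(f"unknown script: {script}")
```

For example, `missing_inputs("demo.py", ["e_hat_images.tar"])` returns `["finetuned_model.tar"]`, which means step 3 has to be run before the demo.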

Prerequisites

1. Loading and converting the Caffe VGGFace model to PyTorch for the content loss:

Follow these instructions to install the VGGFace from the paper (https://arxiv.org/pdf/1703.07332.pdf):

$ wget http://www.robots.ox.ac.uk/~vgg/software/vgg_face/src/vgg_face_caffe.tar.gz
$ tar xvzf vgg_face_caffe.tar.gz
$ sudo apt install caffe-cuda
$ pip install mmdnn

Convert Caffe to IR (Intermediate Representation)

$ mmtoir -f caffe -n vgg_face_caffe/VGG_FACE_deploy.prototxt -w vgg_face_caffe/VGG_FACE.caffemodel -o VGGFACE_IR

If you run into a pickle error, uninstall numpy and reinstall it at version 1.16.1

IR to Pytorch code and weights

$ mmtocode -f pytorch -n VGGFACE_IR.pb --IRWeightPath VGGFACE_IR.npy --dstModelPath Pytorch_VGGFACE_IR.py -dw Pytorch_VGGFACE_IR.npy

Pytorch code and weights to Pytorch model

$ mmtomodel -f pytorch -in Pytorch_VGGFACE_IR.py -iw Pytorch_VGGFACE_IR.npy -o Pytorch_VGGFACE.pth

At this point, you will have a few files in your directory. To save some space you can delete everything except Pytorch_VGGFACE_IR.py and Pytorch_VGGFACE.pth
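As a cautious way to do that cleanup, the sketch below only lists intermediate files and deletes nothing. The set of names is an assumption: it contains only the files named in the commands above, the conversion tools may write others, and the helper `removable` is hypothetical.

```python
# Intermediate files named in the conversion commands above (assumed list;
# Pytorch_VGGFACE_IR.py and Pytorch_VGGFACE.pth are deliberately not included).
INTERMEDIATES = {
    "vgg_face_caffe.tar.gz",
    "VGGFACE_IR.pb",
    "VGGFACE_IR.npy",
    "Pytorch_VGGFACE_IR.npy",
}


def removable(names):
    """Return, in sorted order, the given file names that are known intermediates."""
    return sorted(n for n in names if n in INTERMEDIATES)
```

Calling `removable(os.listdir("."))` in the working directory gives a list to review before deleting anything by hand.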

2. Libraries

  • face-alignment
  • albumentations
  • torch
  • numpy
  • cv2 (opencv-python)
  • matplotlib
  • tqdm
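A quick way to see which of these libraries are installed is to look up their import names without importing them. This is a minimal sketch: the import names `face_alignment` and `cv2` are the usual ones for face-alignment and opencv-python, and the helper `find_missing` is hypothetical.

```python
import importlib.util

# Import names for the libraries listed above.
REQUIRED_MODULES = [
    "face_alignment",
    "albumentations",
    "torch",
    "numpy",
    "cv2",
    "matplotlib",
    "tqdm",
]


def find_missing(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [m for m in modules if importlib.util.find_spec(m) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```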

3. VoxCeleb2 Dataset

I used the version of VoxCeleb2 described in https://github.com/AliaksandrSiarohin/video-preprocessing. Note that this differs substantially from the original paper, which centers the face in every frame. The preprocessing used here does, however, make learning harder, because the face is no longer at a fixed position in the frame.

latent-pose-descriptors's People

Contributors

cclauss, jkvt2, nwatab, vincent-thevenin

Forkers

jinwonkim93

latent-pose-descriptors's Issues

IndexError: index -1 is out of bounds for axis 0 with size 0

After configuring the project as suggested in step 0 ("modify params/params.py as required"), I run the following command

python train.py

and get the following exception

Traceback (most recent call last):
  File "/home/nitin/Latent-Pose-Descriptors/train.py", line 26, in <module>
    dataLoader = DataLoader(dataset, batch_size=batch_size, shuffle=True,
  File "/home/nitin/anaconda3/envs/latent_env/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 266, in __init__
    sampler = RandomSampler(dataset, generator=generator)  # type: ignore
  File "/home/nitin/anaconda3/envs/latent_env/lib/python3.9/site-packages/torch/utils/data/sampler.py", line 102, in __init__
    if not isinstance(self.num_samples, int) or self.num_samples <= 0:
  File "/home/nitin/anaconda3/envs/latent_env/lib/python3.9/site-packages/torch/utils/data/sampler.py", line 110, in num_samples
    return len(self.data_source)
  File "/home/nitin/Latent-Pose-Descriptors/dataset/dataset_class.py", line 84, in __len__
    return self.vid_num[-1]
IndexError: index -1 is out of bounds for axis 0 with size 0

Please help me in solving this.
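The last frames of the traceback show `__len__` returning `self.vid_num[-1]` on an array of size 0. This suggests the dataset indexed no videos, possibly because the dataset path in params/params.py does not point at the data; that is a guess, since the dataset code is not shown here. A hypothetical guard, sketched on a plain list instead of the real array, would turn the IndexError into a clearer message:

```python
def dataset_length(vid_num):
    """Mimic `return self.vid_num[-1]`, but fail clearly when nothing was indexed.

    `vid_num` stands in for the attribute seen in the traceback; its exact
    meaning in the repository is an assumption here.
    """
    if len(vid_num) == 0:
        raise ValueError(
            "dataset is empty: no videos were indexed; "
            "check the dataset paths in params/params.py"
        )
    return int(vid_num[-1])
```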

Cannot resume checkpoints when using different batch sizes

I trained on a server with batch size 12 and tried to resume the checkpoint on my machine with batch size 6 but I got the following error at this line. Can you update the code to be independent of batch size?

Traceback (most recent call last):
  File "train.py", line 208, in <module>
    optimizer.step()
  File "/home/ken/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/home/ken/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/optim/adam.py", line 99, in step
    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
RuntimeError: The size of tensor a (12) must match the size of tensor b (6) at non-singleton dimension 0
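The error is raised inside Adam's update, where the saved `exp_avg` has first dimension 12 and the current gradient has first dimension 6. A gradient has the same shape as its parameter, so at least one optimized tensor appears to be shaped by the batch size. To find out which one, the shapes stored in the checkpoint can be compared with the current parameter shapes. The sketch below works on plain shape tuples so it runs without PyTorch; the helper `mismatched_state` and the example names are hypothetical.

```python
def mismatched_state(param_shapes, state_shapes):
    """Return the names whose saved optimizer-state shape differs from the parameter shape.

    Both arguments map a name (or index) to a shape tuple. Entries that appear
    in only one of the two mappings are ignored.
    """
    return sorted(
        name
        for name, shape in state_shapes.items()
        if name in param_shapes and tuple(param_shapes[name]) != tuple(shape)
    )


# Illustrative shapes only: the parameter was created for batch size 6,
# while the checkpoint holds Adam state saved with batch size 12.
current = {"per_batch_param": (6, 512, 1), "conv.weight": (64, 3, 3, 3)}
saved = {"per_batch_param": (12, 512, 1), "conv.weight": (64, 3, 3, 3)}
print(mismatched_state(current, saved))
```

With PyTorch, the shapes could be collected from `tuple(p.shape)` for each parameter and from the `exp_avg` tensors under `optimizer.state_dict()["state"]`. Dropping or re-initializing the optimizer state for the mismatched entries is one possible workaround, not something the repository is known to do.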
