lukasuz / stylegan2-landmark-projection

Experimental repository attempting to project facial landmarks into the StyleGAN2 latent space.

License: Other



You can also generate face animations now. Check this blog post to find out how.


StyleGAN2 Facial Landmark Projection

This is an experimental repository that aims to project facial landmarks into the StyleGAN2 latent space. The code is adapted from the original StyleGAN2-ADA repository [0]. To project facial landmarks, the l2 norm of the difference between the landmark heat maps of the projected image and those of the target landmark image is minimized alongside the original LPIPS loss [2]. The landmark heat maps are extracted with [1]. There are thus two target images, one for the look and one for the landmarks. The objective becomes (noise regularization omitted):

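$$\mathcal{L}(w) = \lambda_{\text{LPIPS}} \, \text{LPIPS}\big(x_{\text{look}}, G(w)\big) + \lambda_{\text{HL}} \, \text{HL}\big(x_{\text{lm}}, G(w)\big),$$

with HL being the heat map loss defined as

$$\text{HL}\big(x_{\text{lm}}, G(w)\big) = \frac{1}{N} \big\lVert \alpha \odot \big(\text{FAN}(x_{\text{lm}}) - \text{FAN}(G(w))\big) \big\rVert_2,$$

where $G(w)$ is the image generated from the latent code $w$, $x_{\text{look}}$ and $x_{\text{lm}}$ are the target look image and the target landmark image, $\lambda_{\text{LPIPS}}$ and $\lambda_{\text{HL}}$ are the scalar weights set via `--lpips_weight` and `--landmark_weight`, N is the number of pixels, and FAN is the landmark heat map extraction model, which outputs a three-dimensional matrix whose depth dimension encodes the individual landmarks. LPIPS is used as in [0, 2]. The factor $\alpha$ is a vector containing the weights for each group of landmarks. Groups are, for example, eyebrows, eyes, and mouth. See [1] for more information. By tweaking this vector you can determine which facial features are projected more strongly into the generated images. See below for an example.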
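The weighted heat map loss described above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the repository's implementation: the function names `channel_weights` and `heatmap_loss` are hypothetical, the group index ranges follow the common 68-point landmark layout (an assumption about how the groups are defined), and the loss is taken to be the weighted l2 norm divided by the number of pixels. In practice the same arithmetic would run on torch tensors.

```python
import math

# Index ranges of the common 68-point landmark layout. Assumption: the
# repository may group its heat map channels differently.
LANDMARK_GROUPS = {
    "jaw": range(0, 17),
    "eyebrows": range(17, 27),
    "nose": range(27, 36),
    "eyes": range(36, 48),
    "mouth": range(48, 68),
}


def channel_weights(group_weights, num_landmarks=68):
    """Expand per-group weights into one weight per heat map channel.

    Groups that are not mentioned keep a weight of 1.0.
    """
    weights = [1.0] * num_landmarks
    for group, weight in group_weights.items():
        for idx in LANDMARK_GROUPS[group]:
            weights[idx] = weight
    return weights


def heatmap_loss(target, projected, weights):
    """Weighted l2 norm of the heat map difference, divided by N pixels.

    target, projected: one entry per landmark channel, each channel a list
    of rows of pixel values (all channels share the same height and width).
    """
    n_pixels = len(target[0]) * len(target[0][0])
    squared_sum = 0.0
    for w, t_channel, p_channel in zip(weights, target, projected):
        for t_row, p_row in zip(t_channel, p_channel):
            for t, p in zip(t_row, p_row):
                squared_sum += (w * (t - p)) ** 2
    return math.sqrt(squared_sum) / n_pixels
```

Setting a group's weight to 0 removes its channels from the loss entirely, so errors in that facial region no longer influence the projection, while a weight above 1 emphasizes the region.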

This repository is a work in progress. Input and contributions are welcome.

How to use

For quick testing, you can run this repository in Google Colab. Check it out here. Otherwise, install the dependencies

pip install click requests tqdm pyspng ninja imageio-ffmpeg==0.4.3 face_alignment

and run it like so:

python projector.py --lpips_weight=1 --landmark_weight=0.05 --device=cuda --num-steps=1000 --outdir=./ --target_look=./look_img.png --target_landmarks=./landmark_img.png --save_video=1 --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
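Because the balance between look and landmarks depends on `--landmark_weight`, it can help to try a few values side by side. The following is a small sketch that builds the command shown above for several weights. The helper `build_projector_command`, the extra weight values 0.01 and 0.1, and the output directory names are illustrative assumptions; only the flags themselves come from the command above.

```python
import shlex
import sys

NETWORK_URL = (
    "https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl"
)


def build_projector_command(target_look, target_landmarks, landmark_weight,
                            outdir="./", num_steps=1000):
    """Return the projector.py invocation as an argument list."""
    return [
        sys.executable, "projector.py",
        "--lpips_weight=1",
        f"--landmark_weight={landmark_weight}",
        "--device=cuda",
        f"--num-steps={num_steps}",
        f"--outdir={outdir}",
        f"--target_look={target_look}",
        f"--target_landmarks={target_landmarks}",
        "--save_video=1",
        f"--network={NETWORK_URL}",
    ]


if __name__ == "__main__":
    # Illustrative sweep; 0.05 is the value used in the command above.
    for weight in (0.01, 0.05, 0.1):
        cmd = build_projector_command(
            "./look_img.png", "./landmark_img.png", weight,
            outdir=f"./out_lw_{weight}",
        )
        # Print the command; pass cmd to subprocess.run(cmd, check=True)
        # to actually start the projection.
        print(shlex.join(cmd))
```

Writing each run to its own output directory keeps the results of different weights from overwriting each other.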

Examples

Almost no weighting of eyes in vector, strong focus on mouth area:



Here the facial landmarks are weighted mostly uniformly:

Images from FFHQ data set.

Todos:

  • Allow different landmark weights to be set via the command line. This will probably improve the above example.
  • Investigate why the style of the facial expression image leaks into the projection. Apparently the heat maps contain some "style" signal. (The normalization factor in the heat map calculation was the culprit; projection seems to work quite well now. This also improved the quality of the generated images immensely.)
  • Add face cropping as preprocessing for non-FFHQ images.
  • Face cropping does not completely match the FFHQ preprocessing, and uniform backgrounds degrade results strongly. Improve face cropping.
  • Add a discriminator loss / regularization. Heat maps are sometimes extracted incorrectly, which results in a badly wrong error being propagated.
  • Attempt to remove landmark information from the VGG embedding used for the LPIPS calculation.
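As a starting point for the first todo, per-group landmark weights could be passed as a single comma-separated string and parsed before the projection starts. The sketch below is hypothetical: neither the option format nor the function `parse_group_weights` exists in the repository, and the group names "jaw" and "nose" are assumptions added to the groups named above.

```python
# Default weight per landmark group. Assumption: the group names and the
# uniform default of 1.0 are illustrative, not taken from the repository.
DEFAULT_GROUP_WEIGHTS = {
    "jaw": 1.0,
    "eyebrows": 1.0,
    "nose": 1.0,
    "eyes": 1.0,
    "mouth": 1.0,
}


def parse_group_weights(spec):
    """Parse a string such as "eyes=0.1,mouth=2.0" into a weight dict.

    Groups that are not mentioned keep their default weight. Unknown group
    names and malformed entries raise ValueError.
    """
    weights = dict(DEFAULT_GROUP_WEIGHTS)
    if not spec:
        return weights
    for item in spec.split(","):
        name, sep, value = item.partition("=")
        name = name.strip()
        if not sep or name not in weights:
            raise ValueError(f"invalid landmark group entry: {item!r}")
        weights[name] = float(value)  # float() raises ValueError on bad input
    return weights
```

For example, `parse_group_weights("eyes=0.1,mouth=2.0")` would reproduce a setting with almost no weighting of the eyes and a strong focus on the mouth area. A function like this could serve as the value parser for a new command-line option.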

References

[0]: Karras, Tero, et al. "Training generative adversarial networks with limited data." arXiv preprint arXiv:2006.06676 (2020). Code: https://github.com/NVlabs/stylegan2-ada-pytorch

[1]: Bulat, Adrian, and Georgios Tzimiropoulos. "How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230,000 3D facial landmarks)." Proceedings of the IEEE International Conference on Computer Vision. 2017. Code: https://github.com/1adrianb/face-alignment

[2]: Zhang, Richard, et al. "The unreasonable effectiveness of deep features as a perceptual metric." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.

License

This repository is mainly based on the original StyleGAN2-ADA code, thus the NVIDIA license applies.

Copyright © 2021, NVIDIA Corporation. All rights reserved.

This work is made available under the Nvidia Source Code License.
