arunasank / alterate_affection

This project forked from puzer/stylegan-encoder

Uses latent spaces of images to change the affect of a particular video

License: Other

Python 0.89% Jupyter Notebook 99.11%

alterate_affection's Introduction

alterate_affection: a repurposing of stylegan-encoder

This repository tries to use Puzer/stylegan-encoder to change the affect (the displayed emotion) of a person in a video.

Setup:

Note: You can see most (if not all) of these instructions here

  1. Using Docker
  2. Go to the folder of your choice, clone this repository, and cd into it: git clone https://github.com/ralcant/alterate_affection.git
  3. [Optional, but highly recommended] Working with Python virtual environments
  4. Start your Jupyter Notebook on the remote server
  5. On your local computer, create an SSH tunnel
  6. Access the Jupyter notebook locally and start working on the general_video_processing notebook! If you are using a virtual environment, make sure to choose it as the kernel of your notebook before running anything.

Working with a video:

Now let the fun begin!

After the setup, we can work with the general_video_processing notebook. The notebook should be self-explanatory, but in general terms this is what happens:

  0. Getting everything ready: Here we install the necessary packages and create any folders we might need later.
  1. Breaking the video into multiple frames: Here we take a video and split it into frames using cv2. We also store the fps (frames per second) of the video, which is needed again in step 3.
  2. Updating every frame: This is the heaviest and most time-consuming part of the whole notebook. The goal is to replace every frame with a version in which the person's emotion has been changed. This is where the stylegan-encoder code is most useful. The work is divided into subsections:
    • 2.1: Getting the aligned images out of every frame: Here we use a modified version of stylegan-encoder's align_images.py. For every frame, we store the position of the face in a variable called ALL_ALIGNED_INFO, which is needed again in 2.3.
    • 2.2: Generating the latent vectors from the aligned images: This is by far the slowest step (we are talking about hours). It uses stylegan-encoder's encode_images.py. The latent vectors are what let us change the affect in every frame (see the next step).
    • 2.3: Changing the affect of the aligned frames, and using this to change the affect of the original frames: We use the latent vectors from 2.2 and stylegan-encoder's smile_direction to change the emotion of every aligned frame. Then we use the values from ALL_ALIGNED_INFO and some image processing to put that face back into the original frame.
  3. Combining the processed frames into a video: We use cv2 for this. The output is a video of the updated frames with no sound. For this to work we use the fps stored in step 1.
  4. Extracting the audio from the original video: We use moviepy for this and store the audio of the original video as an mp3.
  5. Adding the audio to our processed video: Final step! We use moviepy for this too.
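To make the affect change in 2.3 more concrete, here is a minimal sketch of moving a latent vector along a direction such as smile_direction. It is not the notebook's code. The function name shift_latent, the (18, 512) latent shape, the random stand-in data, and the choice to shift only the first 8 style layers are all assumptions made for illustration.

```python
import numpy as np


def shift_latent(latent, direction, coeff, num_layers=8):
    """Return a copy of `latent` moved along `direction` by `coeff`.

    Only the first `num_layers` rows are shifted (an assumption of this
    sketch); the remaining rows are copied over unchanged.
    """
    if latent.shape != direction.shape:
        raise ValueError("latent and direction must have the same shape")
    shifted = latent.copy()
    shifted[:num_layers] = latent[:num_layers] + coeff * direction[:num_layers]
    return shifted


# Stand-in data: in the real pipeline the latent would come from
# encode_images.py and the direction from stylegan-encoder's smile_direction.
rng = np.random.default_rng(0)
latent = rng.normal(size=(18, 512))
smile_direction = rng.normal(size=(18, 512))

more_smile = shift_latent(latent, smile_direction, coeff=1.5)
less_smile = shift_latent(latent, smile_direction, coeff=-1.5)
```

A positive coeff pushes the face toward the direction (more smile) and a negative one pushes it away. The shifted latent would then be passed to the generator to produce the new aligned face.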

Phew! That was quite a lot.

Current limitations

  • We still need a way to avoid hardcoding the face dimensions
  • Adding the original audio does not seem to be a good idea, because the lips in the transformed frames are no longer in sync with the sound

You can see the original readme here
