
bruinxiong / structurednoiseinjection


This project forked from yalharbi/structurednoiseinjection


TensorFlow implementation of the CVPR2020 paper: Disentangled Image Generation Through Structured Noise Injection



Structured Noise Injection

Paper: https://arxiv.org/abs/2004.12411

Video: https://youtu.be/7h-7wso9E0k

A TensorFlow implementation of structured noise injection as described in the paper. We adapt the original StyleGAN architecture code from https://github.com/NVlabs/stylegan.

The code allows:

  • Disentangled editing of generated images (local features, mid-scale features, pose, and overall style)
  • Training a model with structured noise injection on any dataset
  • Modifying the paper's choices of grid dimensions, local code length, shared code length, and global code length

Examining a pretrained network

We follow the same approach as the original StyleGAN code.

First, download the pretrained network from: https://drive.google.com/file/d/1jxzRnLX2OhPos4E1pqz-7ed4mqVyLwoQ/view?usp=sharing and place it in the same folder as pretrained_SNI.py

To randomly generate a few images and preview the edits our method makes possible, run:

python3 pretrained_SNI.py

This will generate two unique faces and multiple figures showing specific modifications that maintain the face identity. Any cell of the noise grid can be changed individually by providing an 8x8 binary mask to the function randomize_specific_local_codes, as demonstrated in the example file.

Changing the globally-shared code entry (affects pose): (figure: GlobalCodeExamples)

Changing the codes that are shared by region (affects mid-level features such as age and accessories): (figure: SharedCodeExamples)

Changing all local codes (affects the fine details of the face): (figure: localCodeExamples)

Changing specific local codes (4x4 cells around the mouth): (figure: mouthCodeExamples)

Changing specific local codes (3x7 cells covering the top of the head): (figure: hairCodeExamples)
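The 8x8 binary masks behind the last two examples can be built with NumPy. The sketch below is only an illustration: the helper rect_mask and the row/column placements are hypothetical, since the actual cells depend on how the faces are aligned. The mask would then be passed to randomize_specific_local_codes; its other arguments are shown in pretrained_SNI.py and are not reproduced here.

```python
import numpy as np

GRID = 8  # the noise grid is 8x8, as stated above


def rect_mask(top, left, height, width, grid=GRID):
    """Return a grid x grid binary mask with ones inside the given rectangle.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    if top < 0 or left < 0 or top + height > grid or left + width > grid:
        raise ValueError("rectangle does not fit inside the grid")
    mask = np.zeros((grid, grid), dtype=np.uint8)
    mask[top:top + height, left:left + width] = 1
    return mask


# Assumed placements: lower-middle cells for the mouth, top rows for the hair.
mouth_mask = rect_mask(top=4, left=2, height=4, width=4)  # 4x4 cells
hair_mask = rect_mask(top=0, left=0, height=3, width=7)   # 3 rows x 7 columns
```

Cells set to 1 are the ones whose local codes are meant to be randomized; all other cells keep their codes.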

Training a network from scratch

To run training on the FFHQ dataset with the default settings:

python3 train.py

Training follows the same procedure as the original StyleGAN, but with a different generator. The code for our generator is included under training/networks_structurednoiseinjection.py.

Please refer to https://github.com/NVlabs/stylegan for the datasets and code requirements.

Frequently Asked Questions

Do you use specific losses?

No. We use the existing GAN loss from StyleGAN.

How do you enforce disentanglement?

We use independent codes and independent mapping parameters per location (on a 4x4 or 8x8 grid). This ensures that each (x,y) location in the input tensor is influenced by only a single local code (plus the shared and global codes). Beyond that, we believe the achieved disentanglement is due to our selection of two codes: one that is more suitable for encoding spatial details, and one that is more suitable for encoding stylistic information.
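The isolation property can be illustrated with a minimal NumPy sketch. It assumes a single linear map per grid cell in place of the repository's actual mapping layers, and the sizes are illustrative, not the paper's. The point is only that, with per-cell parameters, resampling one cell's code changes the mapped output at that cell alone.

```python
import numpy as np

rng = np.random.default_rng(0)
GRID, CODE_LEN, OUT_LEN = 8, 16, 32  # illustrative sizes (assumptions)

# One code and one independent weight matrix per grid cell.
codes = rng.standard_normal((GRID, GRID, CODE_LEN))
weights = rng.standard_normal((GRID, GRID, CODE_LEN, OUT_LEN))


def map_per_cell(codes, weights):
    """Apply each cell's own linear map to that cell's code only."""
    return np.einsum("ijc,ijco->ijo", codes, weights)


base = map_per_cell(codes, weights)

# Resample the code of a single cell, then map again.
edited = codes.copy()
edited[5, 3] = rng.standard_normal(CODE_LEN)
new = map_per_cell(edited, weights)

# Boolean 8x8 map of the locations whose mapped output changed.
changed = ~np.all(np.isclose(new, base), axis=-1)
```

Here changed is True only at cell (5, 3); a mapping that shared parameters across cells through a fully connected layer would not have this property.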

How does your method change attributes such as smile and eyeglasses without labels?

Our method is not supervised, and during training the network is unaware of semantic labels such as smile and eyeglasses. However, disentangled editing is possible because we focus on locations instead of attributes. Due to our structure, the only noise entries that affect the mouth are the noise cells aligned around the mouth. After training, the user can resample only the noise codes around the mouth to change its shape and find codes that add or remove a smile. By focusing on 'locking down' details of the face at different places of the noise codes, we enable disentangled editing without labels.
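The resampling step described above can be sketched as follows. This is an illustration under assumptions, not the repository's code: resample_masked is a hypothetical helper, the local code length of 16 is arbitrary, and the mouth cells are a guess that depends on face alignment.

```python
import numpy as np


def resample_masked(local_codes, mask, rng):
    """Return a copy of local_codes with fresh Gaussian codes where mask is set.

    local_codes: (N, N, L) array, one code per grid cell.
    mask:        (N, N) binary array selecting the cells to resample.
    """
    mask = np.asarray(mask, dtype=bool)
    fresh = rng.standard_normal(local_codes.shape)
    return np.where(mask[..., None], fresh, local_codes)


rng = np.random.default_rng(1)
local_codes = rng.standard_normal((8, 8, 16))  # code length 16 is illustrative

mouth = np.zeros((8, 8), dtype=bool)
mouth[4:8, 2:6] = True  # hypothetical cells around the mouth

# Several candidates that differ only in the mouth cells; a user would
# generate an image from each and keep the one with the desired mouth shape.
candidates = [resample_masked(local_codes, mouth, rng) for _ in range(4)]
```

Every candidate keeps the original codes outside the mask, which is why the rest of the face is expected to stay fixed.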

Testing new settings of structured noise injection

Changing cell resolution / changing when to begin style modulation

This can be done in the synthesis part of the network. To change where style modulation begins, edit the layer_epilogue function. Note that each resolution contains two style modulation layers. Raising the cell resolution above 8x8 is currently difficult because doing so loses the benefits of progressive growing and of the lower-resolution information. By default, the 8x8 resolution is used, as in the paper.
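Because each resolution has two style modulation layers, a starting resolution translates into a layer index. The sketch below assumes the layer indexing of the original StyleGAN synthesis network (4x4 uses layers 0 and 1, 8x8 uses layers 2 and 3, and so on); both helper names and the start_resolution parameter are hypothetical and only illustrate the kind of condition one might add inside layer_epilogue.

```python
import math


def style_layers_for_resolution(resolution):
    """Return the two layer indices at a resolution, e.g. 4 -> (0, 1)."""
    res_log2 = int(math.log2(resolution))
    if resolution < 4 or 2 ** res_log2 != resolution:
        raise ValueError("resolution must be a power of two and at least 4")
    first = 2 * res_log2 - 4
    return first, first + 1


def style_enabled(layer_idx, start_resolution):
    """True if layer_idx is at or after the first layer of start_resolution."""
    first, _ = style_layers_for_resolution(start_resolution)
    return layer_idx >= first
```

For example, style_layers_for_resolution(8) gives (2, 3), so style_enabled(1, 8) is False while style_enabled(2, 8) is True.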

Changing global/shared/local code length

This can be done in the mapping function. The user can feed random codes arranged in any way, but must specify how to assemble the final code for each grid cell from those random codes. If the code lengths or arrangement are changed, my_randoms in training/misc.py should be updated accordingly so that the changes can be checked during training.
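One possible way to assemble a per-cell code is sketched below. It rests on several assumptions that are not taken from the repository: shared codes cover square blocks of cells, the three parts are simply concatenated in the order global, shared, local, and the code lengths are arbitrary. assemble_cell_codes is a hypothetical helper.

```python
import numpy as np


def assemble_cell_codes(global_code, shared_codes, local_codes):
    """Concatenate global, shared and local codes into one code per grid cell.

    global_code:  (G,)       one code for the whole image
    shared_codes: (R, R, S)  one code per region; R must divide N
    local_codes:  (N, N, L)  one code per grid cell
    returns:      (N, N, G + S + L)
    """
    n = local_codes.shape[0]
    r = shared_codes.shape[0]
    if n % r != 0:
        raise ValueError("number of regions must divide the grid size")
    block = n // r
    # Repeat each region's code over the block of cells that it covers.
    shared_full = np.repeat(np.repeat(shared_codes, block, axis=0), block, axis=1)
    # Broadcast the single global code to every cell.
    global_full = np.broadcast_to(global_code, (n, n, global_code.shape[0]))
    return np.concatenate([global_full, shared_full, local_codes], axis=-1)


rng = np.random.default_rng(2)
cell_codes = assemble_cell_codes(
    global_code=rng.standard_normal(8),            # illustrative lengths
    shared_codes=rng.standard_normal((4, 4, 16)),  # 4x4 regions of 2x2 cells
    local_codes=rng.standard_normal((8, 8, 32)),
)
```

Changing any of the three lengths changes the last dimension of cell_codes, which is why the code that draws the random inputs has to be kept in sync.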
