stylegan_meta

Introduction

Few-shot learning has been studied extensively in the context of classification, with great success. Meanwhile, GANs have seen many breakthroughs, such as the proposal of StyleGAN. Here we are interested in few-shot generation with GANs: given a small amount of data from a class the model has never seen before, we want it to generate images from that same class.

Related Work

We believe the style-based generator is the state of the art, so we build our model on top of it. One interesting paper about it is Image2StyleGAN by Rameen Abdal et al. They found that StyleGAN-ffhq is so expressive that it is potentially capable of generating any image, although only the embeddings of human faces are meaningful (i.e. interpolation between them is smooth and continuous).

Another work more closely related to our idea is Image Generation from Small Datasets via Batch Statistics Adaptation, whose model can generate anime or human faces given as few as 25 images. The core idea of this paper is that a convolution can be seen as a combination of filters, so changing the scale and shift controls filter selection. Other interesting work includes Few-Shot Unsupervised Image-to-Image Translation and FIGR.
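To make the scale-and-shift idea concrete, here is a minimal sketch in plain Python. Lists stand in for feature maps, and the function name `modulate` and the toy values are illustrative assumptions, not code from this repository or from the cited paper. Each channel (one filter's response) is normalized and then rescaled, so a scale of zero effectively deselects that filter.

```python
from statistics import mean, pstdev

def modulate(feature_maps, scales, shifts, eps=1e-8):
    """Normalize each channel, then apply a per-channel scale and shift."""
    out = []
    for channel, scale, shift in zip(feature_maps, scales, shifts):
        mu = mean(channel)
        sigma = pstdev(channel)
        out.append([scale * (v - mu) / (sigma + eps) + shift for v in channel])
    return out

# Two hypothetical filter responses (flattened feature maps).
features = [[1.0, 2.0, 3.0, 4.0], [10.0, 0.0, 10.0, 0.0]]

# scale=0 silences the first filter; scale=2 amplifies the second.
styled = modulate(features, scales=[0.0, 2.0], shifts=[0.5, 0.0])
print(styled[0])  # every value equals the shift, 0.5
print(styled[1])  # values close to +2 and -2
```

Only the scales and shifts change here; the filter responses themselves are untouched, which is why adapting these statistics is cheap compared with retraining the convolution weights.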

Our Approach

Inspired by the work above, we propose an algorithm that makes the embedding of any class meaningful by updating the AdaIN mappings while explicitly enforcing the interpolation constraints. AdaIN is defined as follows:
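For reference, the StyleGAN paper defines adaptive instance normalization for the i-th feature map $x_i$ and a style $y = (y_s, y_b)$ as below. The notation is taken from that paper and is assumed to match what this project uses.

```latex
\mathrm{AdaIN}(x_i, y) = y_{s,i} \, \frac{x_i - \mu(x_i)}{\sigma(x_i)} + y_{b,i}
```

Here $\mu(x_i)$ and $\sigma(x_i)$ are the mean and standard deviation of the feature map, and $y_{s,i}$ and $y_{b,i}$ are the scale and shift computed from $w$.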

We choose to update only the AdaIN mappings because they change the scale and shift, and thus control style, which manifests as filter selection. Let A_l(w) be the l-th AdaIN mapping, i.e. the mapping from w to the batch normalization statistics for layer l of the style-based generator network, and suppose we are given a set of images (x_1, ..., x_k) in a class. We move the AdaIN mappings by finding a set of z_i and A_l such that

where $0 \le \alpha, \beta \le 1$ and $\alpha + \beta = 1$, is minimized. There are a few ways to enforce the third term $L'$, i.e. the interpolation loss:

  1. Train a k-shot discriminator that checks the interpolation points and makes sure they look like the $k$ points provided for few-shot learning.

  2. Require the middle image to have the content of the leftmost image and the style of the rightmost image.

  3. Use the L1 loss scaled by the distance of the interpolated variable from the endpoints. Suppose that . Then one possible loss is
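As an illustration of option 3 only, the sketch below shows one assumed form of such a loss; it is not the authors' formula. The image generated from the interpolated code is compared with both endpoint images, and each L1 term is weighted by how close the interpolated code is to that endpoint. The function names and the flat-list image representation are hypothetical.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized flat images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def interpolation_l1(generated, left, right, alpha):
    """Hypothetical interpolation loss for the image generated at
    alpha * z_left + (1 - alpha) * z_right: each endpoint's L1 term is
    weighted by the interpolation weight of that endpoint."""
    beta = 1.0 - alpha
    return alpha * l1(generated, left) + beta * l1(generated, right)

left = [0.0, 0.0, 0.0, 0.0]
right = [1.0, 1.0, 1.0, 1.0]
midpoint = [0.5, 0.5, 0.5, 0.5]

print(interpolation_l1(midpoint, left, right, alpha=0.5))  # 0.5
print(interpolation_l1(left, left, right, alpha=1.0))      # 0.0
```

With this weighting the loss vanishes when the interpolated image coincides with the endpoint it sits on, and it penalizes both endpoints equally at the midpoint.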

Results

These results come from an implementation of option 3 above, using 50 training examples. In addition to $L_1$, we added $L_{per}$.

Interpolation results:

[figure: six interpolation images]

The importance of interpolation loss:

The first two rows are interpolations without the interpolation loss defined above; the last two rows are with it.

[figure: four rows of interpolation images]

The importance of updating AdaIN:

This is the interpolation result without updating the AdaIN parameters. The first and last pictures are reconstructions of two pictures, obtained from a StyleGAN trained on a human face dataset. Similar to what Image2StyleGAN has shown, the reconstruction is very good, but the embedding of the flower picture is not meaningful; the model in effect "overfits" to human faces.

This is the interpolation result from our model. The interpolation is now quite smooth:

[figure: interpolation image]

Interpolation sampling results (depth one, first two) and random sampling results (following three):

Here, interpolation sampling means the image is generated from a latent code interpolated between the latent codes of training images. Random sampling means the image is generated from a random latent code drawn from the latent space.
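The two sampling modes can be sketched as follows. This is a minimal illustration, not the repository's code: latent codes are plain lists, the helper names are hypothetical, and the standard normal prior used for random sampling is an assumption.

```python
import random

def interpolate(z_a, z_b, alpha):
    """Linear interpolation between two latent codes."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(z_a, z_b)]

def interpolation_sample(train_codes, rng):
    """Pick two training latent codes and blend them with a random weight."""
    z_a, z_b = rng.sample(train_codes, 2)
    return interpolate(z_a, z_b, rng.random())

def random_sample(dim, rng):
    """Draw a latent code from a standard normal prior (an assumption)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

rng = random.Random(0)
# Hypothetical latent codes recovered for three training images.
train_codes = [[0.0, 0.0], [1.0, 1.0], [2.0, -2.0]]

z_interp = interpolation_sample(train_codes, rng)
z_random = random_sample(dim=2, rng=rng)
print(z_interp, z_random)
```

Either code would then be passed to the generator. Interpolated codes stay on segments between known training embeddings, while random codes can land anywhere in the latent space.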

Random results for flowers:

Try the Sample Code

python ./train

Contributors

xqqquxixi
