style_transfer's Introduction

Style Transfer as Optimal Transport

Please click through images and download to see detail.

                                                

Currently a work in progress; please stay tuned for updates. Let me know if you are interested in more explanation or theory behind this formulation. If there is interest, I will write this up formally.

This algorithm transfers the distribution of visual characteristics, or style, of one image onto a second image via an Optimal Transport plan. It is implemented in TensorFlow.

Algorithm Overview - sections correspond to synthesize.py

  1. get_style_desc: A 'style' image is fed into the VGG network, which maps RGB pixel values through a series of feature spaces calibrated to provide discriminatory information for an image classification engine. The dimensionality of the feature space grows from 3 color channels (red, green, and blue) to 64, 128, 256, and eventually 512 abstract features (visualizations). Here, we treat each pixel at a specific layer as a realization of a random vector with some distribution; for example, if a layer's activations form a tensor of shape 100x100x128 (height x width x feature activations), this gives 10,000 realizations (or samples) of a 128-dimensional vector. The first two moments (mean vector and covariance matrix) are computed empirically and stored as a representation of the style.*

  2. infer_loss: A 'subject' image is fed into the same network and the first two moments of the activations are calculated in an identical manner. The L2-Wasserstein distance between these parametrized distributions is used as a loss function.

  3. scipy_optimizer or build_image: An optimizer is invoked (either L-BFGS via SciPy or Adam, native to TensorFlow) to modify the subject image so that the distance between the distributions is minimized.

*Note: Considering only the first two moments of the activations implicitly assumes the activations follow Gaussian distributions.
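The first two steps above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the code in synthesize.py: the function names (moments, sqrtm_psd, gaussian_w2_squared) are hypothetical, random arrays stand in for VGG activations, and the distance uses the standard closed form for two Gaussians, W2^2 = ||m1 - m2||^2 + Tr(C1 + C2 - 2 (C1^(1/2) C2 C1^(1/2))^(1/2)).

```python
import numpy as np

def moments(activations):
    """Mean vector and covariance matrix of an (H, W, C) activation tensor."""
    h, w, c = activations.shape
    samples = activations.reshape(h * w, c)          # one row per pixel
    mean = samples.mean(axis=0)
    centered = samples - mean
    cov = centered.T @ centered / samples.shape[0]   # empirical (biased) covariance
    return mean, cov

def sqrtm_psd(mat):
    """Square root of a symmetric positive semi-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.clip(eigvals, 0.0, None)            # guard against tiny negative eigenvalues
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T

def gaussian_w2_squared(mean_a, cov_a, mean_b, cov_b):
    """Squared L2-Wasserstein distance between two Gaussians."""
    root_a = sqrtm_psd(cov_a)
    cross = sqrtm_psd(root_a @ cov_b @ root_a)
    mean_term = np.sum((mean_a - mean_b) ** 2)
    cov_term = np.trace(cov_a) + np.trace(cov_b) - 2.0 * np.trace(cross)
    return float(mean_term + cov_term)

# Random stand-ins for one layer's activations (assumption: 8 features, not real VGG output).
rng = np.random.default_rng(0)
style = rng.normal(size=(100, 100, 8))
subject = 2.0 * rng.normal(size=(100, 100, 8)) + 1.0

mu_s, cov_s = moments(style)
mu_x, cov_x = moments(subject)
loss = gaussian_w2_squared(mu_s, cov_s, mu_x, cov_x)
```

In the actual pipeline the loss would have to be expressed with differentiable TensorFlow ops so the optimizer in step 3 can take gradients with respect to the subject image; the NumPy version here only shows the arithmetic.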

To do:

Files:

example.ipynb - notebook that demonstrates use case, output in ipynb_example

vgg.py - script that unpacks 'imagenet-vgg-verydeep-19.mat' (download link under Requirements); citation below

synthesize.py - synthesizes an image by transferring a 'style' onto a 'subject' image

problems_gatys_fullEMD.ipynb - illustrates problems with the Gatys loss function (Frobenius norm of the difference in Gram matrices) and calculations showing the computational intractability of the full Earth Mover's distance.
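For reference, the Gram-matrix loss mentioned above can be sketched as follows. This is a hedged illustration, not the notebook's code: the names gram_matrix and gram_loss are hypothetical, and the normalization by the number of pixels is an assumption (implementations differ on constants). The toy example shows one commonly noted limitation, which may or may not be the one the notebook covers: the Gram matrix mixes mean and covariance, so two clearly different activation distributions can produce zero loss.

```python
import numpy as np

def gram_matrix(activations):
    """Gram matrix of an (H, W, C) activation tensor, averaged over pixels."""
    h, w, c = activations.shape
    feats = activations.reshape(h * w, c)
    return feats.T @ feats / (h * w)       # normalization is an assumption

def gram_loss(style_act, subject_act):
    """Squared Frobenius norm of the difference between Gram matrices."""
    diff = gram_matrix(style_act) - gram_matrix(subject_act)
    return float(np.sum(diff ** 2))

# Single-feature toy activations with different distributions:
# a has mean 0 and variance 1, b is constant (mean 1, variance 0).
a = np.array([[1.0, -1.0], [-1.0, 1.0]]).reshape(2, 2, 1)
b = np.ones((2, 2, 1))
toy_loss = gram_loss(a, b)   # both Gram matrices equal [[1.0]]
```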

Requirements:

Need VGG network weights and biases, found at: http://www.vlfeat.org/matconvnet/models/beta16/imagenet-vgg-verydeep-19.mat (MD5 8ee3263992981a1d26e73b3ca028a123). All testing done with TensorFlow 1.3 and Python 3.5.
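The MD5 checksum above can be used to confirm the download is intact. A minimal sketch using only the standard library; the helper names file_md5 and weights_ok are hypothetical and not part of this repository.

```python
import hashlib

# Checksum quoted in the Requirements section.
EXPECTED_MD5 = "8ee3263992981a1d26e73b3ca028a123"

def file_md5(path, chunk_size=1 << 20):
    """Hex MD5 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def weights_ok(path, expected=EXPECTED_MD5):
    """True if the file at `path` matches the expected checksum."""
    return file_md5(path) == expected
```

Calling weights_ok("imagenet-vgg-verydeep-19.mat") after downloading should return True if the file matches the checksum listed above.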

README work in progress.

Source of vgg.py script:

@misc{athalye2015neuralstyle,
  author = {Anish Athalye},
  title = {Neural Style},
  year = {2015},
  howpublished = {\url{https://github.com/anishathalye/neural-style}},
  note = {commit xxxxxxx}
}
