

Plug and Play Generative Networks

This repository contains source code necessary to reproduce some of the main results in the paper:

Nguyen A, Clune J, Bengio Y, Dosovitskiy A, Yosinski J. (2017). "Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space". Computer Vision and Pattern Recognition.

If you use this software in an academic article, please consider citing:

@inproceedings{nguyen2017plug,
  title={Plug \& Play Generative Networks: Conditional Iterative Generation of Images in Latent Space},
  author={Nguyen, Anh and Clune, Jeff and Bengio, Yoshua and Dosovitskiy, Alexey and Yosinski, Jason},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2017},
  organization={IEEE}
}

For more information regarding the paper, please visit www.evolvingai.org/ppgn

1. Setup

Installing software

This code is built on top of Caffe. You'll need to install the following:

  • Install Caffe; follow the official installation instructions.
  • Build the Python bindings for Caffe
  • If you want to try example 5 (image captioning), you will need to use the Caffe provided here instead
  • You can optionally build Caffe with the GPU option to make it run faster (recommended)
  • Make sure the path to your caffe/python folder in settings.py is correct (see the snippet after this list)
  • Install ImageMagick command-line interface on your system (for post-processing the images)
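
For reference, the two settings most likely to need editing are the Caffe path and the GPU flag. A minimal sketch of the relevant lines in settings.py (the variable names appear in this repository's settings.py; the values shown here are placeholders for your machine):

# settings.py -- illustrative values only; adjust to your setup.
caffe_root = "/path/to/caffe/python"  # your Caffe Python bindings (use the caffe_lrcn fork for example 5)
gpu = True                            # set to False to run on CPU (slower)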

Downloading models

You will need to download a few models to run the examples below. There are download.sh scripts provided for your convenience.

Settings:

  • Paths to the downloaded models are in settings.py. They are relative and should work if the download.sh scripts run correctly.

2. Usage

The main sampling algorithm is in sampler.py. We provide two Python scripts, one for sampling conditioned on classes and one for sampling conditioned on captions, to which you can pass various command-line arguments to run different experiments. The basic idea is to sample from the joint model p(x,y), which decomposes into a prior model p(x) (given by the generator G and encoder E) and a condition model p(y|x). Here, we provide the pre-trained networks for the Noiseless Joint PPGN-h model (Sec 3.5 in the paper). We show examples conditioning on classes, hidden neurons, and captions by plugging in different condition networks.
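
To make the sampling step concrete, here is a rough numpy sketch of one update in latent space, following the update rule described in the paper: epsilon1 weights the prior (reconstruction) term, epsilon2 the condition term, and epsilon3 the added noise. The generator/encoder/condition_gradient names below are stubs for illustration, not the actual API of sampler.py:

import numpy as np

# Stubs standing in for the Caffe networks used by the real sampler:
# G (generator), E (encoder), and a condition network (e.g. CaffeNet).
generator = lambda h: h                            # G: code -> image
encoder = lambda x: x                              # E: image -> code
condition_gradient = lambda h: np.zeros_like(h)    # d log p(y|G(h)) / dh

def sampling_step(h, epsilon1, epsilon2, epsilon3):
    d_prior = encoder(generator(h)) - h            # pull h toward E(G(h))
    d_cond = condition_gradient(h)                 # push toward the condition y
    noise = np.random.normal(0.0, epsilon3, h.shape)  # tiny by default (1e-17)
    return h + epsilon1 * d_prior + epsilon2 * d_cond + noise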

Examples

We provide here 5 different examples as a starting point. Feel free to fork away to produce even cooler results!

1_class_conditional_sampling.sh: Sampling conditioned on the class "junco" (output unit #13 of the CaffeNet DNN trained on the ImageNet dataset). This script produces a sampling chain for a single given class.

  • Running ./1_class_conditional_sampling.sh 13 produces this result:

A sampling chain conditioning on class "junco" starting from a random code (top left)

2_class_conditional_sampling_many.sh: We can also run a long sampling chain between different classes.

  • Running ./2_class_conditional_sampling_many.sh <epsilon1> with different values of epsilon1 (the multiplier for the image prior component) produces chains with different styles of samples:
    epsilon1   Effect
    1e-5       Default
    1e-3       More abstract style
    1e-1       Ignoring the class gradient

3_hidden_conditional_sampling.sh: Instead of conditioning on a class, it is possible to condition on a hidden neuron, i.e. to perform Multifaceted Feature Visualization: synthesizing a set of inputs that highly activate a given neuron in order to understand what features it has learned to detect.

  • Running ./3_hidden_conditional_sampling.sh 196 produces a set of images for conv5 neuron #196, previously identified as a "face detector" in the DeepVis toolbox:

30 samples generated by conditioning on a "face detector" conv5 neuron. It is interesting that the face detector neuron even fires for things that do not look like a face at all (e.g. the yellow house in the center)

Running the above for longer can produce many other types of faces.
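
For intuition, here is a minimal pycaffe sketch of how a hidden-unit condition differs from a class condition: instead of backpropagating from an fc8 output unit, one sets the gradient at the chosen conv5 channel and backpropagates from there. This is a sketch under assumed blob/layer names and a placeholder weights path, not the exact code in sampler.py:

import numpy as np
import caffe

# Condition network; the weights filename here is a placeholder.
net = caffe.Net('nets/caffenet/caffenet.prototxt',
                'nets/caffenet/caffenet.caffemodel', caffe.TEST)

unit = 196                                     # the "face detector" conv5 neuron
batch = np.zeros(net.blobs['data'].data.shape, dtype=np.float32)  # stand-in input
net.forward(data=batch)

grad = np.zeros_like(net.blobs['conv5'].diff)
grad[0, unit] = 1.0                            # ascend this channel's activation
net.blobs['conv5'].diff[...] = grad
net.backward(start='conv5')                    # backprop from conv5, not fc8
d_condition = net.blobs['data'].diff.copy()    # gradient w.r.t. the input image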

4_hidden_conditional_sampling_placesCNN.sh: One can repeat the example above with an arbitrary neuron in a different condition network. Here, we visualize conv5 neuron #182 in the AlexNet DNN trained on the MIT Places205 dataset. This neuron was previously identified as a "food detector" in Zhou et al. [2].

  • Running ./4_hidden_conditional_sampling_placesCNN.sh 182 produces this result:

30 random samples that highly activate a "food detector" conv5 neuron.

5_caption_conditional_sampling.sh: We can also replace the image classifier network in the previous examples with a pre-trained image captioning network to form a text-to-image model, without re-training anything. The image captioning model in this example is the LRCN model from Donahue et al. (2015) [1].

  • You will need to use the Caffe provided here and update the path to Caffe accordingly in settings.py

  • The list of supported words is here

  • Running ./5_caption_conditional_sampling.sh a_church_steeple_that_has_a_clock_on_it produces this result:

Note that we often obtain mixed results with this particular text-to-image model. For some words, it works pretty well, but for others it struggles to produce reasonable images. While the language space in this model still needs further exploration, as a starting point, here are some sentences that produce reasonable images.

6_class_conditional_sampling_from_real_image.sh: One can also initialize the sampling from a real image (here, images/volcano.jpg).

  • Running ./6_class_conditional_sampling_from_real_image.sh 980 produces this result:
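
Under the hood, initializing from a real image means encoding it into the latent space first (via E), then sampling from that code instead of a random one. A rough sketch of that step, assuming a pycaffe encoder net and the fc6 code layer used elsewhere in this README (the helper name is illustrative):

import caffe

def get_start_code(encoder, image_path, layer='fc6'):
    # Encode a real image into a latent code that seeds the sampling chain.
    # CaffeNet-style preprocessing; mean subtraction omitted for brevity.
    t = caffe.io.Transformer({'data': encoder.blobs['data'].data.shape})
    t.set_transpose('data', (2, 0, 1))         # HWC -> CHW
    t.set_channel_swap('data', (2, 1, 0))      # RGB -> BGR
    t.set_raw_scale('data', 255.0)             # [0, 1] -> [0, 255]
    img = caffe.io.load_image(image_path)
    encoder.blobs['data'].data[0] = t.preprocess('data', img)
    encoder.forward()
    return encoder.blobs[layer].data.copy()

# e.g. start_code = get_start_code(encoder_net, 'images/volcano.jpg')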

7_inpainting.sh: One can also perform "inpainting" i.e. predicting the missing pixels given the observed ones.

  • Running ./7_inpainting.sh produces this result:

In each pair, the left is a real image with a random 100x100 patch masked out. The right is the result of PPGNs filling in the patch.
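
Conceptually, inpainting reuses the same sampler but clamps the observed pixels after every step, so only the masked patch is free to change. A heavily simplified sketch of that constraint (the mask convention and helper are assumptions, not the repository's exact code):

import numpy as np

def clamp_observed(x_real, x_sampled, mask):
    # mask == 1 inside the missing (e.g. random 100x100) patch, 0 elsewhere;
    # observed pixels are copied back from the real image after each step.
    return mask * x_sampled + (1 - mask) * x_real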

Using your own condition models

  • If using your own condition network, you should search for the parameters that produce the best images for your model (epsilon1, epsilon2, epsilon3, or learning rates). One simple way to do this is to sweep across different parameters, as in the sketch after this list.
  • Note that this script should work with TensorFlow models as well, because both frameworks expose their models through Python.
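
A minimal sweep, using the command-line flags shown by sampling_class.py (--units, --epsilon1, --epsilon3, --output_dir); the grids below are hypothetical starting points, and you should mirror 1_class_conditional_sampling.sh for any additional flags your setup requires:

import itertools
import subprocess

grid_eps1 = ['1e-5', '1e-3', '1e-1']   # image-prior multiplier
grid_eps3 = ['1e-17', '1e-11']         # noise multiplier

for eps1, eps3 in itertools.product(grid_eps1, grid_eps3):
    out_dir = 'output/sweep_eps1_%s_eps3_%s' % (eps1, eps3)
    subprocess.check_call([
        'python', 'sampling_class.py',
        '--units', '13',               # unit to condition on
        '--epsilon1', eps1,
        '--epsilon3', eps3,
        '--output_dir', out_dir,
        # ...plus any other flags your network needs (see the .sh scripts)
    ])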

3. Ideas

Here are a few (crazy?) ideas one could try with PPGNs:

  1. Generate objects in a specific region of the image, similar to Learning What and Where to Draw (Reed et al. 2016), by conditioning on a region of the last heatmap of a fully convolutional classification network or a semantic segmentation network.
  2. Plug in a better image captioning model, e.g. Show and Tell.
  3. Synthesize a music video by sampling images conditioned on lyrics.
  4. There are more and crazier ideas to try with PPGNs; feel free to reach out if you want to chat.

4. Licenses

Note that the code in this repository is licensed under the MIT License, but the pre-trained condition models used by the code have their own licenses. Please check them carefully before use.

5. Questions?

If you have questions or suggestions, please feel free to email, tweet to @anh_ng8, or create GitHub issues.

References

[1] Donahue et al. "Long-term Recurrent Convolutional Networks for Visual Recognition and Description". CVPR 2015

[2] Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. "Object detectors emerge in deep scene CNNs". ICLR 2015.


ppgn's Issues

How to get larger images from this experiment?

Hi there,

How could I get larger images out of this code? I want something like 1920x1080, ideally. Is there a simple change to the code that could produce this?

Thanks, Josh.

Embed layer error

Hi, I'm trying to run the 5th (captioning) example, but I'm stuck with this error:

WARNING: Logging before InitGoogleLogging() is written to STDERR
F1221 08:09:06.817744 21017 embed_layer.cu:61] Check failed: !propagate_down[0] Can't backpropagate to EmbedLayer input.
*** Check failure stack trace: ***
./5_caption_conditional_sampling.sh: line 52: 21017 Aborted (core dumped) python ./sampling_caption.py --act_layer ${act_layer} --opt_layer ${opt_layer} --sentence ${sentence} --xy ${xy} --n_iters ${n_iters} --save_every ${save_every} --reset_every ${reset_every} --lr ${lr} --lr_end ${lr_end} --seed ${seed} --output_dir ${output_dir} --init_file ${init_file} --epsilon1 ${epsilon1} --epsilon2 ${epsilon2} --epsilon3 ${epsilon3} --threshold ${threshold} --net_weights ${net_weights} --net_definition ${net_definition} --captioner_definition ${captioner_definition}

I'm using the master version from the Caffe repo instead of yours, because the forked branch is older and gives me compatibility problems with cuDNN, but I checked that there are no differences between the two embed_layer files. Could you please help me with this problem, or suggest which layers or files I need from your Caffe branch to make the script work?

Can't backpropagate to EmbedLayer input

I am trying to run 5_caption_conditional_sampling.sh, but it gives this error:
F0311 20:36:36.194001 12299 embed_layer.cu:61] Check failed: !propagate_down[0] Can't backpropagate to EmbedLayer input.
I am using CUDA 8 and cuDNN, and the latest version of Caffe from git, whose master branch already includes the RNN and LSTM layers.
Note that https://github.com/anguyen8/caffe_lrcn is a clone of http://jeffdonahue.com/lrcn/, whose pull request has already been sent and merged (see BVLC/caffe#3948).

I even tried to change the lrcn_word_to_preds.deploy.prototxt file by adding
propagate_down: true
to the Embed layer:

layer {
  name: "embedding"
  type: "Embed"
  propagate_down: true
  bottom: "input_sentence"
  top: "embedded_input_sentence"
  embed_param {
    input_dim: 8801
    num_output: 1000
    bias_term: false
  }
}

But the same error persists. What am I doing wrong? Am I just on the wrong track?

Error with Ex. 5 Image Captioning

@anguyen8 I'm trying to run the fifth example, but I encounter the following error:

root@8c4e9b11f13b:~/ppgn# ./5_caption_conditional_sampling.sh a_church_steeple_that_has_a_clock_on_it
libdc1394 error: Failed to initialize libdc1394
-------------
 sentence: a_church_steeple_that_has_a_clock_on_it
 n_iters: 200
 reset_every: 0
 save_every: 0
 threshold: 0.0
 epsilon1: 0.001
 epsilon2: 1.0
 epsilon3: 1e-17
 start learning rate: 1.0
 end learning rate: 1e-10
 seed: 0
 opt_layer: fc6
 act_layer: fc8
 init_file: None
-------------
 output dir: output/fc8_eps1_1e-3_eps3_1e-17/a_church_steeple_that_has_a_clock_on_it
 net weights: nets/lrcn/lrcn_caffenet_iter_110000.caffemodel
 net definition: nets/caffenet/caffenet.prototxt
 captioner definition: nets/lrcn/lrcn_word_to_preds.deploy.prototxt
-------------
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0123 16:11:34.156642  3730 embed_layer.cu:61] Check failed: !propagate_down[0] Can't backpropagate to EmbedLayer input.
*** Check failure stack trace: ***
./5_caption_conditional_sampling.sh: line 52:  3730 Aborted                 (core dumped) python ./sampling_caption.py --act_layer ${act_layer} --opt_layer ${opt_layer} --sentence ${sentence} --xy ${xy} --n_iters ${n_iters} --save_every ${save_every} --reset_every ${reset_every} --lr ${lr} --lr_end ${lr_end} --seed ${seed} --output_dir ${output_dir} --init_file ${init_file} --epsilon1 ${epsilon1} --epsilon2 ${epsilon2} --epsilon3 ${epsilon3} --threshold ${threshold} --net_weights ${net_weights} --net_definition ${net_definition} --captioner_definition ${captioner_definition}
libdc1394 error: Failed to initialize libdc1394

Which is followed by:

WARNING: Logging before InitGoogleLogging() is written to STDERR
F0123 16:11:38.973918  3734 embed_layer.cu:61] Check failed: !propagate_down[0] Can't backpropagate to EmbedLayer input.
*** Check failure stack trace: ***
./5_caption_conditional_sampling.sh: line 52:  3734 Aborted                 (core dumped) python ./sampling_caption.py --act_layer ${act_layer} --opt_layer ${opt_layer} --sentence ${sentence} --xy ${xy} --n_iters ${n_iters} --save_every ${save_every} --reset_every ${reset_every} --lr ${lr} --lr_end ${lr_end} --seed ${seed} --output_dir ${output_dir} --init_file ${init_file} --epsilon1 ${epsilon1} --epsilon2 ${epsilon2} --epsilon3 ${epsilon3} --threshold ${threshold} --net_weights ${net_weights} --net_definition ${net_definition} --captioner_definition ${captioner_definition}
libdc1394 error: Failed to initialize libdc1394

And then:

WARNING: Logging before InitGoogleLogging() is written to STDERR
F0123 16:11:43.888489  3738 embed_layer.cu:61] Check failed: !propagate_down[0] Can't backpropagate to EmbedLayer input.
*** Check failure stack trace: ***
./5_caption_conditional_sampling.sh: line 52:  3738 Aborted                 (core dumped) python ./sampling_caption.py --act_layer ${act_layer} --opt_layer ${opt_layer} --sentence ${sentence} --xy ${xy} --n_iters ${n_iters} --save_every ${save_every} --reset_every ${reset_every} --lr ${lr} --lr_end ${lr_end} --seed ${seed} --output_dir ${output_dir} --init_file ${init_file} --epsilon1 ${epsilon1} --epsilon2 ${epsilon2} --epsilon3 ${epsilon3} --threshold ${threshold} --net_weights ${net_weights} --net_definition ${net_definition} --captioner_definition ${captioner_definition}
montage.im6: unable to open image `output/fc8_eps1_1e-3_eps3_1e-17/a_church_steeple_that_has_a_clock_on_it/fc8_*.jpg': No such file or directory @ error/blob.c/OpenBlob/2641.
montage.im6: missing an image filename `output/fc8_eps1_1e-3_eps3_1e-17/a_church_steeple_that_has_a_clock_on_it/a_church_steeple_that_has_a_clock_on_it.jpg' @ error/montage.c/MontageImageCommand/1790.
convert.im6: unable to open image `output/fc8_eps1_1e-3_eps3_1e-17/a_church_steeple_that_has_a_clock_on_it/a_church_steeple_that_has_a_clock_on_it.jpg': No such file or directory @ error/blob.c/OpenBlob/2641.
convert.im6: no images defined `output/fc8_eps1_1e-3_eps3_1e-17/a_church_steeple_that_has_a_clock_on_it/a_church_steeple_that_has_a_clock_on_it.jpg' @ error/convert.c/ConvertImageCommand/3044.
convert.im6: unable to open image `output/fc8_eps1_1e-3_eps3_1e-17/a_church_steeple_that_has_a_clock_on_it/a_church_steeple_that_has_a_clock_on_it.jpg': No such file or directory @ error/blob.c/OpenBlob/2641.
convert.im6: no images defined `output/fc8_eps1_1e-3_eps3_1e-17/a_church_steeple_that_has_a_clock_on_it/a_church_steeple_that_has_a_clock_on_it.jpg' @ error/convert.c/ConvertImageCommand/3044.
/root/ppgn/output/fc8_eps1_1e-3_eps3_1e-17/a_church_steeple_that_has_a_clock_on_it/a_church_steeple_that_has_a_clock_on_it.jpg

I cloned the caffe_lrcn from the source you linked, and I updated the settings.py file accordingly: caffe_root = "~/caffe_lrcn/python", where my tree is structured as follows:

drwxr-xr-x 20 root root 4096 Jan 23 16:05 caffe
drwxr-xr-x 13 root root 4096 Jan 23 15:33 caffe_lrcn
drwxr-xr-x 13 root root 4096 Jan 23 16:17 ppgn

I don't have any problems running the other four examples. Also, I'm getting the Failed to initialize libdc1394 error because I'm running this in a Docker container; others have reported a similar problem, though it usually doesn't appear to cause any issues.

Any thoughts? Thanks!

Errors in running first example code

When I run sh 1_class_conditional_sampling.sh 13, I get the following error:

WARNING: Logging before InitGoogleLogging() is written to STDERR
F0105 05:49:58.930766   529 common.cpp:157] Check failed: error == cudaSuccess (10 vs. 0)  invalid device ordinal
*** Check failure stack trace: ***
Aborted (core dumped)
ls: cannot access output/fc8_chain_13_eps1_1e-5_eps3_1e-17/samples/*.jpg: No such file or directory
montage.im6: missing an image filename `output/fc8_chain_13_eps1_1e-5_eps3_1e-17/chain_13_hx_1e-5_noise_1e-17__{0..0}.jpg' @ error/montage.c/MontageImageCommand/1790.
/root/caffe/ppgn/output/fc8_chain_13_eps1_1e-5_eps3_1e-17/chain_13_hx_1e-5_noise_1e-17__{0..0}.jpg

I believe I've properly set the path for caffe/python in the settings.py file, but changing the path has no effect; I continue to get the same error.

I'm using CUDA 8.0 (V8.0.44) and driver version 375.26, which should be compatible.

When I change the settings.py to gpu = False, I get the following error:

usage: sampling_class.py [-h] [--units units] [--n_iters iter]
                         [--threshold [w]] [--save_every save_iter]
                         [--reset_every reset_iter] [--lr [lr]]
                         [--lr_end [lr]] [--epsilon2 [lr]] [--epsilon1 [lr]]
                         [--epsilon3 [lr]] [--seed [n]] [--xy [n]]
                         [--opt_layer s] [--act_layer s] [--init_file s]
                         [--write_labels] [--output_dir b] [--net_weights b]
                         [--net_definition b]
sampling_class.py: error: argument --seed: invalid int value: '{0..0}'
ls: cannot access output/fc8_chain_13_eps1_1e-5_eps3_1e-17/samples/*.jpg: No such file or directory
montage.im6: missing an image filename `output/fc8_chain_13_eps1_1e-5_eps3_1e-17/chain_13_hx_1e-5_noise_1e-17__{0..0}.jpg' @ error/montage.c/MontageImageCommand/1790.
/root/caffe/ppgn/output/fc8_chain_13_eps1_1e-5_eps3_1e-17/chain_13_hx_1e-5_noise_1e-17__{0..0}.jpg

Looking through this directory, I found that the samples folder is empty: output/fc8_chain_13_eps1_1e-5_eps3_1e-17/samples/

Any thoughts? Any tips for finding the failure stack trace? Thanks!

'get_code' is not defined - when using init_file

Hey @anguyen8

I'm trying to start the generation from an image (this is the function of init_file, right?). It all runs fine with 'None' for the init_file path,

but with a path for init_file I'm getting this error (the jpg file's path and dimensions are definitely correct):

Traceback (most recent call last):
  File "./sampling_class.py", line 214, in <module>
    main()
  File "./sampling_class.py", line 166, in main
    start_code, start_image = get_code(encoder=encoder, path=args.init_file, layer=args.opt_layer)
NameError: global name 'get_code' is not defined

Not getting good results while using my own Condition Network

I'm trying to use my own condition network to visualize some neurons in the conv5_2 layer. The network has different layer names, so I changed self.fc_layers and self.conv_layers in sampling_class.py. I also changed 3_hidden_conditional_sampling.sh accordingly.

I tried sweeping across the epsilon1, epsilon2, epsilon3 and learning rate parameters and performed 5000 iterations, but the network fails to generate good images with high output probabilities.

I'm not sure if I am sweeping across the right parameters; some of the values I have tried are:

lr=(0.0005 0.005 0.05 1) 
epsilon1=(5 1 1e-1 1e-3 1e-7 1e-11 1e-15)
epsilon2=(0.00001 0.0001 0.05 0.5 1 2)
epsilon3=(5 1 1e-1 1e-3 1e-7 1e-11)

Is there anything else I need to change to get it working for my own condition network, or how would you recommend proceeding in this case?

AttributeError: 'NoneType' object has no attribute 'split'

Hi @anguyen8

Thanks for sharing the code. I'm trying to run the first example you provide; however, when I run "python sampling_class.py", I get this error:

Traceback (most recent call last):
  File "sampling_class.py", line 272, in <module>
    main()
  File "sampling_class.py", line 230, in main
    conditions = [ { "unit": int(u), "xy": args.xy } for u in args.units.split("_") ]
AttributeError: 'NoneType' object has no attribute 'split'

Any idea of what's going on?
