
mjdietzx / simgan

411 stars · 23 watchers · 101 forks · 31 KB

Implementation of Apple's Learning from Simulated and Unsupervised Images through Adversarial Training

License: MIT License

Language: Python 100.00%

Topics: gan, deep-learning, generative-adversarial-network, neural-networks, tensorflow, keras, machine-learning, simulations, simulated-unsupervised-learning

simgan's Introduction

SimGAN

Keras implementation of Apple's Learning from Simulated and Unsupervised Images through Adversarial Training

Running

Install dlutils from https://github.com/wayaai/deep-learning-utils:

$ pip install -U git+https://github.com/wayaai/deep-learning-utils.git

or

$ git clone https://github.com/wayaai/deep-learning-utils.git
$ cd deep-learning-utils
$ python setup.py install develop

Then run the training script:

$ python3 sim-gan.py PATH_TO_SYNTHESEYES_DATASET PATH_TO_MPII_GAZE_DATASET

In Apple's paper they use UnityEyes to generate ~1.2 million synthetic images. I am on a Mac, though, so I just used the easily available SynthesEyes dataset. It is small (only ~11,000 images), so it would be much better if someone could generate a larger dataset with UnityEyes and share it on S3.

The dataset of real images used in Apple's paper is the MPIIGaze dataset. They use the normalized images provided in this dataset, which are stored in MATLAB files. It was a bit of a pain to get these into an easily usable form, so I'm sharing the ready-to-go datasets on S3.
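For anyone who wants to redo the conversion themselves, here is a minimal sketch of the idea: load each normalized .mat file with scipy and dump the stacked images into a single file. The field names ('data', 'image') and the HDF5 output format are assumptions for illustration, not necessarily what this repo expects.

    # Minimal sketch: collect MPIIGaze normalized images from .mat files into one HDF5 file.
    # The field names below ('data', 'image') are assumptions; inspect a file with
    # scipy.io.whosmat() first and adjust the nesting to the actual structure.
    import glob

    import h5py
    import numpy as np
    import scipy.io

    images = []
    for mat_path in glob.glob('MPIIGaze/Data/Normalized/**/*.mat', recursive=True):
        mat = scipy.io.loadmat(mat_path)
        images.append(np.asarray(mat['data']['image'][0, 0]))  # hypothetical nesting

    with h5py.File('mpii_gaze_normalized.h5', 'w') as f:
        f.create_dataset('image', data=np.concatenate(images, axis=0))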

Ready-to-go datasets

Details

Implementation of 3.1 Appearance-based Gaze Estimation on the UnityEyes and MPIIGaze datasets, as described in the paper.

  • Currently only Python 3 is supported.
  • TensorFlow support, and maybe PyTorch support in the future.

Implementation

This is meant to be a lightweight and clean implementation that is easy to understand - no deep shit. It can also be used as a resource for understanding GANs in general and how they can be implemented.
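To give a rough map of how the pieces fit together, below is a minimal sketch of the three compiled models involved in training (refiner, discriminator, and the combined model), using toy dense layers and the built-in 'mae'/'binary_crossentropy' losses as stand-ins for the paper's convolutional networks, self-regularization loss and local adversarial loss; it is not the repository's actual code.

    # Toy wiring sketch of SimGAN's three models; dense layers and built-in losses are
    # stand-ins for the paper's convolutional nets and custom losses.
    from keras import layers, models

    img = layers.Input(shape=(32,))                 # stand-in for a synthetic image
    refined = layers.Dense(32)(img)                 # stand-in for the refiner R_theta
    refiner_model = models.Model(img, refined)

    d_in = layers.Input(shape=(32,))
    d_hidden = layers.Dense(16, activation='relu')(d_in)
    discriminator_model = models.Model(d_in, layers.Dense(1, activation='sigmoid')(d_hidden))

    refiner_model.compile(optimizer='sgd', loss='mae')                        # ~ self-regularization
    discriminator_model.compile(optimizer='sgd', loss='binary_crossentropy')  # ~ adversarial
    discriminator_model.trainable = False           # frozen only inside the combined model
    combined_model = models.Model(img, [refined, discriminator_model(refined)])
    combined_model.compile(optimizer='sgd', loss=['mae', 'binary_crossentropy'])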

Running Online

You can see an interactive Jupyter Notebook version of this script with training data on Kaggle, or just the raw training set.

About waya.ai

Waya.ai is a company whose vision is a world where medical conditions are addressed early on, in their infancy. This approach will shift the healthcare industry from a constant firefight against symptoms to a preventative approach where root causes are addressed and fixed. Our first step toward realizing this vision is easy, accurate and available diagnosis. Please get in contact with me if this resonates with you!

simgan's People

Contributors

kmader, liquitious, mjdietzx


simgan's Issues

Semantic Image Segmentation

Hello,

Can this method be used for semantic image segmentation? I have a dataset of unlabeled real images and a dataset of labeled synthetic images. Will the high resolution of the images be a problem?

Facial Images

I use this code to reconstruct facial images.

But the results aren't good enough.

One doubt: I've organized my dataset in

    data/
        synthetic/
            [images]
        real/
            [images]

Is this the right way to use the code?
Thanks.

A question about the k_g parameter

In the paper, they set k_d = 1 and k_g = 2:

Then, for each update of Dφ, we update Rθ twice.

But in the code:
k_g = 2 # number of generative network updates per step
for _ in range(k_g * 2):

I think `k_g` should not be multiplied by 2.
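For what it's worth, Algorithm 1 in the paper alternates k_g refiner updates with k_d discriminator updates per outer step. A self-contained toy sketch of just that schedule (dense stand-ins and random data, like the wiring sketch in the Implementation section above; not this repo's actual training loop):

    # Schematic of Algorithm 1's update schedule (k_d = 1, k_g = 2) on toy models.
    import numpy as np
    from keras import layers, models

    img = layers.Input(shape=(32,))
    refiner = models.Model(img, layers.Dense(32)(img))
    d_in = layers.Input(shape=(32,))
    discriminator = models.Model(d_in, layers.Dense(1, activation='sigmoid')(d_in))

    discriminator.compile(optimizer='sgd', loss='binary_crossentropy')
    discriminator.trainable = False
    synth_in = layers.Input(shape=(32,))
    refined_t = refiner(synth_in)
    combined = models.Model(synth_in, [refined_t, discriminator(refined_t)])
    combined.compile(optimizer='sgd', loss=['mae', 'binary_crossentropy'])

    k_d, k_g = 1, 2
    for step in range(10):
        for _ in range(k_g):                        # update the refiner k_g times ...
            synthetic = np.random.rand(16, 32)
            combined.train_on_batch(synthetic, [synthetic, np.ones((16, 1))])
        for _ in range(k_d):                        # ... then the discriminator k_d times
            refined = refiner.predict(np.random.rand(16, 32))
            real = np.random.rand(16, 32)
            discriminator.train_on_batch(np.concatenate([refined, real]),
                                         np.concatenate([np.zeros((16, 1)), np.ones((16, 1))]))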

Incompatible with current version of Keras

Hi,
I've found a couple of issues related to using this code with Keras 2.2.4 in sim-gan.py that cause errors.

  1. Line 72 needs to be changed to `y = layers.merge.Add()([input_features, y])` (a sketch of the resulting residual block follows below).
  2. Line 177 needs to be changed to `data_format='channels_last')`.

Thanks for providing the code, it's very helpful.
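For context, a residual block written against the Keras 2 API might look roughly like this (a sketch following the paper's description of two 3x3 convolutions per ResNet block, not the repository's exact refiner code):

    # Sketch of a refiner residual block using the Keras 2 API (Conv2D / layers.Add);
    # the caller must pass a tensor that already has nb_features channels.
    from keras import layers

    def residual_block(input_features, nb_features=64):
        y = layers.Conv2D(nb_features, (3, 3), padding='same', activation='relu')(input_features)
        y = layers.Conv2D(nb_features, (3, 3), padding='same')(y)
        y = layers.Add()([input_features, y])   # the "line 72" fix from item 1 above
        return layers.Activation('relu')(y)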

Why is the output shape of the discriminator (None, 2)?

Hi everyone. I'm new to GANs.
Looking at the following code:

    def discriminator_network(input_image_tensor):
        """
        The discriminator network, Dφ, contains 5 convolution layers and 2 max-pooling layers.

        :param input_image_tensor: Input tensor corresponding to an image, either real or refined.
        :return: Output tensor that corresponds to the probability of whether an image is real or refined.
        """
        x = layers.Convolution2D(96, 3, 3, border_mode='same', subsample=(2, 2), activation='relu')(input_image_tensor)
        x = layers.Convolution2D(64, 3, 3, border_mode='same', subsample=(2, 2), activation='relu')(x)
        x = layers.MaxPooling2D(pool_size=(3, 3), border_mode='same', strides=(1, 1))(x)
        x = layers.Convolution2D(32, 3, 3, border_mode='same', subsample=(1, 1), activation='relu')(x)
        x = layers.Convolution2D(32, 1, 1, border_mode='same', subsample=(1, 1), activation='relu')(x)
        x = layers.Convolution2D(2, 1, 1, border_mode='same', subsample=(1, 1), activation='relu')(x)

        # here one feature map corresponds to `is_real` and the other to `is_refined`,
        # and the custom loss function is then `tf.nn.sparse_softmax_cross_entropy_with_logits`
        return layers.Reshape((-1, 2))(x)

How should I understand 'here one feature map corresponds to is_real and the other to is_refined'?
Usually, the discriminator outputs only one feature map, indicating the probability that the input is real or fake.
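One way to read that comment: the discriminator classifies every local image patch (receptive field) separately, so instead of a single scalar it produces two logits per spatial position, one for 'real' and one for 'refined', flattened to shape (num_patches, 2); the adversarial loss is then a softmax cross-entropy averaged over all patches. A rough TensorFlow sketch of that idea (not the repo's exact local_adversarial_loss):

    # Sketch of a per-patch ("local") adversarial loss: treat the (num_patches, 2) output
    # as two-way logits per patch and average softmax cross-entropy over all patches.
    import tensorflow as tf

    def local_adversarial_loss_sketch(y_true, y_pred):
        # y_pred: (batch, num_patches, 2) logits; y_true: (batch, num_patches) integer
        # labels, e.g. 1 for every patch of a real image and 0 for a refined one
        per_patch = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=tf.cast(y_true, tf.int32), logits=y_pred)
        return tf.reduce_mean(per_patch)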

I cannot understand why discriminator_model.trainable is kept False all the time

The code in sim-gan.py:

    refiner_model.compile(optimizer=sgd, loss=self_regularization_loss)
    discriminator_model.compile(optimizer=sgd, loss=local_adversarial_loss)
    discriminator_model.trainable = False
    combined_model.compile(optimizer=sgd, loss=[self_regularization_loss, local_adversarial_loss])

I think that when pre-training the discriminator network it should be changed to True, and when running Algorithm 1 it should be toggled alternately.

Maybe I'm misunderstanding it; can you explain it to me? Thanks!!
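One possible explanation (for standalone Keras 2.x): the set of trainable weights is captured when a model is compiled, so the discriminator, compiled before the flag is flipped, keeps training when called directly, while the combined model, compiled after discriminator_model.trainable = False, never updates it; no toggling is needed. A tiny self-contained demonstration of that behaviour with toy models:

    # Demonstrates Keras 2.x compile-time capture of trainable weights on toy models.
    import numpy as np
    from keras import layers, models

    disc = models.Sequential([layers.Dense(1, input_shape=(4,))])
    disc.compile(optimizer='sgd', loss='mse')    # trainable state captured here (True)

    inp = layers.Input(shape=(4,))
    refined = layers.Dense(4)(inp)               # stand-in for the refiner
    disc.trainable = False                       # only affects models compiled afterwards
    combined = models.Model(inp, disc(refined))
    combined.compile(optimizer='sgd', loss='mse')

    x, y = np.random.rand(8, 4), np.random.rand(8, 1)

    w = disc.get_weights()[0].copy()
    disc.train_on_batch(x, y)                    # discriminator still trains directly
    print(np.allclose(w, disc.get_weights()[0])) # -> False (weights changed)

    w = disc.get_weights()[0].copy()
    combined.train_on_batch(x, y)                # refiner trains, discriminator is frozen
    print(np.allclose(w, disc.get_weights()[0])) # -> True (weights unchanged)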

Pre-trained model?

Would it be possible for someone to share a pre-trained model? I would be grateful, because I tried to train it myself, but with no luck.

The performance of training

Hi! Grateful for your work!
I ran the code, setting the number of epochs to 6000, but the performance seems to be poor. The refined images show no difference from the synthesized images.
Have you ever tried a longer training run and gotten a good result?
Thank you.

GPU utilization

Hi,

Even though the net takes almost all of the GPU memory, the "Volatile GPU util" remains at 0. Therefore, I conclude that the net doesn't really run on the GPU (the very slow training also suggests this).
I am running SimGAN with TensorFlow 1.3 and Keras 2.0.6 (I have changed Convolution2D to Conv2D, even though the original behaved the same way).
Could you tell me the running configuration (TensorFlow and Keras versions) that you used for testing?

Thank you!
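As a quick sanity check, you can ask the TensorFlow backend which devices it sees (TF 1.x style):

    # List the devices visible to TensorFlow; if no GPU device shows up, the problem is
    # the TensorFlow/CUDA installation rather than this repository's code.
    from tensorflow.python.client import device_lib

    print([device.name for device in device_lib.list_local_devices()])
    # expect an entry like '/gpu:0' (or '/device:GPU:0' on newer versions)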

Bug in image_history_buffer

In "add_to_image_history_buffer" I believe that you have to assign the result of the np.append() to the image buffer array:

self.image_history_buffer = np.append(self.image_history_buffer, images[:nb_to_add], axis=0)

Otherwise the value is not updated. I had to make this change in my implementation to get it to work.
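For reference, np.append always returns a new array and never modifies its arguments, so the method has to rebind the attribute, roughly like this (class and method signatures are simplified for illustration, not copied from the repo):

    # Sketch of the fix: rebind self.image_history_buffer to np.append's return value.
    import numpy as np

    class ImageHistoryBuffer(object):
        def __init__(self, image_shape, max_size):
            # empty buffer with the per-image shape, e.g. image_shape = (height, width, channels)
            self.image_history_buffer = np.zeros((0,) + image_shape)
            self.max_size = max_size

        def add_to_image_history_buffer(self, images, nb_to_add):
            if len(self.image_history_buffer) + nb_to_add <= self.max_size:
                self.image_history_buffer = np.append(
                    self.image_history_buffer, images[:nb_to_add], axis=0)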
