newmu / dcgan_code

Deep Convolutional Generative Adversarial Networks

License: MIT License

Python 100.00%

dcgan_code's Issues

Specify a license for the code

Awesome project!!

It would be great if you could specify a license for your code by adding a LICENSE file to the root of the git repo. If you're unfamiliar with source code licensing, check out http://choosealicense.com/

(shameless plug -- I'm a fan of the "GPL, version 2 or later" license because, in the terms of http://choosealicense.com/ I "care about sharing improvements".)

binary data?

what is the best way to make this work effectively on binary data?
i'm assuming the network won't learn to output binary on its own, even if all the inputs are binary. trivially you can just threshold each output value at 0.5, but is there a better way to do this? i'm hoping to take advantage of having only two states to get away with using a larger input vector.

How can I add my picture in Arithmetic on faces in part three

A small question regarding conv_cond_concat

Hi, recently I've been studying your code, especially on the conditional DCGAN you made for MNIST dataset.

I see that you concatenate the condition at every layer right after BatchNorm and ReLU, but I am still puzzled by the conv_cond_concat function that you use to concatenate the condition into the hidden layers. On some layers you simply use T.concatenate to join them, but on other layers you join them using the conv_cond_concat function shown below:

def conv_cond_concat(x, y):
    """ 
    concatenate conditioning vector on feature map axis 
    """
    return T.concatenate([x, y*T.ones((x.shape[0], y.shape[1], x.shape[2], x.shape[3]))], axis=1)

My questions are:

1. Why use this function instead of a simple T.concatenate?
2. Judging from the reshaping of y, I assume you are depth-concatenating it. Am I correct?
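
To make the shapes in the question concrete, here is a NumPy analogue of the function quoted above. It is only a sketch: the name conv_cond_concat_np, the example sizes, and the assumption that y has already been reshaped to (batch, ny, 1, 1) are illustrative and not taken from the repository. A plain concatenate on axis 1 would fail here because x and y differ on the two spatial axes, so y is first tiled over the spatial grid by multiplying with a tensor of ones. That tensor takes its channel count from y.shape[1] because it is y that is being tiled.

```python
import numpy as np

def conv_cond_concat_np(x, y):
    """NumPy analogue: tile y over the spatial grid of x, then join on the channel axis."""
    # x: (batch, channels, height, width); y: (batch, ny, 1, 1)
    ones = np.ones((x.shape[0], y.shape[1], x.shape[2], x.shape[3]), dtype=x.dtype)
    return np.concatenate([x, y * ones], axis=1)

x = np.zeros((2, 64, 7, 7), dtype=np.float32)   # hypothetical feature maps
labels = np.eye(10, dtype=np.float32)[[3, 7]]   # one-hot labels, shape (2, 10)
y = labels.reshape(2, 10, 1, 1)                 # assumed conditioning layout
out = conv_cond_concat_np(x, y)
print(out.shape)  # (2, 74, 7, 7)
```

The result has x.shape[1] + y.shape[1] channels, which is what a depth-wise (feature map axis) concatenation would produce.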

CudaNdarrayType only supports dtype float32 for now

In the mnist training code, i get the following traceback. looks like a CUDA versioning issue? do i need to upgrade CUDA or is there some way around this?

gX = gen(Z, Y, *gen_params)

Traceback (most recent call last):
  File "", line 1, in <module>
  File "", line 9, in gen
  File "lib/ops.py", line 90, in deconv
    img = gpu_contiguous(X)
  File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line 509, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/sandbox/cuda/basic_ops.py", line 3806, in make_node
    input = as_cuda_ndarray_variable(input)
  File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/sandbox/cuda/basic_ops.py", line 47, in as_cuda_ndarray_variable
    return gpu_from_host(tensor_x)
  File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line 509, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/sandbox/cuda/basic_ops.py", line 139, in make_node
    dtype=x.dtype)()])
  File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/sandbox/cuda/type.py", line 70, in __init__
    (self.__class__.__name__, dtype, name))
TypeError: CudaNdarrayType only supports dtype float32 for now. Tried using dtype float64 for variable None
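
The message itself points at a dtype mismatch (a float64 variable reaching a GPU op that accepts only float32) rather than at the CUDA version. In Theano, symbolic variables such as T.matrix() take their dtype from the floatX configuration value, so checking that floatX is set to float32 (for example through THEANO_FLAGS) is a reasonable first step. The NumPy side of the same problem can be sketched as follows; to_float32 is a hypothetical helper, not a function from this repository.

```python
import numpy as np

def to_float32(x):
    """Cast array-like input to float32 (hypothetical helper)."""
    return np.asarray(x, dtype=np.float32)

rng = np.random.RandomState(42)
z64 = rng.uniform(-1., 1., size=(4, 100))  # NumPy defaults to float64
z32 = to_float32(z64)
print(z64.dtype, z32.dtype)  # float64 float32
```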

how to get paper's dataset?

I can't find the code to download the data.
I need help. Thanks.

how to deal with overfitting?

It would be nice if the README contained some tips on how to detect and avoid overfitting.
I'm running the system and I believe I am overfitting.

So here are my tips, which need editing before I put them in any README because I'm sure I don't understand the entire picture:

Currently I'm running on 50K examples so perhaps this is the main cause of the problem.
Also, the output from running looks like the dump below. If I understand correctly, the last two numbers
are both supposed to fall as the iterations progress, but as you can see they just wander around. Maybe this is another sign of overfitting. I'm guessing that the first 3 numbers are distances to nearest neighbours on different sample sizes. But is 55-53 a low or high number? If it is a low number then this is yet another indication of overfitting.

0 55.06 53.97 53.16 4.0025 0.2856
1 58.91 57.44 56.40 4.7697 0.7055
2 57.16 54.62 52.88 1.7402 0.4073
3 58.61 55.97 54.14 2.1908 0.3683
4 57.77 54.55 52.74 2.6172 0.3062
5 53.55 51.07 49.17 3.7945 0.1111
6 56.28 52.57 50.30 5.4140 0.1525
7 57.81 53.84 51.49 5.6486 0.1883
8 56.94 53.39 51.10 6.3922 0.0688
9 59.04 55.49 53.08 3.5038 0.3072
10 55.08 51.79 49.73 4.6309 0.0904
11 56.03 52.80 50.83 3.6019 0.1094
12 55.67 52.30 50.22 5.2213 0.3286
13 55.65 52.55 50.65 4.2390 0.2232

Use fuel indexing [c,w,h] instead of assuming [w,h,c]

this will allow usage of fuel built in data-sets

#10 (comment)

requirements.txt for installing deps?

Hi, I'm new to Python so I'm just poking around, and the initial setup could use some help in the form of a requirements.txt file. Would love a pip freeze > requirements.txt if you've got a virtual env with the right deps.

Awesome project.

Training input dimensions

I'm trying to create my own dataset. What's the order of the dimensions of the input tensor? Is it (batch_size, channels, height, width), (batch_size, height, width, channels), (batch_size, height * width * channels, 1), or something different? I'm having a lot of trouble creating the hdf5 file. Anyone else have any luck?
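
Theano's convolution ops work on tensors laid out as (batch_size, channels, height, width). If the images on disk are stored channels-last, which is an assumption about your data and not something this thread confirms, the conversion can be sketched as below. The function name and the scaling to [-1, 1] are illustrative choices.

```python
import numpy as np

def to_channels_first(images):
    """Reorder (batch, height, width, channels) to (batch, channels, height, width)."""
    return np.transpose(images, (0, 3, 1, 2))

batch = np.zeros((8, 64, 64, 3), dtype=np.uint8)   # stand-in for 8 RGB images
nchw = to_channels_first(batch)
scaled = nchw.astype(np.float32) / 127.5 - 1.0     # map 0..255 to -1..1 (assumed range)
print(nchw.shape)  # (8, 3, 64, 64)
```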

What does "GpuDnnConvGradI" do in deconv?

Hello, thank you very much for making public this great project.
I was going through your code and ran into a point that was not quite clear to me.
In the "deconv" function in lib/ops.py, lines 92 and 95, you use the op that calculates the gradient with respect to the input. I checked the Torch counterpart, and there it was implemented using a regular convolution layer.

Why is this GpuDnnConvGradI used?
Thank you again for this great source code.

-Taeksoo

Album cover art dataset

Would be a great help if you could link the album cover art dataset you used to train the network on. Thanks very much

ImageNet pretrained model layer dimensions

I'm studying this code and would like to know if my understanding is correct.

Generator:

layer	gifn	gain_ifn	bias_ifn
1	(100, 128 * 8 * 4 * 4)	128 * 8 * 4 * 4	128 * 8 * 4 * 4
2	(1024, 512, 5, 5)	512	512
3	(512, 256, 5, 5)	256	256
4	(256, 128, 5, 5)	128	128
5	(128, 64, 5, 5)	64	64
6	(64, 32, 5, 5)	32	32
output	(32, 3, 5, 5)	----	----

Discriminator:

layer	gifn	gain_ifn	bias_ifn
1	(32, 3, 5, 5)	----	----
2	(64, 32, 5, 5)	64	64
3	(128, 64, 5, 5)	128	128
4	(256, 128, 5, 5)	256	256
5	(512, 256, 5, 5)	512	512
6	(1024, 512, 5, 5)	1024	1024
output	(128 * 8 * 4 * 4, 1)	----	----

Are these dimensions correct for the ImageNet model?

There is no source code.

Bug: There is no source code for the model.

Expected Result: For source code to be released, since the project is titled dcgan_code :)

How to reproduce:

michael@halifax ~> git clone https://github.com/Newmu/dcgan_code
Cloning into 'dcgan_code'...
remote: Counting objects: 46, done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 46 (delta 5), reused 45 (delta 4), pack-reused 0
Unpacking objects: 100% (46/46), done.
Checking connectivity... done.
michael@halifax ~> cd dcgan_code/
michael@halifax ~/dcgan_code> find . -iname '*.lua' | wc -l
       0

Is there a schedule to release any code? Many thanks.

How to train it on my own datasets?

It's amazing work.
Could you tell me how to train it on my own datasets?

dead zone near origin of latent space

In looking at manifolds in my own experiments, I have noticed a consistent "dead zone" near the origin of the latent space. Here is an example generated with faces/train_uncond_dcgan.py and z=100:

I can post the math later, but suffice to say that the area near the center of the image is proportionally near zero in all z dimensions.

My strong suspicion is that this could be replicated by replacing this line in train_uncond_dcgan.py:

sample_zmb = floatX(np_rng.uniform(-1., 1., size=(nvis, nz)))

with

sample_zmb = floatX(np_rng.uniform(-0.1, 0.1, size=(nvis, nz)))

and seeing if this results in poor quality samples. I can follow up and try this if that is useful. I haven't done so yet because I first need to implement the load operation to use one of the models that is saved each epoch.

This isn't causing me any consternation, but I thought I would mention it since it's an unexpected curiosity and so might be a bug or might just be something I don't understand about the nature of this latent space.
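
One possible explanation, offered as a sketch and not confirmed anywhere in this thread: with nz=100 and uniform(-1, 1) sampling, training vectors almost never land near the origin. Each coordinate has mean square 1/3, so the norm of z concentrates around sqrt(nz/3), and the generator may simply never have seen inputs with a small norm. The sample count and seed below are arbitrary.

```python
import numpy as np

rng = np.random.RandomState(0)
nz = 100
z = rng.uniform(-1., 1., size=(10000, nz))   # same distribution as the training z
norms = np.linalg.norm(z, axis=1)
expected = np.sqrt(nz / 3.0)                 # sqrt of E[|z|^2] for uniform(-1, 1)

print(norms.min(), norms.mean(), norms.max())
print(expected)
```

If this is the cause, samples drawn from uniform(-0.1, 0.1) would have norms about ten times smaller than anything seen during training.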

Saved model for faces

Would it be possible to upload the parameters for the model trained on faces?
In the same manner that the parameters for imagenet and svhn have been uploaded?

Thanks!

ImportError: cannot import name gpu_alloc_empty

i'm getting an import error on the gpu_alloc_empty in the following:

from theano.sandbox.cuda.basic_ops import (as_cuda_ndarray_variable,
                                           host_from_gpu,
                                           gpu_contiguous, HostFromGpu,
                                           gpu_alloc_empty)

the other modules by themselves work fine, just gpu_alloc_empty fails. i have cudnn installed and just reinstalled Theano with pip.

How to use it with cpu

I don't have a GPU; how can I use it with a CPU? I would also like to know which part of your code covers the "discriminator as a pre-trained net for CIFAR-10 classification". Thanks.

CLASSIFYING CIFAR-10 USING GANS AS A FEATURE EXTRACTOR

@Newmu can you help me with sample code for this part? I am asking about the features of size 28672: how can I get this size, and how do I extract these features for every image?
Thanks in advance.

ValueError: total size of new array must be unchanged - Using Mnist Dataset

The code in load.py raises the error "total size of new array must be unchanged". It just loads the MNIST dataset into an array and then reshapes it. The error occurs during the reshape (the third line), as shown below:

     fd = open('C:\\Users\\***\\Desktop\\MNIST Dataset\\train-images.idx3-ubyte.gz')
     loaded = np.fromfile(file=fd,dtype=np.uint8)
---> trX = loaded[16:].reshape((60000,28*28)).astype(float)

     ValueError: total size of new array must be unchanged

I know what reshape does; I just haven't figured out how to resolve this error. I have tried many things, but none of them worked. Can anyone suggest a solution?
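
A likely cause, judging only from the file name in the snippet: the path ends in .gz, so np.fromfile reads the compressed bytes and the array length no longer equals 16 + 60000*28*28. The sketch below decompresses first and then reshapes. load_idx_images is a hypothetical helper, and the file it reads is a tiny synthetic stand-in with 5 blank images, not real MNIST.

```python
import gzip
import os
import tempfile

import numpy as np

def load_idx_images(path, n_images, rows=28, cols=28):
    """Read an IDX image file, decompressing it first if the name ends in .gz."""
    opener = gzip.open if path.endswith('.gz') else open
    with opener(path, 'rb') as fd:
        raw = np.frombuffer(fd.read(), dtype=np.uint8)
    # Skip the 16-byte IDX header, then expect exactly n_images * rows * cols bytes.
    return raw[16:].reshape((n_images, rows * cols)).astype(float)

# Tiny synthetic stand-in for the real file (5 blank images).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'fake-images.idx3-ubyte.gz')
with gzip.open(path, 'wb') as f:
    f.write(bytes(bytearray(16)) + bytes(bytearray(5 * 28 * 28)))

images = load_idx_images(path, n_images=5)
print(images.shape)  # (5, 784)
```

For the real training file the call would use n_images=60000. Opening the file in binary mode ('rb') also matters on Windows, where text mode can alter the bytes.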

Subset instances cannot be defined by a slice

Hello guys!
I got this error when I ran the faces script train_uncond_dcgan.py,
and this is its traceback:
"
Traceback (most recent call last):
  File "train_uncond_dcgan.py", line 52, in <module>
    tr_data, te_data, tr_stream, val_stream, te_stream = faces(ntrain=ntrain)
  File "/home/jerry/dcgan_code/faces/load.py", line 14, in faces
    te_data = H5PYDataset(path, which_sets=('test',))
  File "/home/jerry/anaconda2/lib/python2.7/site-packages/fuel-0.2.0-py2.7-linux-x86_64.egg/fuel/datasets/hdf5.py", line 197, in __init__
    self.num_examples)
  File "/home/jerry/anaconda2/lib/python2.7/site-packages/fuel-0.2.0-py2.7-linux-x86_64.egg/fuel/datasets/hdf5.py", line 504, in num_examples
    return self.subsets[0].num_examples
  File "/home/jerry/anaconda2/lib/python2.7/site-packages/fuel-0.2.0-py2.7-linux-x86_64.egg/fuel/utils/__init__.py", line 441, in lazy_property_getter
    self.load()
  File "/home/jerry/anaconda2/lib/python2.7/site-packages/fuel-0.2.0-py2.7-linux-x86_64.egg/fuel/datasets/hdf5.py", line 465, in load
    subsets = self.get_subsets(handle, self.which_sets, self.sources)
  File "/home/jerry/anaconda2/lib/python2.7/site-packages/fuel-0.2.0-py2.7-linux-x86_64.egg/fuel/datasets/hdf5.py", line 448, in get_subsets
    slice(row['start'], row['stop']), len(h5file[source]))
  File "/home/jerry/anaconda2/lib/python2.7/site-packages/fuel-0.2.0-py2.7-linux-x86_64.egg/fuel/utils/__init__.py", line 53, in __init__
    self._subset_sanity_check(list_or_slice, original_num_examples)
  File "/home/jerry/anaconda2/lib/python2.7/site-packages/fuel-0.2.0-py2.7-linux-x86_64.egg/fuel/utils/__init__.py", line 313, in _subset_sanity_check
    self._slice_subset_sanity_check(list_or_slice, num_examples)
  File "/home/jerry/anaconda2/lib/python2.7/site-packages/fuel-0.2.0-py2.7-linux-x86_64.egg/fuel/utils/__init__.py", line 338, in _slice_subset_sanity_check
    raise ValueError('Subset instances cannot be defined by a slice '
ValueError: Subset instances cannot be defined by a slice whose start value is greater than or equal to the original number of examples
"

I don't understand what kind of problem this is.
Can anybody help me solve it? PLEASE!!!

My hdf5 file is built from 128 JPEG face photos, each 64*64*3.
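
The error text says that a split (here 'test') starts at or beyond the number of rows in the file. With only 128 images, every split boundary written into the HDF5 file has to fit inside 0..128. The sketch below computes boundaries that satisfy this; make_splits and the 80/10/10 fractions are hypothetical and not part of fuel or this repository.

```python
def make_splits(num_examples, fractions=(0.8, 0.1, 0.1)):
    """Return (start, stop) row ranges for train/val/test that fit inside num_examples."""
    names = ('train', 'val', 'test')
    splits, start = {}, 0
    for i, (name, frac) in enumerate(zip(names, fractions)):
        # The last split takes whatever rows remain.
        stop = num_examples if i == len(names) - 1 else start + int(num_examples * frac)
        if start >= num_examples or stop <= start:
            raise ValueError('split %r would be empty' % name)
        splits[name] = (start, stop)
        start = stop
    return splits

print(make_splits(128))
```

Any ntrain value passed to the training script would also need to stay within the train range.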

using GAN and deconv with "valid" border mode

Hi, after playing with your code and getting very good results with it as it is, I am now looking to try other architectures and modify it. I want to try a model where there is no 0 padding and border_mode="valid" , instead of the current (2,2)

I have modified the deconv function from lib/ops.py to have the proper dimensions for the valid border mode, now called deconvV :

def deconvV(X, w, subsample=(1, 1), border_mode=(0, 0), conv_mode='conv'):
    img = gpu_contiguous(X)
    kerns = gpu_contiguous(w)
    desc = GpuDnnConvDesc(border_mode=border_mode, subsample=subsample,
                          conv_mode=conv_mode)(gpu_alloc_empty(img.shape[0], kerns.shape[1], (img.shape[2]-1)*subsample[0]+kerns.shape[2], (img.shape[3]-1)*subsample[1]+kerns.shape[3]).shape, kerns.shape)
    out = gpu_alloc_empty(img.shape[0], kerns.shape[1], (img.shape[2]-1)*subsample[0]+kerns.shape[2], (img.shape[3]-1)*subsample[1]+kerns.shape[3])
    d_img = GpuDnnConvGradI()(kerns, img, out, desc)
    return d_img

The code works with this operation, but I wonder whether it is correct. GpuDnnConvGradI and GpuDnnConvDesc do not give an error even if I pass other values for the sizes, so I can never be sure whether I have a bug there or not.
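
One way to check the sizes independently of the GPU ops is plain shape arithmetic: a transposed convolution should produce an output that the matching forward convolution maps back to the original size. The sketch below uses the standard output-size formulas; the function names are hypothetical. With pad = 0 the second formula reduces to (n-1)*stride + kernel, which matches the expression used in deconvV.

```python
def conv_out_size(n, kernel, stride, pad):
    """Output length of a strided convolution along one axis."""
    return (n + 2 * pad - kernel) // stride + 1

def deconv_out_size(n, kernel, stride, pad):
    """Smallest input length that a convolution with these settings maps to n."""
    return (n - 1) * stride + kernel - 2 * pad

for n in (4, 8, 16):
    up = deconv_out_size(n, kernel=5, stride=2, pad=0)
    # The forward convolution must bring the upsampled size back to n.
    assert conv_out_size(up, kernel=5, stride=2, pad=0) == n
    print(n, '->', up)
```

With a stride above 1, more than one input size maps to the same output size (for kernel 5, stride 2, pad 2, both 2n-1 and 2n map back to n), so a shape check alone narrows down the candidates but does not prove the rest of the implementation correct.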

I have also replaced the respective border mode in both the discriminator and generator, and taken care of all model dimensions so that it works. However, it runs for many iterations, sometimes looks as if it has learned something, but afterwards it just produces noise. Could I have made an error in my implementation of the deconv dimensions?

Another explanation could be that the resulting architecture (without zero padding) is much more prone to the generator collapse described by Tim Salimans in "Improved Techniques for Training GANs".

thanks a lot for your help

a typo?

I see this in the ops.py

def conv_cond_concat(x, y):
    """
    concatenate conditioning vector on feature map axis
    """
    return T.concatenate([x, y*T.ones((x.shape[0], y.shape[1], x.shape[2], x.shape[3]))], axis=1)

does it really mean y.shape[1], and not x.shape[1]?

"faces" dataset: how to create hdf5 from images?

Given a set of pictures, it's unclear how to create a correctly formatted dataset. I understand if the pictures used cannot be made available, but could you please share whatever script or method you used to create the hdf5 dataset used as an input?

Thanks.

How to use this for image retrieval?

I trained dcgan on my own dataset,
and now i want to use this net for image retrieval
now i have the generator network, that takes a length 100 encoding and transform it to an image
is there a simple way to reverse this process, so that i can give an image and get a length 100 encoding?
i tried to reverse the net like so

def gen_inv(X, w, g, b, w2, g2, b2, w3, g3, b3, w4, g4, b4, wx):
    x = dnn_conv(X, wx, subsample=(2, 2), border_mode=(2, 2))
    x = relu(batchnorm(x, g=g4, b=b4))
    x = dnn_conv(x, w4, subsample=(2, 2), border_mode=(2, 2))

    h3 = relu(batchnorm(x, g=g3, b=b3))
    h3 = dnn_conv(h3, w3, subsample=(2, 2), border_mode=(2, 2))

    h2 = relu(batchnorm(h3, g=g2, b=b2))
    h2 = dnn_conv(h2, w2, subsample=(2, 2), border_mode=(2, 2))

    h2 = T.flatten(h2,2)
    h = relu(batchnorm(h2, g=g, b=b))
    h = T.dot(h, w.T)
    return h

but no matter what the input is, i get the same output
so i figured that batchnorm is in training mode, and i tried removing batchnorm as well
but i still got very weird results

any advice?

cheers,
SH

No batch norm in last layer

This is more of a question than an issue.
Why don't you add batch norm to the last layer of your discriminator and generator?

[request]Figure 5 of DCGAN paper implementation

Hi,
I'm interested in the content of section 6.2, VISUALIZING THE DISCRIMINATOR FEATURES, of the DCGAN paper.
I'm not sure I understood this part, and I failed to implement it.
I referred to #13 on the following page
https://github.com/Hvass-Labs/TensorFlow-Tutorials

Please give me some tips or implementation code (any code is fine).
thanks in advance

Do you have a docker image of DCGAN?

Hi Alec,

I read from the Indico blog that you guys use docker a lot, just wondering if you have a docker copy of DCGAN handy to share? Thanks! Really cool work by the way!

Xinxin

Missing bias in the first layer of the discriminator

referring to https://github.com/Newmu/dcgan_code/blob/master/faces/train_uncond_dcgan.py#L118
I tried adding my own bias just before the lrelu, but it is hard to tell if it helps:

b = bias_ifn((ndf), 'db')
...
h = lrelu(dnn_conv(X, w, subsample=(2, 2), border_mode=(2, 2))+b.dimshuffle('x', 0, 'x', 'x'))
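
For readers less familiar with dimshuffle, the broadcast in the line above can be sketched in NumPy: a bias vector with one entry per feature map is reshaped to (1, ndf, 1, 1) and added to every spatial position of every sample. The sizes below are arbitrary stand-ins.

```python
import numpy as np

ndf = 4
h = np.zeros((2, ndf, 8, 8), dtype=np.float32)   # stand-in for the conv output
b = np.arange(ndf, dtype=np.float32)             # one bias per feature map
out = h + b.reshape(1, ndf, 1, 1)                # NumPy counterpart of dimshuffle('x', 0, 'x', 'x')
print(out[0, :, 0, 0])  # [0. 1. 2. 3.]
```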

How to generate a single sample?

Could someone tell me how to generate a single sample? For a single input, batchnorm changes it to [0, 0, 0...], so generated images are always the same no matter what the input is.
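
The behaviour described here follows directly from normalising with the statistics of the current batch: with a single row, the batch mean equals the row itself, so every feature becomes zero. The sketch below reproduces this in NumPy for a 2-D input; batchnorm_batch_stats is an illustrative stand-in, not the repository's function.

```python
import numpy as np

def batchnorm_batch_stats(x, eps=1e-8):
    """Normalise each feature with the mean and variance of the current batch."""
    u = x.mean(axis=0)
    s = ((x - u) ** 2).mean(axis=0)
    return (x - u) / np.sqrt(s + eps)

rng = np.random.RandomState(0)
single = rng.uniform(-1., 1., size=(1, 5))
batch = rng.uniform(-1., 1., size=(64, 5))

print(batchnorm_batch_stats(single))      # all zeros: the mean is the sample itself
print(batchnorm_batch_stats(batch)[:1])   # non-zero values for the first row
```

One workaround under the same assumption is to generate the sample of interest together with a batch of other random z vectors and keep only its row.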

Batch normalization and inference in the DCGAN model

I am using the DCGAN code and pretty happy with the results. However, I am curious, should one not treat the Batch Normalization operation in a special way when doing inference (after training is completed) ?

i) the original BatchNorm paper mentions that we need to freeze the mean and variance when doing inference with the model https://arxiv.org/pdf/1502.03167v3.pdf , algorithm 2

ii) the DCGAN code does not fix the batch statistics in this way, so when we generate new samples with the _gen function it seems we calculate the batch norm statistics on the fly. This still works and produces nice images, to my surprise

iii) now here is a case where it does not work: start with a black image X and optimize it with respect to the discriminator function to make it close to the "true" images. With a few iterations of gradient descent I can get an image X which is predicted as 1 (true), but it still looks pretty much black. So basically, the discriminator seems to be pretty bad in that case, even though the images I can generate are quite good. My guess would be that the batch normalization fails in that case, since the statistics of the single black image are totally different from the statistics of a proper random minibatch.

iv) has anyone implemented fixed mini-batch statistics for inference, as advocated in the original paper? This might be a useful option for the DCGAN code.

v) as a next experiment, I will try to remove batch normalization and train without it, and then see whether my black image experiment works correctly

if anyone has more insights about the use of batch normalization in the DCGAN it will be really helpful to discuss that, or to get the code for a simple modification of DCGAN in order to use fixed batch normalization operation when doing inference.
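
The fixed-statistics idea from point (i) can be sketched in NumPy as follows. Everything here is illustrative: batch_stats and batchnorm_fixed are hypothetical helpers, the reference activations are random stand-ins, and a 4-D convolutional layer would average over the batch and both spatial axes instead of axis 0 only.

```python
import numpy as np

def batch_stats(x):
    """Per-feature mean and variance over the batch axis."""
    u = x.mean(axis=0)
    return u, ((x - u) ** 2).mean(axis=0)

def batchnorm_fixed(x, u, s, g=1.0, b=0.0, eps=1e-8):
    """Normalise with stored statistics (u, s) instead of those of x itself."""
    return g * (x - u) / np.sqrt(s + eps) + b

rng = np.random.RandomState(0)
reference = rng.normal(2.0, 3.0, size=(1000, 4))   # stand-in for training activations
u, s = batch_stats(reference)

single = np.zeros((1, 4))                           # e.g. activations of a "black" input
out = batchnorm_fixed(single, u, s)
print(out)  # close to -2/3 for each feature, not forced to zero
```

Because u and s no longer depend on the input being normalised, a single unusual image is measured against typical activations instead of against itself.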

thanks a lot
Nikolay
