pclucas14 / pixel-cnn-pp

Pytorch Implementation of OpenAI's PixelCNN++

License: Other

Python 100.00%

pixel-cnn-pp's Issues

Request for version of PyTorch?

Could you specify which version of PyTorch this works with? I have been trying PyTorch 1.0 and keep running into errors. Thanks!

Pretrained model request after higher number of epochs, possibly end of training

Hi Lucas, thanks for posting some of the pretrained models.
Based on your code, the pretrained models appear to have been saved after 789 and 889 epochs. Is that correct?
The code appears to train for much longer than that. Could you share a model trained for more epochs, around 5000?

Also, can this pre-trained model be used for another dataset such as SVHN, or do I need to train from scratch?

Thanks for your response

Pre-trained Model Request

Hi:

Thanks for your amazing work.

Could you share your pre-trained model?

Thanks!

How to fix GPU OOM issue for a pretrained model

Hi Lucas, thanks for creating this PyTorch-based framework.
I am running into a "CUDA out of memory" error when I try to load the pre-trained model "pcnn_lr.0.00040_nr-resnet5_nr-filters160_319.pth" at line 106 of main.py.
How should I fix it?

Is the causality constraint satisfied?

Hi, thanks for the implementation.

The idea behind autoregressive modeling (PixelCNN, PixelCNN++, ...) is that pixels are generated sequentially, each depending only on the previous pixels. This is the so-called causality constraint.

I think the constraint is not satisfied here, possibly because of the up- and downsampling layers in your network.

Please correct me if I am wrong.

Here is the code:

model.train(False)  # trained on cifar dataset 
data = torch.zeros(1, obs[0], obs[1], obs[2])
data = data.cuda()

data_v = Variable(data, volatile=True)
out = model(data_v, sample=True)

#check output at spatial position [20,20]
print(out[0,10:13,20,20]) # return: tensor([-0.4630, -0.0477,  0.2698],)

data1=data
#change rgb value at a future pixel in the input
data1[0,:,25,24]=1000 
data_v1 = Variable(data1, volatile=True)
out1 = model(data_v1, sample=True)

#and check the output again
print(out1[0,10:13,20,20]) #return: tensor([-0.4629, -0.0476,  0.2698])

There is just a slight difference between the two output tensors, but theoretically, they must be equal.
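A stricter version of this test copies the input before perturbing it (note that `data1 = data` only creates an alias of the same tensor) and compares the two outputs for exact equality instead of by eye. The sketch below shows the idea on a toy causal filter in plain Python, so it runs without PyTorch. `causal_sum` and `output_depends_on` are hypothetical names used for illustration and are not part of this repository.

```python
import copy

def causal_sum(img):
    """Toy stand-in for an autoregressive model: each output pixel is the sum
    of all pixels that come strictly earlier in raster-scan order."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    running = 0.0
    for i in range(h):
        for j in range(w):
            out[i][j] = running      # depends only on earlier pixels
            running += img[i][j]
    return out

def output_depends_on(model, img, query, changed):
    """Return True if changing the pixel `changed` alters the output at `query`."""
    base = model(img)
    perturbed = copy.deepcopy(img)   # real copy, so `img` is left untouched
    ci, cj = changed
    perturbed[ci][cj] += 1000.0
    qi, qj = query
    return model(perturbed)[qi][qj] != base[qi][qj]

img = [[0.0] * 32 for _ in range(32)]
# A later pixel must not influence an earlier output position.
print(output_depends_on(causal_sum, img, query=(20, 20), changed=(25, 24)))  # False
# An earlier pixel is allowed to influence a later output position.
print(output_depends_on(causal_sum, img, query=(25, 25), changed=(25, 24)))  # True
```

With a real network the same comparison could be made on the output tensors; an exact match is what a strictly causal model would be expected to give when all operations are deterministic.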

Problem of image channel when using my own dataset

What is the proper format for input images from a custom dataset?

I have been trying to use a set of VGLC text files as the input dataset. I managed to convert the text into NumPy arrays. My dataset class is as follows:

class MyDataset(Dataset):
    def __init__(self, datapath, train=False, transform=None):
        self.data = []
        for file_name in os.listdir(datapath):
            with open(base_path + "/" + file_name, 'r') as f:
                res = np.array(list(map(lambda l: [ord(c) for c in l.strip()], f.readlines())))
                self.data.append(res)
        self.transform = transform

    def __getitem__(self, index):
        img = self.data[index]
        img = Image.fromarray(img, mode="RGB")
        if self.transform is not None:
            img = self.transform(img)
        return img, 0

    def __len__(self):
        return len(self.data)

Then I create the data loaders as follows:

train_loader = torch.utils.data.DataLoader(MyDataset(base_path,
                transform=m_transforms), batch_size=args.batch_size, shuffle=True, **kwargs)

test_loader = torch.utils.data.DataLoader(MyDataset(base_path,
                transform=m_transforms), batch_size=args.batch_size, shuffle=True, **kwargs)

loss_op   = lambda real, fake : discretized_mix_logistic_loss_1d(real, fake)
sample_op = lambda x : sample_from_discretized_mix_logistic_1d(x, args.nr_logistic_mix)

However, the model raises an error about the expected number of input channels:

Traceback (most recent call last):
  File "maintxt1.py", line 200, in <module>
    output = model(input)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tangyeping/pixel-cnn-pp/model.py", line 120, in forward
    u_list = [self.u_init(x)]
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tangyeping/pixel-cnn-pp/layers.py", line 53, in forward
    x = self.conv(x)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 353, in forward
    return self._conv_forward(input, self.weight)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 350, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [80, 4, 2, 3], expected input[4, 2, 15, 32] to have 4 channels, but got 2 channels instead

I thought this might be caused by using the wrong mode in Image.fromarray(img, mode="L"), but when I choose "RGB" instead, I get a different error:
RuntimeError: The size of tensor a (16) must match the size of tensor b (15) at non-singleton dimension 3
How can I make the input images fit the input this project expects?
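Two assumptions may explain both errors; neither is confirmed by the author. First, the weight shape [80, 4, 2, 3] suggests the first convolution expects the image channels plus one extra channel, so a model built for 3-channel input receives only 1 + 1 = 2 channels from a single-channel image. Second, if the network halves the resolution twice and then doubles it twice, a side length that is not a multiple of 4 does not round-trip: 15 becomes 8, 4, 8, 16. The hypothetical helpers below work through that arithmetic and compute how much padding a side would need under this assumption.

```python
def round_trip(size, levels=2):
    """Size after `levels` stride-2 downsamplings (rounding up) followed by
    `levels` exact doublings. Assumed behaviour, not read from model.py."""
    for _ in range(levels):
        size = (size + 1) // 2
    for _ in range(levels):
        size *= 2
    return size

def pad_to_multiple(size, multiple=4):
    """Number of extra rows or columns needed to reach a multiple of `multiple`."""
    return (-size) % multiple

for side in (15, 32):
    # Prints the original side, the side after the round trip, and the padding needed.
    print(side, round_trip(side), pad_to_multiple(side))
```

If the assumption holds, padding or resizing each level so that both sides are multiples of 4 would avoid the 16 versus 15 mismatch.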

License

Thanks for this great implementation of PixelCNN. Could you please add a license to your repo? This will allow others to reuse your code "properly".

You can use this license picker.

Thanks in advance!

Possible bug?

In utils.py:
def concat_elu(x):
    """ like concatenated ReLU (http://arxiv.org/abs/1603.05201), but then with ELU """
    # Pytorch ordering
    axis = len(x.size()) - 3
    return F.elu(torch.cat([x, -x], dim=axis))

How does PyTorch differ from TensorFlow in this regard? Why is the axis computed by subtracting 3 instead of simply being 1?
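For a 4D tensor, len(x.size()) - 3 evaluates to 1, so the two forms agree; the subtraction simply counts the channel axis from the end. PyTorch stores image batches as (batch, channels, height, width), whereas TensorFlow's default layout is (batch, height, width, channels), where the channel axis is the last one. The snippet below checks the arithmetic with plain tuples; the shapes are illustrative.

```python
def channel_axis_from_end(shape):
    """Channel axis for a channels-first layout, counted as ndim - 3."""
    return len(shape) - 3

nchw = (16, 160, 32, 32)   # PyTorch layout: batch, channels, height, width
nhwc = (16, 32, 32, 160)   # TensorFlow default layout: channels last

axis = channel_axis_from_end(nchw)
print(axis, nchw[axis])          # 1 160
print(len(nhwc) - 1, nhwc[-1])   # 3 160
```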

How to run pretrained model checkpoint?

There is no test code, so I can't run a forward pass with the pretrained checkpoint.

Train for my own dataset

Thank you for your advice last time; I can now run the training process properly. My next goal is to train on my own dataset (probably a few hundred images) instead of MNIST or CIFAR. I have read a bit about the data loader, but I am still unsure where and how to adjust the code. Could you give me some more advice?

Application of the trained model

I am currently trying to apply some trained models to generate images. So far, I only know that using the model looks something like this:

model.eval()
data = torch.randn(1, 3, 24, 24)  # dummy data
output = model(data)
prediction = torch.argmax(output)

However, I am not sure how to get an image as the result. Is it something like this part of your original code?

def sample(model):
    model.train(False)
    data = torch.zeros(sample_batch_size, obs[0], obs[1], obs[2])
    data = data.cuda()
    for i in range(obs[1]):
        for j in range(obs[2]):
            data_v = Variable(data, volatile=True)
            out = model(data_v, sample=True)
            out_sample = sample_op(out)
            data[:, :, i, j] = out_sample.data[:, :, i, j]
    return data

sample_t = sample(model)
sample_t = rescaling_inv(sample_t)
utils.save_image(sample_t,'images/{}_{}.png'.format(model_name, epoch),
        nrow=5, padding=0)

Or is there something else I need to learn?
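A note on the question above: the network's output holds the parameters of a mixture distribution rather than class scores, so torch.argmax over it does not produce an image. Generation is the pixel-by-pixel loop, where the model is evaluated once per position and only the pixel at (i, j) is copied from the sampled output. The sketch below shows that control flow with a hypothetical stand-in for the network (`toy_model`) and single-channel images so that it runs without PyTorch; it is not the repository's code.

```python
import random

def toy_model(img, rng):
    """Hypothetical stand-in for model + sample_op: proposes a value for every
    pixel, each one based on the pixel to its left (0.0 at the row start)."""
    out = []
    for row in img:
        prev = [0.0] + row[:-1]
        out.append([0.9 * p + rng.uniform(-0.1, 0.1) for p in prev])
    return out

def sample(height, width, seed=0):
    rng = random.Random(seed)
    img = [[0.0] * width for _ in range(height)]   # start from zeros
    for i in range(height):
        for j in range(width):
            proposal = toy_model(img, rng)   # one full forward pass per pixel
            img[i][j] = proposal[i][j]       # keep only the current pixel
    return img

image = sample(4, 4)
print(len(image), len(image[0]))  # 4 4
```

The cost is one forward pass per pixel, which is why sampling is much slower than evaluating the likelihood of an existing image.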

How to understand sample from softmax?

I don't understand these lines, which sample from the softmax. Could you explain them?

 temp.uniform_(1e-5, 1. - 1e-5)
 temp = logit_probs.data - torch.log(- torch.log(temp))
 _, argmax = temp.max(dim=3)
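These three lines appear to implement the Gumbel-max trick: adding independent Gumbel noise, -log(-log(u)) with u uniform on (0, 1), to the logits and taking the argmax yields a sample from softmax(logits). The plain-Python sketch below checks this empirically; the logits and the sample count are arbitrary choices for illustration.

```python
import math
import random
from collections import Counter

def gumbel_max_sample(logits, rng):
    """Draw one index with probability softmax(logits)[index]."""
    noisy = []
    for logit in logits:
        u = rng.uniform(1e-5, 1.0 - 1e-5)             # avoid log(0)
        noisy.append(logit - math.log(-math.log(u)))  # add Gumbel noise
    return max(range(len(logits)), key=noisy.__getitem__)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
logits = [1.0, 0.0, -1.0]
n = 20000
counts = Counter(gumbel_max_sample(logits, rng) for _ in range(n))
freqs = [counts[k] / n for k in range(len(logits))]
print([round(p, 3) for p in softmax(logits)])  # target probabilities
print([round(f, 3) for f in freqs])            # empirical frequencies
```

The two printed lists should be close to each other. The clamp to (1e-5, 1 - 1e-5) mirrors the original code and keeps both logarithms finite.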

Can I change the input to a normal picture instead of zeros?

Will the output be similar to the input if the input is a real image, such as one from CIFAR?

conditional generation

Hi, I am wondering if your code supports conditional generation based on label or latent code?

is it a bug?

Line 44 of layers.py is
self.conv == wn(self.conv)
Should it be self.conv = wn(self.conv)? There is a similar line at line 85.

How much does this affect performance and the pretrained models?
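In Python, `==` only compares the two objects and discards the result, so the attribute is never rebound. Whether that matters depends on what `wn` does. If `wn` is torch.nn.utils.weight_norm (an assumption about the import), it modifies the module in place and returns the same object, in which case the call still takes effect and the typo would be harmless. The toy classes below are hypothetical and only illustrate that behaviour.

```python
class Layer:
    """Hypothetical stand-in for a module such as nn.Conv2d."""
    def __init__(self):
        self.normalised = False

def wrap_in_place(layer):
    """Mimics a wrapper that mutates its argument and returns the same object."""
    layer.normalised = True
    return layer

class Block:
    def __init__(self):
        self.conv = Layer()
        # Typo under discussion: a comparison whose result is thrown away.
        # The call on the right-hand side still runs and mutates self.conv.
        self.conv == wrap_in_place(self.conv)

block = Block()
print(block.conv.normalised)  # True
```

A wrapper that returned a new object instead of mutating its argument would behave differently, and the comparison form would then silently skip the wrapping.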

Loading pretrained model

Hi Lucas!

Thanks so much for sharing this implementation! :)

I ran into a snag while loading the pre-trained model. It seems the pretrained models were saved from a DataParallel wrapper, so naively loading them throws an error. Calling model = torch.nn.DataParallel(model) before load_part_of_model fixes the issue.

It's a minor thing but just in case others run into the same issue.
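An alternative, assuming the checkpoint keys carry the `module.` prefix that DataParallel adds, is to strip that prefix before loading into an unwrapped model. The sketch uses a plain dict in place of a real state dict, and the key names are made up for illustration.

```python
def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of `state_dict` with a leading `prefix` removed from keys."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }

# Made-up keys standing in for a checkpoint saved from a DataParallel model.
checkpoint = {"module.u_init.conv.weight": 1, "module.nin_out.bias": 2}
cleaned = strip_module_prefix(checkpoint)
print(sorted(cleaned))
```

Keys without the prefix pass through unchanged, so the helper is safe to apply to checkpoints saved either way.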

Best,

Out of memory when running main.py

Hi there, thanks for your work.
I am running main.py on a single 1080Ti GPU with 11172MB of memory.
All parameters are left at their defaults.
It seems that the PixelCNN++ model consumed all the memory, and I got this error:

Traceback (most recent call last):
  File "main.py", line 130, in <module>
    output = model(input)
  File "/home/nesa320/anaconda2/envs/py3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nesa320/wsz_3160105035/pcpp-pytorch/model.py", line 139, in forward
    u, ul = self.down_layers[i](u, ul, u_list, ul_list)
  File "/home/nesa320/anaconda2/envs/py3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nesa320/wsz_3160105035/pcpp-pytorch/model.py", line 53, in forward
    ul = self.ul_stream[i](ul, a=torch.cat((u, ul_list.pop()), 1))
  File "/home/nesa320/anaconda2/envs/py3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nesa320/wsz_3160105035/pcpp-pytorch/layers.py", line 137, in forward
    x = self.conv_input(self.nonlinearity(og_x))
  File "/home/nesa320/wsz_3160105035/pcpp-pytorch/model.py", line 63, in <lambda>
    self.resnet_nonlinearity = lambda x : concat_elu(x)
  File "/home/nesa320/wsz_3160105035/pcpp-pytorch/utils.py", line 14, in concat_elu
    return F.elu(torch.cat([x, -x], dim=axis))
RuntimeError: CUDA error: out of memory

I used pdb to trace the program and it seems that a u = self.u_stream[i](u, a=u_list.pop()) operation takes about 500MB of memory. And the program ran out of memory after executing
u_out, ul_out = self.up_layers[i](u_list[-1], ul_list[-1]) twice, each execution taking about 6000MB of memory.
Can you help me with this? I don't know if it is normal with the default parameter setup.

Trouble understanding some code snippet

In discretized_mix_logistic_loss in utils.py

It corresponds to

But I'm confused by the coeffs[:, :, :, 0, :] * x[:, :, :, 0, :] part. Why does it condition on the input image x?
It seems to predict the logistic means of the G channel from the R channel of the real image rather than from the predicted R channel.
How is the real input image accessible at inference time?
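In the PixelCNN++ formulation, the means of the green and blue sub-pixels depend linearly on the preceding sub-pixel values, which is what the coeffs term encodes. Written out for one mixture component, with symbols chosen here for illustration:

```latex
\mu_g(r) = \mu_g + \alpha\, r, \qquad
\mu_b(r, g) = \mu_b + \beta\, r + \gamma\, g
```

During training, r and g are taken from the real image because the loss evaluates the likelihood of that image. During sampling no real image is needed: r is sampled first, then used in the mean for g, and both are used in the mean for b.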

PyTorch version

Great job on the implementation! I'm curious which versions of Python and PyTorch you used. One of the files uses

print e

So I'm guessing you're using Python 2.7, but I'm not sure which PyTorch version you used.

What is the training process?

Hello, I have a question about training and testing PixelCNN. During training, a batch of images is fed in and the network estimates the probability density of the pixels. Then what? How does it generate new images from these probability densities? I haven't been able to understand this part of the process, and I couldn't find how the model is trained in the paper. If you can, please let me know.
Thanks very much.

How do I use this code?

Could you explain how to use this code, including how datasets are read, how to set up the environment, and what the hardware requirements are? Thank you for your help.

Training time?

Hey there,

Been trying out your code and just wondering what GPU card you're using and how long it takes? I'm using a GTX 1080 and it's taking around 1 hour for 1 epoch. Does that sound reasonable to you?

We had to reduce the batch size to 64 otherwise it ran out of memory.

Thanks!

FileNotFoundError bug?

After training has been running for a while, the process stops and shows the following:

Traceback (most recent call last):
  File "main.py", line 170, in <module>
    torch.save(model.state_dict(), 'models/{}_{}.pth'.format(model_name, epoch))
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 369, in save
    with _open_file_like(f, 'wb') as opened_file:
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 234, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 215, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'models/pcnn_lr:0.00020_nr-resnet5_nr-filters160_9.pth'

I tried both the CIFAR and MNIST datasets and the problem keeps appearing. I am running the code on a remote server with 2 GPUs.
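The traceback suggests that the models directory does not exist when the first checkpoint is written; this is an assumption based on the error message, not a confirmed diagnosis. Creating the directory before saving would avoid it. The sketch below uses a temporary directory and writes a placeholder file instead of calling torch.save, and `checkpoint_path` is a hypothetical helper.

```python
import os
import tempfile

def checkpoint_path(root, model_name, epoch):
    """Build the checkpoint path and make sure its directory exists."""
    directory = os.path.join(root, "models")
    os.makedirs(directory, exist_ok=True)   # no error if it already exists
    return os.path.join(directory, "{}_{}.pth".format(model_name, epoch))

with tempfile.TemporaryDirectory() as root:
    path = checkpoint_path(root, "pcnn_lr0.00020_nr-resnet5_nr-filters160", 9)
    with open(path, "wb") as f:             # placeholder for torch.save(...)
        f.write(b"checkpoint")
    print(os.path.exists(path))  # True
```

The example name omits the colon that appears in the original file name, since colons are not valid in file names on every file system.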