
pixel-cnn-pp's People

Contributors

hendrycks, pclucas14

pixel-cnn-pp's Issues

Request for version of PyTorch?

Could you specify which version of PyTorch this works with? I have been trying PyTorch 1.0 and keep running into errors. Thanks!

Pretrained model request after higher number of epochs, possibly end of training

Hi Lucas, thanks for posting some of the pretrained models.
Based on your code, it appears that the pretrained models were saved after 789 and 889 epochs. Is that correct?
In the code, training appears to run for much longer than that. Could you share a model trained for more epochs, around 5000?

Also, is it possible to use this pre-trained model for another dataset such as SVHN, or do I need to train from scratch?

Thanks for your response

Pre-trained Model Request

Hi:

Thanks for your amazing work.

Could you share access to your pre-trained model?

Thanks!

How to fix GPU OOM issue for a pretrained model

Hi Lucas, thanks for creating this PyTorch-based framework.
I am running into a "CUDA out of memory" error when I try to load the pre-trained model "pcnn_lr.0.00040_nr-resnet5_nr-filters160_319.pth" at line 106 of main.py.
How should I fix it?
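
One thing that may help, sketched here under the assumption that the checkpoint is a plain state dict: `torch.load` by default restores tensors onto the device they were saved from, so mapping them to the CPU first and moving the model to the GPU afterwards avoids holding extra copies in GPU memory. The tiny `nn.Linear` model and the temporary file below are stand-ins, not the repository's model or checkpoint.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in for the real PixelCNN++ model; any nn.Module behaves the same way here.
model = nn.Linear(4, 2)

# Save a checkpoint to a temporary file so the example is self-contained.
ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
torch.save(model.state_dict(), ckpt_path)

# map_location="cpu" keeps the checkpoint tensors in host memory instead of
# restoring them directly onto the GPU they were saved from.
state_dict = torch.load(ckpt_path, map_location="cpu")

restored = nn.Linear(4, 2)
restored.load_state_dict(state_dict)

# Move the model to the GPU only after the weights are loaded, if one exists.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
restored = restored.to(device)

# Disable autograd bookkeeping for evaluation or sampling to save activation memory.
with torch.no_grad():
    out = restored(torch.zeros(1, 4, device=device))
```

If memory is still exhausted during the forward pass rather than during loading, a smaller batch size is the usual next thing to try.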

Is the causality constraint satisfied?

Hi, thanks for the implementation.

The idea behind autoregressive modeling (PixelCNN, PixelCNN++, ...) is that pixels are generated sequentially, each depending only on the previous pixels. This is the so-called causality constraint.

I think the constraint is not satisfied here, possibly because of the up- and downsampling layers in your network.

Please correct me if I am wrong.

Here is the code:

model.train(False)  # trained on the CIFAR dataset
data = torch.zeros(1, obs[0], obs[1], obs[2])
data = data.cuda()

data_v = Variable(data, volatile=True)
out = model(data_v, sample=True)

# check the output at spatial position [20, 20]
print(out[0, 10:13, 20, 20])  # returns: tensor([-0.4630, -0.0477,  0.2698])

data1 = data
# change the RGB value at a future pixel in the input
data1[0, :, 25, 24] = 1000
data_v1 = Variable(data1, volatile=True)
out1 = model(data_v1, sample=True)

# and check the output again
print(out1[0, 10:13, 20, 20])  # returns: tensor([-0.4629, -0.0476,  0.2698])

There is just a slight difference between the two output tensors, but theoretically, they must be equal.
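
Another way to test the constraint, which does not rely on comparing printed values, is to look at gradients: if the output at pixel (i, j) has a non-zero gradient with respect to pixel (i, j) or any later pixel in raster order, information is leaking. The sketch below is only an illustration on a hypothetical `ToyCausalConv` layer and a deliberately leaky ordinary convolution; it does not run the repository's model, and `future_leak` is a name made up for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyCausalConv(nn.Module):
    """Hypothetical stand-in: output row i only sees input rows i-2 and i-1."""

    def __init__(self, channels=3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=(2, 3))

    def forward(self, x):
        # Pad (left, right, top, bottom): two rows on top shift the receptive
        # field strictly above the current row.
        x = F.pad(x, (1, 1, 2, 0))
        return self.conv(x)[:, :, :-1, :]  # drop the extra bottom row


def future_leak(model, shape, i, j):
    """Largest |d out[:, i, j] / d x| over pixel (i, j) and all later pixels."""
    x = torch.randn(1, *shape, requires_grad=True)
    out = model(x)
    out[0, :, i, j].sum().backward()
    grad = x.grad[0].abs().sum(dim=0)           # (H, W) sensitivity map
    future = torch.zeros_like(grad, dtype=torch.bool)
    future[i, j:] = True                        # current pixel and rest of row i
    future[i + 1:, :] = True                    # all rows below
    return grad[future].max().item()


torch.manual_seed(0)
causal_leak = future_leak(ToyCausalConv(), (3, 8, 8), 4, 4)
leaky_leak = future_leak(nn.Conv2d(3, 3, kernel_size=3, padding=1), (3, 8, 8), 4, 4)
print(causal_leak, leaky_leak)
```

For a causal layer the first number is exactly zero, because the future pixels never enter the computation; the ordinary padded convolution gives a positive value.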

Problem of image channel when using my own dataset

I am wondering what the proper format is for input images from a dataset.

I have been trying to use a set of VGLC text files as the input dataset. I managed to convert the text into NumPy arrays; my dataset is as follows:

class MyDataset(Dataset):
    def __init__(self, datapath, train=False, transform=None):
        self.data = []
        for file_name in os.listdir(datapath):
            with open(base_path + "/" + file_name, 'r') as f:
                res = np.array(list(map(lambda l: [ord(c) for c in l.strip()], f.readlines())))
            self.data.append(res)
        self.transform = transform

    def __getitem__(self, index):
        img = self.data[index]
        img = Image.fromarray(img, mode="RGB")
        if self.transform is not None:
            img = self.transform(img)
        return img, 0

    def __len__(self):
        return len(self.data)

Then I apply the dataloader as follows:

train_loader = torch.utils.data.DataLoader(MyDataset(base_path,
                transform=m_transforms), batch_size=args.batch_size, shuffle=True, **kwargs)

test_loader = torch.utils.data.DataLoader(MyDataset(base_path,
                transform=m_transforms), batch_size=args.batch_size, shuffle=True, **kwargs)

loss_op   = lambda real, fake : discretized_mix_logistic_loss_1d(real, fake)
sample_op = lambda x : sample_from_discretized_mix_logistic_1d(x, args.nr_logistic_mix)

However, the model raises an error about the expected number of input channels:

Traceback (most recent call last):
  File "maintxt1.py", line 200, in <module>
    output = model(input)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tangyeping/pixel-cnn-pp/model.py", line 120, in forward
    u_list = [self.u_init(x)]
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tangyeping/pixel-cnn-pp/layers.py", line 53, in forward
    x = self.conv(x)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 353, in forward
    return self._conv_forward(input, self.weight)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 350, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [80, 4, 2, 3], expected input[4, 2, 15, 32] to have 4 channels, but got 2 channels instead

I thought this might be related to using the wrong mode in Image.fromarray(img, mode="L"), but even with mode="RGB" I still get:
RuntimeError: The size of tensor a (16) must match the size of tensor b (15) at non-singleton dimension 3
So how can I make the input images match what this project expects?
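
The first error can be reproduced with a bare convolution: a weight of size [80, 4, 2, 3] belongs to a layer built for 4 input channels, while the tensor it received has only 2, so the model's configured channel count and the data's channel count disagree. The sketch below shows that mismatch, and also a hypothetical helper `pad_to_multiple` that pads odd spatial sizes. Whether the 16-versus-15 error really comes from an odd size passing through stride-2 down- and upsampling is an assumption, not something the traceback confirms.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A first layer shaped like the one in the traceback: 4 input channels, 2x3 kernel.
conv = nn.Conv2d(in_channels=4, out_channels=80, kernel_size=(2, 3))

batch = torch.zeros(4, 2, 15, 32)  # same shape as the failing input
try:
    conv(batch)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True  # 2 channels were given where 4 were expected


def pad_to_multiple(x, multiple=4):
    """Zero-pad the bottom/right so height and width are divisible by `multiple`."""
    h, w = x.shape[-2:]
    pad_h = (-h) % multiple
    pad_w = (-w) % multiple
    return F.pad(x, (0, pad_w, 0, pad_h))


padded = pad_to_multiple(batch)
print(padded.shape)  # torch.Size([4, 2, 16, 32])
```

In practice this would mean constructing the model with a channel count that matches the data and feeding it images whose height and width divide evenly by the total downsampling factor.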

License

Thanks for this great implementation of PixelCNN. Could you please add a license to your repo? This will allow others to reuse your code "properly".

You can use this license picker.

Thanks in advance!

Possible bug?

In utils.py:

def concat_elu(x):
    """ like concatenated ReLU (http://arxiv.org/abs/1603.05201), but then with ELU """
    # Pytorch ordering
    axis = len(x.size()) - 3
    return F.elu(torch.cat([x, -x], dim=axis))

How does PyTorch differ from TensorFlow in this regard? Why is this 3 instead of 1?
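
A short check may clear this up. The 3 is subtracted from the number of dimensions, so for a 4-D tensor the expression evaluates to axis 1. That is the channel axis in PyTorch's (N, C, H, W) layout, whereas TensorFlow's default (N, H, W, C) layout puts channels last, on axis 3. The function body below is copied from the snippet above; only the test tensor is made up.

```python
import torch
import torch.nn.functional as F


def concat_elu(x):
    # For an (N, C, H, W) tensor, len(x.size()) - 3 == 1, i.e. the channel axis.
    axis = len(x.size()) - 3
    return F.elu(torch.cat([x, -x], dim=axis))


x = torch.randn(2, 3, 4, 5)          # N, C, H, W
y = concat_elu(x)
print(len(x.size()) - 3, y.shape)    # 1 torch.Size([2, 6, 4, 5])
```

Counting from the end also means an unbatched (C, H, W) tensor is concatenated along axis 0, which is again the channel axis.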

Train for my own dataset

Thank you for your advice last time; after that I could run the training process properly. My next goal is to train with my own dataset (probably hundreds of images) instead of MNIST or CIFAR. I have read a bit about the data loader, but I am still unsure where and how to adjust the code. Could you give me any further advice?
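
As a rough starting point, a custom dataset only needs `__len__` and `__getitem__`. The sketch below wraps an in-memory uint8 array and rescales pixels to [-1, 1]. That range is an assumption, suggested by the `rescaling_inv` call in the sampling code rather than confirmed here, and `ArrayImageDataset` is a hypothetical name. The random array stands in for real images.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class ArrayImageDataset(Dataset):
    """Hypothetical dataset wrapping a uint8 array of shape (N, H, W, C)."""

    def __init__(self, images):
        self.images = images

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        img = torch.from_numpy(self.images[index]).float() / 255.0  # [0, 1]
        img = img.permute(2, 0, 1)                                   # HWC -> CHW
        img = (img - 0.5) * 2.0                                      # [-1, 1]
        return img, 0  # dummy label, since training is unsupervised


# Random pixels stand in for a few hundred real images loaded into memory.
images = np.random.randint(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)
loader = DataLoader(ArrayImageDataset(images), batch_size=4, shuffle=True)
batch, _ = next(iter(loader))
print(batch.shape)  # torch.Size([4, 3, 32, 32])
```

A loader built this way could then replace the MNIST or CIFAR loaders, provided the model's input shape and channel count are set to match.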

Application of the trained model

Currently I am trying to apply some trained models to generate images. For now, I only know that loading the model looks something like the following:

model.eval()
data = torch.randn(1, 3, 24, 24)  # dummy data
output = model(data)
prediction = torch.argmax(output)

However, I am wondering how I can get an image as the result. Is it something like this part of your original code?

def sample(model):
    model.train(False)
    data = torch.zeros(sample_batch_size, obs[0], obs[1], obs[2])
    data = data.cuda()
    for i in range(obs[1]):
        for j in range(obs[2]):
            data_v = Variable(data, volatile=True)
            out = model(data_v, sample=True)
            out_sample = sample_op(out)
            data[:, :, i, j] = out_sample.data[:, :, i, j]
    return data

sample_t = sample(model)
sample_t = rescaling_inv(sample_t)
utils.save_image(sample_t,'images/{}_{}.png'.format(model_name, epoch),
        nrow=5, padding=0)

Or is it something else that I need to learn?

How to understand sample from softmax?

I don't understand these lines about sampling from the softmax. Could you help me?

 temp.uniform_(1e-5, 1. - 1e-5)
 temp = logit_probs.data - torch.log(- torch.log(temp))
 _, argmax = temp.max(dim=3)
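
These lines look like the Gumbel-max trick: adding noise of the form -log(-log(u)), with u uniform on (0, 1), to the logits and taking the argmax draws an index distributed according to softmax(logits). The NumPy sketch below checks this empirically on made-up logits; it is an illustration of the trick, not the repository's code.

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.array([1.0, 0.0, -1.0])

# Target distribution: softmax of the logits.
probs = np.exp(logits) / np.exp(logits).sum()

# Gumbel-max: perturb each logit with Gumbel noise, then take the argmax.
n = 200_000
u = rng.uniform(1e-5, 1.0 - 1e-5, size=(n, logits.size))
gumbel = -np.log(-np.log(u))
samples = np.argmax(logits + gumbel, axis=1)

# Empirical frequencies of each index should be close to the softmax probabilities.
freqs = np.bincount(samples, minlength=logits.size) / n
print(probs.round(3), freqs.round(3))
```

Clamping u to [1e-5, 1 - 1e-5], as the original lines do, keeps both logarithms finite.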

conditional generation

Hi, I am wondering whether your code supports conditional generation based on a label or a latent code.

is it a bug?

Line 44 of layers.py is
self.conv == wn(self.conv)
Should it be self.conv = wn(self.conv)? There is a similar one on line 85.

I wonder how much this affects the performance and the pretrained models.
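
For what it's worth, the `==` is a comparison whose result is thrown away, so nothing is assigned. But if `wn` is `torch.nn.utils.weight_norm` (an assumption about the import in layers.py), that function re-parameterizes the module in place and returns the same object, so the call on the right-hand side still takes effect. A small sketch on a stand-alone convolution:

```python
import warnings

import torch.nn as nn
from torch.nn.utils import weight_norm as wn  # assumed to be what `wn` refers to

conv = nn.Conv2d(3, 8, kernel_size=3)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")   # newer PyTorch releases mark this API as deprecated
    result = conv == wn(conv)         # comparison only; nothing is assigned

# weight_norm re-parameterizes the module in place and returns that same module,
# so the comparison is True and the layer now carries weight_g / weight_v.
has_weight_norm = hasattr(conv, "weight_g") and hasattr(conv, "weight_v")
print(result, has_weight_norm)
```

If that assumption holds, the typo would be cosmetic and would not change the trained weights.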

Loading pretrained model

Hi Lucas!

Thanks so much for sharing this implementation! :)

Ran into a snag while loading the pre-trained model. It seems the pretrained models were saved using DataParallel, so naively loading the model throws an error. Just calling model = torch.nn.DataParallel(model) and then calling load_part_of_model fixes the issue.

It's a minor thing but just in case others run into the same issue.

Best,
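
An alternative to wrapping the model, sketched below, is to rename the checkpoint keys. `nn.DataParallel` stores the wrapped network under an attribute called `module`, so every key in its state dict carries a `module.` prefix. The helper name `strip_module_prefix` and the key names in the example are hypothetical.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of `state_dict` with a leading DataParallel prefix removed."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Key names below are hypothetical; real checkpoints hold tensors as values.
saved = OrderedDict([("module.u_init.conv.weight", 1), ("module.nin_out.bias", 2)])
print(list(strip_module_prefix(saved)))
# ['u_init.conv.weight', 'nin_out.bias']
```

The cleaned dictionary can then be loaded into an unwrapped model, which is convenient on machines with a single GPU or none.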

Out of memory when running main.py

Hi there, thanks for your work.
I am running main.py on a single 1080 Ti GPU with 11172 MB of memory.
All parameters are left at their defaults.
The PixelCNN++ model seems to consume all of the memory, and I got this error:

Traceback (most recent call last):
  File "main.py", line 130, in <module>
    output = model(input)
  File "/home/nesa320/anaconda2/envs/py3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nesa320/wsz_3160105035/pcpp-pytorch/model.py", line 139, in forward
    u, ul = self.down_layers[i](u, ul, u_list, ul_list)
  File "/home/nesa320/anaconda2/envs/py3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nesa320/wsz_3160105035/pcpp-pytorch/model.py", line 53, in forward
    ul = self.ul_stream[i](ul, a=torch.cat((u, ul_list.pop()), 1))
  File "/home/nesa320/anaconda2/envs/py3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nesa320/wsz_3160105035/pcpp-pytorch/layers.py", line 137, in forward
    x = self.conv_input(self.nonlinearity(og_x))
  File "/home/nesa320/wsz_3160105035/pcpp-pytorch/model.py", line 63, in <lambda>
    self.resnet_nonlinearity = lambda x : concat_elu(x)
  File "/home/nesa320/wsz_3160105035/pcpp-pytorch/utils.py", line 14, in concat_elu
    return F.elu(torch.cat([x, -x], dim=axis))
RuntimeError: CUDA error: out of memory

I used pdb to trace the program, and it seems that one u = self.u_stream[i](u, a=u_list.pop()) operation takes about 500 MB of memory. The program ran out of memory after executing
u_out, ul_out = self.up_layers[i](u_list[-1], ul_list[-1]) twice, with each execution taking about 6000 MB of memory.
Can you help me with this? I don't know whether this is normal with the default parameters.

Trouble understanding some code snippet

In discretized_mix_logistic_loss in utils.py:
[Image 13: not preserved]
It corresponds to
[Image 14: not preserved]
But I'm confused by the coeffs[:, :, :, 0, :] * x[:, :, :, 0, :] part. Why does it condition on the input image x?
It seems to predict the logistic means of the G channel based on the R channel of the real image rather than on the predicted R channel.
How is the real input image accessible at inference time?
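
A possible way to see it: within one pixel the sub-pixels are ordered R, G, B, and the G mean is shifted linearly by the R value (the B mean by R and G). During training the real image supplies those values in a single pass; during sampling the R value that was just drawn plays the role of x[:, :, :, 0, :]. The NumPy sketch below uses one mixture component and made-up numbers. The coefficient ordering (G|R, B|R, B|G) is an assumption based on the quoted expression, and `conditional_means` is a hypothetical helper.

```python
import numpy as np


def conditional_means(means, coeffs, x):
    """Shift the G and B means by the already-known R (and G) values.

    means:  (3,) unconditioned means for R, G, B
    coeffs: (3,) linear coefficients (G|R, B|R, B|G)
    x:      (3,) pixel values in [-1, 1]; only x[0] and x[1] are read
    """
    m_r = means[0]
    m_g = means[1] + coeffs[0] * x[0]
    m_b = means[2] + coeffs[1] * x[0] + coeffs[2] * x[1]
    return np.array([m_r, m_g, m_b])


means = np.array([0.1, 0.2, 0.3])
coeffs = np.array([0.5, -0.5, 0.25])

# Training: the ground-truth pixel is known, so all three means come from one pass.
x_real = np.array([0.4, -0.2, 0.0])
train_means = conditional_means(means, coeffs, x_real)

# Sampling: draw R first, then reuse the drawn value where the real R was used.
# Here the "draw" simply takes the mean to keep the example deterministic.
r = means[0]
g = means[1] + coeffs[0] * r
b = means[2] + coeffs[1] * r + coeffs[2] * g
sample = np.array([r, g, b])
```

So no real image is needed at inference time; the values being conditioned on are the ones generated a moment earlier.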

PyTorch version

Great job on the implementation! I'm curious to know which versions of Python, PyTorch, etc. you used. One of the files uses

print e

So I'm guessing you're using Python 2.7, but I'm not sure which PyTorch version you used.

What is the training process?

Hello, I have a question about the training and testing of PixelCNN. During training, a batch of images is fed in and the network estimates the probability density of the pixels. Then what? How does it generate new images from these probability densities? I haven't been able to understand this part of the process, and I couldn't find how the model is trained in the paper. If you can, please let me know.
Thanks very much.
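
In broad strokes, training scores every pixel of a real image in one pass and minimises the negative log-likelihood, while generation fills an empty image one pixel at a time, feeding each drawn pixel back in. The toy NumPy sketch below illustrates the two modes with a hand-written binary "model" called `predict_probs`; it is a hypothetical stand-in for a trained network and is not PixelCNN++ itself.

```python
import numpy as np


def predict_probs(image):
    """Toy 'model': P(pixel = 1) depends only on the pixel to its left.

    Stands in for a trained network; a real PixelCNN sees all earlier pixels.
    """
    left = np.zeros_like(image)
    left[:, 1:] = image[:, :-1]
    return np.where(left == 1, 0.8, 0.3)


def negative_log_likelihood(image):
    # Training signal: one pass over a real image scores every pixel at once.
    p = predict_probs(image)
    return -np.sum(image * np.log(p) + (1 - image) * np.log(1 - p))


def sample(height, width, rng):
    # Generation: fill pixels in raster order, re-running the model each time
    # so that every new pixel is conditioned on the ones already drawn.
    image = np.zeros((height, width))
    for i in range(height):
        for j in range(width):
            p = predict_probs(image)[i, j]
            image[i, j] = float(rng.random() < p)
    return image


rng = np.random.default_rng(0)
generated = sample(4, 4, rng)
nll = negative_log_likelihood(generated)
```

The nested loop has the same shape as the `sample` function quoted in an earlier issue on this page, which calls the model once per pixel position.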

How do I use this code?

How do I use this code? In particular, how are datasets read, how should the environment be configured, and what are the hardware requirements? Thank you for your help.

Training time?

Hey there,

Been trying out your code and just wondering what GPU card you're using and how long it takes? I'm using a GTX 1080 and it's taking around 1 hour for 1 epoch. Does that sound reasonable to you?

We had to reduce the batch size to 64 otherwise it ran out of memory.

Thanks!

FileNotFoundError bug?

After training has run for a while, the process stops partway through and shows the following:

Traceback (most recent call last):
  File "main.py", line 170, in <module>
    torch.save(model.state_dict(), 'models/{}_{}.pth'.format(model_name, epoch))
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 369, in save
    with _open_file_like(f, 'wb') as opened_file:
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 234, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/tangyeping/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 215, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'models/pcnn_lr:0.00020_nr-resnet5_nr-filters160_9.pth'

I tried both the CIFAR and MNIST datasets and this problem keeps appearing. I am running the code on a remote server with 2 GPUs.
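
A likely cause, assuming the `models/` directory does not exist in the working directory: opening a file for writing creates the file but not its parent directories, so the first checkpoint save fails. A minimal sketch of a guard, using a hypothetical helper `checkpoint_path` and a temporary directory so it does not touch the real project:

```python
import os
import tempfile


def checkpoint_path(directory, model_name, epoch):
    """Build the checkpoint path, creating `directory` first if it is missing."""
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, "{}_{}.pth".format(model_name, epoch))


# A temporary root keeps the example from touching the working directory.
root = tempfile.mkdtemp()
path = checkpoint_path(os.path.join(root, "models"), "pcnn_demo", 9)

with open(path, "wb") as f:   # would raise FileNotFoundError without makedirs
    f.write(b"placeholder")
```

Creating `models/` by hand before launching training should have the same effect; the `images/` directory used when saving samples may need the same treatment.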
