pytorchneuralstyletransfer's People

Contributors

albertpi-git, leongatys

pytorchneuralstyletransfer's Issues

division by zero error on L-BFGS / python3.6 no cuda

Thanks for the code, Leon. Cloning and running on Linux (Python 3.6 with CUDA) works just fine for the source images or anything else I threw at it.
Running on Python 3.6 without CUDA on a Mac, I got a division by zero in the optimizer.
Not that it makes much sense to run this without CUDA, but I thought you ought to know.
Running it with RMSprop worked, although the results were less pronounced.

ZeroDivisionError Traceback (most recent call last)
<ipython-input> in <module>()
20 return loss
21
---> 22 optimizer.step(closure)
23
24 #display result

~/anaconda3/envs/abc/lib/python3.6/site-packages/torch/optim/lbfgs.py in step(self, closure)
151
152 # update scale of initial Hessian approximation
--> 153 H_diag = ys / y.dot(y) # (y*y)
154
155 # compute the approximate (L-BFGS) inverse Hessian

ZeroDivisionError: float division by zero
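
Not from the thread's resolution, but for anyone hitting this: a minimal sketch of the RMSprop fallback the report mentions, assuming the notebook's opt_img tensor and a hypothetical compute_loss() standing in for the weighted style/content losses:

    import torch.optim as optim

    # RMSprop sidesteps the L-BFGS Hessian-scale update that divides by y.dot(y)
    optimizer = optim.RMSprop([opt_img], lr=1e-2)  # lr is illustrative; tune as needed
    for _ in range(max_iter):
        optimizer.zero_grad()
        loss = compute_loss(opt_img)  # hypothetical: sum of weighted layer losses
        loss.backward()
        optimizer.step()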

Please add any kind of license

MIT or Apache or even BSD. Or of course anything else; it's your code, after all. :P Just anything to get a handle on this, because as it stands this code is "look, don't touch", which is really a shame. :)

Confusion about the denominator in the gram matrix normalization step

In the GramMatrix class (3rd cell, 7th line), shouldn't we divide the gram matrix by the product of channels, height, and width, i.e. G.div_(c*h*w) instead of G.div_(h*w)? Or am I missing something?

Edit: I do notice better results with your gram matrix implementation, but I'm not sure how this is a correct normalization. 🤔
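
For reference, a minimal sketch of the computation in question (the class shape follows the notebook; treat it as a reconstruction, not the exact source), with the repo's h*w normalization and the proposed c*h*w variant marked:

    import torch
    import torch.nn as nn

    class GramMatrix(nn.Module):
        def forward(self, input):
            b, c, h, w = input.size()
            F = input.view(b, c, h * w)
            G = torch.bmm(F, F.transpose(1, 2))  # (b, c, c) channel correlations
            return G.div_(h * w)                 # repo's normalization
            # proposed alternative: return G.div_(c * h * w)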

Imagenet normalization

The pytorch docs (link) say to normalize images via

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

However, the notebook in this repo normalizes via

 transforms.Normalize(mean=[0.40760392, 0.45795686, 0.48501961],
                      std=[1, 1, 1])

I am trying to re-create the results from the original paper, so I am curious about this. Is this method of normalization specific to this task, did the ImageNet normalization values for pytorch change over time, or is there some other reason I may be missing?
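
For context, a hedged reconstruction of the preprocessing that usually pairs with those numbers: mean ≈ [0.4076, 0.4580, 0.4850] with std=[1,1,1] is the Caffe VGG convention, which also expects BGR channel order and a 0-255 value range. The notebook's exact pipeline may differ, but it would look roughly like:

    from torchvision import transforms

    prep = transforms.Compose([
        transforms.ToTensor(),                      # HWC uint8 -> CHW float in [0, 1]
        transforms.Lambda(lambda x: x[[2, 1, 0]]),  # RGB -> BGR (Caffe ordering; an assumption here)
        transforms.Normalize(mean=[0.40760392, 0.45795686, 0.48501961],
                             std=[1, 1, 1]),        # subtract Caffe channel means
        transforms.Lambda(lambda x: x.mul_(255)),   # scale to 0-255, as Caffe-trained VGG expects
    ])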

Runtime Error: tensor size mismatch

Hi, I have been trying to debug this the whole day, but I have yet to solve the problem, which only occurs for some images.

Below is the error message:

Traceback (most recent call last):
  File "NeuralStyleTransfer.py", line 279, in <module>
    optimizer.step(closure)
  File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/optim/lbfgs.py", line 103, in step
    orig_loss = closure()
  File "NeuralStyleTransfer.py", line 268, in closure
    layer_losses = [weights[a] * loss_fns[a](A, targets[a]) for a, A in enumerate(out)]
  File "NeuralStyleTransfer.py", line 268, in <listcomp>
    layer_losses = [weights[a] * loss_fns[a](A, targets[a]) for a, A in enumerate(out)]
  File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 443, in forward
    return F.mse_loss(input, target, reduction=self.reduction)
  File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/nn/functional.py", line 2256, in mse_loss
    expanded_input, expanded_target = torch.broadcast_tensors(input, target)
  File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/functional.py", line 62, in broadcast_tensors
    return torch._C._VariableFunctions.broadcast_tensors(tensors)
RuntimeError: The size of tensor a (159) must match the size of tensor b (160) at non-singleton dimension 3

I would really appreciate any assistance.
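
A guess at the usual culprit (not confirmed in the thread): if the optimized image and the target image pass through VGG at slightly different resolutions, pooling can produce feature maps that differ by one pixel (159 vs 160 here), which MSE cannot broadcast. Resizing every input to one fixed (height, width) avoids this; a minimal sketch:

    from PIL import Image
    from torchvision import transforms

    img_size = 512  # illustrative fixed size
    # A single-int Resize keeps aspect ratio, so different inputs can yield
    # feature maps like 159 vs 160; a (h, w) tuple forces identical shapes.
    resize = transforms.Resize((img_size, img_size))
    content_img = resize(Image.open('content.jpg'))  # file names are placeholders
    style_img = resize(Image.open('style.jpg'))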

loss = sum(layer_losses)

Hello, I am new to pytorch and this scene in general, but when I run the code I get an error at torch.sum(): "Use tensor.detach().numpy() instead."

I am not familiar with how to detach, and I couldn't find an example of detaching each tensor in the list.
Is the following equivalent code?

    loss = layer_losses[0]
    for elem in layer_losses[1:]:
        loss += elem
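
For what it's worth, a short sketch of two ways to combine the per-layer losses while keeping the autograd graph intact (assuming layer_losses is a list of scalar tensors); detach() is only needed when converting to numpy, not for summing:

    import torch

    loss = sum(layer_losses)                # built-in sum works on tensors and keeps grad history
    loss = torch.stack(layer_losses).sum()  # equivalent, more explicit

    loss.backward()                         # gradients flow through every layer loss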

Runtime error when moving vgg to FP16

This is really a pytorch issue, not an issue with Leon's original code.

I am trying to speed up neural style transfer on an Nvidia Tesla V100 by using FP16.
I modified the code to move the vgg to cuda().half(). In addition, all three images (the style image, the content image, and opt_img) are in FP16. I tried to keep the loss layers in FP32, because FP16 easily produces NaNs and infinities.
The code is at https://gist.github.com/michaelhuang74/009e149a2002b84696731fb599408c90

When I ran the code, I encountered the following error.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Traceback (most recent call last):
  File "neural-style-Gatys-half.py", line 167, in <module>
    style_targets = [GramMatrix()(A).detach().cuda() for A in vgg(style_image, style_layers)]
  File "/home/mqhuang2/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 319, in __call__
    result = self.forward(*input, **kwargs)
  File "neural-style-Gatys-half.py", line 86, in forward
    G.div_(h*w)
RuntimeError: value cannot be converted to type Half without overflow: 960000
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

It seems that although I tried to keep the GramMatrix and loss functions in FP32, pytorch somehow ended up doing the division in FP16 inside the GramMatrix forward() method.

Any idea how to resolve this error?
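
One workaround sketch (my own suggestion, not from the thread): because the VGG features arrive as FP16, the Gram matrix itself is FP16, so div_ has to represent 960000 as a Half, which overflows (FP16 tops out around 65504). Upcasting inside the module avoids that:

    import torch
    import torch.nn as nn

    class GramMatrix(nn.Module):
        def forward(self, input):
            b, c, h, w = input.size()
            F = input.view(b, c, h * w).float()  # upcast FP16 features to FP32
            G = torch.bmm(F, F.transpose(1, 2))
            return G.div_(h * w)                 # 960000 is representable in FP32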

Loss value not decreasing with LBFGS

I run the same code, with the output initialized to the content image. When running the optimization with LBFGS, the loss value does not decrease.

Iteration: 50, loss: 479620896.000000
Iteration: 100, loss: 479620896.000000
....

There are no updates to opt_img at all. Is there any reason this could be happening?
EDIT: There is an exploding gradient as well. I am wondering if any clamping is required.
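
A hedged checklist rather than a confirmed diagnosis: L-BFGS only updates leaf tensors that require gradients and that were passed to the optimizer, so a setup along these lines is worth double-checking (compute_total_loss is a hypothetical stand-in for the weighted style and content losses):

    import torch
    from torch import optim

    # The SAME tensor object must require grad and be handed to the optimizer,
    # otherwise step() silently changes nothing.
    opt_img = content_image.data.clone().requires_grad_(True)
    optimizer = optim.LBFGS([opt_img])

    def closure():
        optimizer.zero_grad()
        loss = compute_total_loss(opt_img)  # hypothetical helper
        loss.backward()
        return loss

    optimizer.step(closure)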

How the "vgg_conv.pth" is generated?

I found that "vgg_conv.pth" contains only the conv layers of the original vgg net, with the fc layers removed. May I ask how this model was generated?
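
Not an answer about the actual provenance of the file, but a sketch of how one could produce a conv-only checkpoint like it from torchvision (the repo's weights may well come from a different, e.g. Caffe-trained, VGG):

    import torch
    from torchvision import models

    vgg = models.vgg19(pretrained=True)  # older torchvision API
    conv_only = vgg.features             # the convolutional trunk; drops the fc classifier
    torch.save(conv_only.state_dict(), 'vgg_conv_example.pth')  # illustrative file name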

Why is the conv4 block repeated 2 times in the VGG module?

I am a beginner in ANNs and I am working through your paper on style transfer.

I have a question about the VGG module definition: why is the conv4 block repeated 2 times? (See the code below.)
Thanks in advance.

    self.conv4_1 = nn.Conv2d(256, 512, kernel_size=3, padding=1)
    self.conv4_2 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
    self.conv4_3 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
    self.conv4_4 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
    self.conv4_1 = nn.Conv2d(256, 512, kernel_size=3, padding=1)
    self.conv4_2 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
    self.conv4_3 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
    self.conv4_4 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
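
An observation of my own (not from the thread): if a block really is defined twice like this, the second assignment to each attribute simply replaces the first registered submodule, so no extra layers are created; the duplication is redundant. A tiny sketch:

    import torch.nn as nn

    m = nn.Module()
    m.conv4_1 = nn.Conv2d(256, 512, kernel_size=3, padding=1)
    m.conv4_1 = nn.Conv2d(256, 512, kernel_size=3, padding=1)  # replaces the first
    print(len(list(m.children())))  # prints 1 -- only one conv4_1 is registered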
