jcjohnson / pytorch-examples
Simple examples to introduce PyTorch
License: MIT License
And Thank You @jcjohnson for this super helpful write-up!!
I have already solved the problem of training on your own data under Windows 10. Add QQ group 857449786 (mention "pytorch examples") to study it together.
I'd recommend actually using the mean squared error for all these examples. These lines in particular are very misleading to someone not familiar with PyTorch loss function params:
# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(size_average=False)
Calling this the popular MSE loss function is confusing at best. At the very least, you should note what the size_average=False parameter is doing: it is no longer an MSE but a sum of squared errors. New and potential PyTorch users are likely using these examples to compare against TensorFlow, and it is extremely easy to get frustrated when losses you think are MSE come out far larger than the same model running in TF, or when you start hitting NaN values much sooner on large datasets.
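For anyone hitting this today, here is a minimal sketch of the difference, using the reduction argument that replaced size_average in later PyTorch releases (the shapes are just illustrative):

import torch

y_pred = torch.randn(64, 10)
y = torch.randn(64, 10)

# reduction='sum' reproduces the tutorial's size_average=False: a sum of squared errors.
sse = torch.nn.MSELoss(reduction='sum')(y_pred, y)

# reduction='mean' (the default) is the actual mean squared error.
mse = torch.nn.MSELoss(reduction='mean')(y_pred, y)

# sse equals mse times the number of elements, which is why the losses look so much larger.
print(sse.item(), mse.item(), sse.item() / y.numel())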
Other than that, I loved the examples. An excellent introduction to an amazing package.
Shouldn't w2.t() below be grad_w2 instead? Thanks.
grad_y_pred = 2.0 * (y_pred - y)
grad_w2 = h_relu.t().mm(grad_y_pred)
grad_h_relu = grad_y_pred.mm(w2.t())
Sorry for this issue. It was a mistake.
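For later readers, a quick shape check (a sketch using the tutorial's dimensions) shows why w2.t() is the right factor here and why grad_w2 would not even be shape-compatible:

import torch

N, H, D_out = 64, 100, 10
h_relu = torch.randn(N, H)
w2 = torch.randn(H, D_out)
y = torch.randn(N, D_out)

y_pred = h_relu.mm(w2)
grad_y_pred = 2.0 * (y_pred - y)          # (N, D_out)
grad_w2 = h_relu.t().mm(grad_y_pred)      # (H, D_out), same shape as w2
grad_h_relu = grad_y_pred.mm(w2.t())      # (N, H), same shape as h_relu
# Using grad_w2 in place of w2.t() would multiply (N, D_out) by (H, D_out),
# whose inner dimensions do not even match.
print(grad_w2.shape, grad_h_relu.shape)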
In Warm-up exercises using Numpy and PyTorch, corresponding to lines
y_pred = h_relu.dot(w2)
and
y_pred = h_relu.mm(w2)
respectively,
I am unable to understand why an activation function is not applied to the output neuron to produce the final output. In both examples, h_relu corresponds to the activation applied to the hidden layer. I need some help understanding why this activation is missing on the outputs.
The backward function receives the gradient of the output Tensors with respect to some scalar value.
What is this scalar value? What does it represent in the computational graph?
In the first example:
Backprop to compute gradients of w1 and w2 with respect to loss
should be
Backprop to compute gradient of loss with respect to w1 and w2
Thank you for the nice write-up!
With a recent version of PyTorch, this causes the following error: 'NoneType' object has no attribute 'data'.
Maybe we should zero the gradients after running the backward pass now.
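A minimal sketch of what I mean, assuming the w1/w2/learning_rate names from the tutorial and random data: only touch the gradients after backward() has populated them.

import torch

learning_rate = 1e-6
x, y = torch.randn(64, 1000), torch.randn(64, 10)
w1 = torch.randn(1000, 100, requires_grad=True)
w2 = torch.randn(100, 10, requires_grad=True)

for t in range(2):
    y_pred = x.mm(w1).clamp(min=0).mm(w2)
    loss = (y_pred - y).pow(2).sum()
    loss.backward()
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        # Zeroing here avoids the 'NoneType' error: before the first backward()
        # call, w1.grad and w2.grad are still None.
        w1.grad.zero_()
        w2.grad.zero_()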
I'm sorry to open an issue here, but I want to raise a problem with the section "PyTorch: Defining new autograd functions". In the ReLU class, the backward function contains the line input, = self.saved_tensors. When I remove the "," after "input" and run loss.backward(), I get this error:
File "/home/ry-feng/anaconda3/envs/python36/lib/python3.6/site-packages/torch/autograd/variable.py", line 146, in backward
self._execution_engine.run_backward((self,), (gradient,), retain_variables)
File "", line 8, in backward
TypeError: '<' not supported between instances of 'tuple' and 'int'
When I add the "," back, the program runs normally. I don't know what causes this problem; I hope I can get some help from you, thanks a lot.
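For reference, a sketch of that section using the current static-method Function API (the older self-based style in the tutorial behaves the same way): saved_tensors is a tuple, so the trailing comma unpacks its single element; without it, input is bound to the whole tuple and the comparison input < 0 raises exactly the TypeError above.

import torch

class MyReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors   # the "," unpacks the one-element tuple
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0    # fails with the tuple-vs-int error if the "," is dropped
        return grad_input

x = torch.randn(5, requires_grad=True)
y = MyReLU.apply(x)
y.sum().backward()
print(x.grad)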
self.saved_tensors in a custom backward function doesn't work, so I cannot compute the backward pass using the subclass method.
When running the introductory autograd code from the README, I'm getting:
w1.grad.data.zero_()
AttributeError: 'NoneType' object has no attribute 'data'
I'm using PyTorch Version: 0.1.12_2 and python 2.7
I cannot understand lines 41-43. Can someone explain them?
# Backprop to compute gradients of w1 and w2 with respect to loss
grad_y_pred = 2.0 * (y_pred - y)
grad_w2 = h_relu.T.dot(grad_y_pred)
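Those lines are just the chain rule for the sum-of-squares loss. A small sketch (hypothetical tiny shapes) that checks the formula numerically against a finite difference:

import numpy as np

N, H, D_out = 4, 3, 2
h_relu = np.random.randn(N, H)
w2 = np.random.randn(H, D_out)
y = np.random.randn(N, D_out)

y_pred = h_relu.dot(w2)
loss = np.square(y_pred - y).sum()

# The analytic gradient from lines 41-43 of the example.
grad_y_pred = 2.0 * (y_pred - y)
grad_w2 = h_relu.T.dot(grad_y_pred)

# Finite-difference check of one entry of grad_w2.
eps = 1e-6
w2_bumped = w2.copy()
w2_bumped[0, 0] += eps
loss_bumped = np.square(h_relu.dot(w2_bumped) - y).sum()
print(grad_w2[0, 0], (loss_bumped - loss) / eps)  # the two numbers should nearly match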
I was going through the PyTorch documentation on the webpage:
https://pytorch.org/tutorials/beginner/pytorch_with_examples.html#tensors
First of all, thanks a lot for the brilliant tutorial. @jcjohnson
I noticed some typing errors, and thought I should bring it to your notice.
# dtype = torch.device("cuda:0") # Uncomment this to run on GPU
This is a line which is repeated several times throughout the tutorial on the aforementioned web page, and hence it might be confusing for newbies like me. I think the correct line should be
# device = torch.device("cuda:0") # Uncomment this to run on GPU
Thanks,
Rajat Chhabra
Hi!
I'm trying to run the code in file tensor/two_layer_net_tensor.py which looks similar to this at the beginning
import torch
# dtype = torch.FloatTensor
dtype = torch.cuda.FloatTensor # Uncomment this to run on GPU
print('pass1')
# N is batch size; D_in is input dimension;
The code works fine when my dtype is torch.FloatTensor, but when I run it on the GPU it gets stuck at h = x.mm(w1). I verified that print(torch.cuda.is_available()) returns True, which means the GPU functionality is working, so I'm at a loss as to what the issue is. Need help.
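Not an answer to the hang, but for anyone reading this later: newer revisions of these examples create tensors with a device argument instead of switching dtype to torch.cuda.FloatTensor, which at least makes the CPU/GPU toggle explicit. A minimal sketch (the dimensions are the tutorial's):

import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in, device=device)
w1 = torch.randn(D_in, H, device=device)
h = x.mm(w1)  # the line reported to hang, run on whichever device is available
print(h.shape, h.device)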
Thanks a lot for your amazing examples!
I find that in the examples using the nn module the learning_rate is 1e-4, while without the nn module the learning_rate is 1e-6. I did not figure out the reason why the examples using nn require a larger learning_rate 🙁
In the backprop for the warm-up example, why is h_relu.T required to obtain grad_w2? Since dY_pred/dW2 = h_relu and dLoss/dY_pred = 2(y_pred - y), from the chain rule I would expect dLoss/dW2 = h_relu * 2(y_pred - y), not h_relu.T * 2(y_pred - y). Can you explain why it is h_relu.T?
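For what it's worth, writing the loss out elementwise makes the transpose fall out of the chain rule (a sketch in LaTeX; h_relu is N x H and w2 is H x D_out as in the example):

\mathrm{loss} = \sum_{n,k} \bigl(y^{\mathrm{pred}}_{nk} - y_{nk}\bigr)^2,
\qquad
y^{\mathrm{pred}}_{nk} = \sum_{j} h^{\mathrm{relu}}_{nj}\,(w_2)_{jk}

\frac{\partial\,\mathrm{loss}}{\partial (w_2)_{jk}}
  = \sum_{n} 2\bigl(y^{\mathrm{pred}}_{nk} - y_{nk}\bigr)\, h^{\mathrm{relu}}_{nj}
  = \bigl(h_{\mathrm{relu}}^{\top}\,\mathrm{grad\_y\_pred}\bigr)_{jk}

The sum over the batch dimension n is exactly what the matrix product h_relu.T.dot(grad_y_pred) computes, which is why the transpose appears.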
when it says:
"after backpropagation x.grad will be another Tensor holding the gradient of x with respect to some scalar value."
it should be
"after backpropagation x.grad will be another Tensor holding the gradient of some scalar value with respect to x."
Hey @jcjohnson, sorry for opening an issue here, didn't know how to reach You without e-mailing on Your Stanford inbox.
I haven't used torch or Lua, but I remember some of my friends talking about your implementation of char-rnn in Lua. They said it was super fast.
I'm wondering if it is possible to do something like that in PyTorch? Or was the speed thanks to Lua's JIT compiler, and will the Python interpreter simply incur too much overhead? In general, do You think PyTorch is suitable for applications with lots of small computations (char-lvl, pixel-lvl stuff)?
# dtype = torch.device("cuda:0") # Uncomment this to run on GPU
should be
# device = torch.device("cuda:0") # Uncomment this to run on GPU
In Autograd:
If x is a Tensor that has x.requires_grad=True then x.grad is another Tensor holding the gradient of x with respect to some scalar value.
should be
If x is a Tensor that has x.requires_grad=True then x.grad is another Tensor holding the gradient of some scalar value (usually the loss) with respect to x.
When you compute loss, why don't you divide by the batch size to get the mean squared error?
Why do we need to call the clamp function on the prediction for the first layer?
Hey @jcjohnson, first of all thank you for these, eternally thankful...
The issue-
4th paragraph Pytorch:Autograd -
"for example we usually don't want to backpropagate through the weight update steps when training a neural network"
This should be done when the network is being evaluated too, right? At that time we don't want extra memory to be used (to keep track of the graph) if we aren't going to update the weights.
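A hypothetical sketch of what I mean (a tiny linear model and random data, just to show the pattern):

import torch

model = torch.nn.Linear(10, 1)
x_val, y_val = torch.randn(32, 10), torch.randn(32, 1)

model.eval()
with torch.no_grad():  # autograd records no graph here, so no extra memory is kept for backprop
    y_pred = model(x_val)
    val_loss = torch.nn.functional.mse_loss(y_pred, y_val)
print(val_loss.item(), y_pred.requires_grad)  # requires_grad is False under no_grad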
I've been trying to install PyTorch, and after days of trying I'm here with a big, big problem. I read a lot of articles on how to install PyTorch. I tried to install it with pip install, but that didn't work for me, so afterwards I installed it with Anaconda. PyTorch is installed in Anaconda: when I type conda list, it shows up like this: pytorch 1.0.1 py3.7_cuda100_cudnn7_1 pytorch. I have Python 3.7, but when I run code with import torch it shows me a message like this:
And when I try to import torch in Python 3.7:
Pip install error:
How do I get past these errors? Please help, thanks.
Corresponding to the line model.zero_grad() in the "nn" section, I want to understand why we need to set the gradients to zero before the backward pass. Theoretically, isn't the backward pass supposed to use the existing gradients to correct the weights? How does setting them to zero help?
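For anyone else wondering, a quick experiment shows the reason: backward() accumulates into .grad rather than overwriting it, so without zeroing, gradients from earlier iterations would add up.

import torch

w = torch.ones(1, requires_grad=True)
(2 * w).sum().backward()
print(w.grad)     # tensor([2.])
(2 * w).sum().backward()
print(w.grad)     # tensor([4.]) -- the second gradient was added on top of the first
w.grad.zero_()    # this is what model.zero_grad() does for every parameter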