pytorch-beginner's Introduction

pytorch-beginner

Toy project for pytorch beginner with simplest code.

Requirements

  • python 3.7
  • pytorch 1.0.0+

pytorch-beginner's People

Contributors

dondon2475848, fhk, hyeongminmoon, jaeyung1001, l1aoxingyu, leiwang1999, liweiwei1419

pytorch-beginner's Issues

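Question about training embeddings

Hello, I have two questions about this code:

  1. When training the word vectors, why is the label also wrapped in Variable?
  2. Once training is finished, how can I save the trained word vectors of shape (vocab_size, word_dim)?

Thank you for sharing the code and the articles. I hope to hear back from you!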

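07-Language Model: missing files

Is there no example for project 7? The files under ./data/ do not exist, so it cannot be run.

Also, the project has many problems caused by its use of old PyTorch versions (it does not use PyTorch 1.0 as the README claims). I hope the author can find time to improve it : )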

rcnn

When training the rcnn, how should labels of variable length be handled?

A puzzle about the model in chapter 2.

The model in chapter 2 should be described as a multilayer perceptron rather than logistic regression, since logistic regression can only be used for two-class classification problems. How about changing the model to softmax, which works for multi-class classification problems? Thanks.
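
A minimal sketch of the distinction raised here, in plain Python rather than the repository's code (the function names are made up for illustration). A sigmoid over a single score gives a two-class probability, while a softmax over one score per class handles any number of classes. With two classes and scores (z, 0), softmax gives the same probability as sigmoid(z).

```python
import math


def sigmoid(z):
    """Binary logistic regression: probability of the positive class."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Multi-class generalisation: one probability per class, summing to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Three made-up class scores for one sample.
probs = softmax([2.0, 1.0, 0.1])

# Two-class case: softmax over (z, 0) matches sigmoid(z).
two_class = softmax([1.5, 0.0])
```

In PyTorch, nn.CrossEntropyLoss applies a log-softmax to the raw scores internally, so a linear layer trained with that loss already behaves as a softmax classifier.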

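A small floating-point calculation bug

The source code of both Logistic_Regression.py and neural_network.py contains the same small floating-point calculation bug:
eval_loss = 0
eval_acc = 0
This needs to be changed, and there are two ways to do it:
1) Option one:
eval_loss = 0.0
eval_acc = 0.0
2) Option two:
Add "from __future__ import division" as the first line of the source file.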
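
A small sketch of why the initial value matters, assuming the script runs under Python 2, where / between two ints truncates. The counts are made up, and // is used so that the truncation can be reproduced on Python 3 as well.

```python
# Hypothetical counts, for illustration only.
num_correct = 9742   # correctly classified test samples
num_samples = 10000  # size of the test set

# Python 2 semantics for int / int, written with // so Python 3 behaves the same.
truncated_acc = num_correct // num_samples

# Starting the accumulator at 0.0 makes it a float, so the division is exact.
eval_acc = 0.0
eval_acc += num_correct
true_acc = eval_acc / num_samples

print(truncated_acc, true_acc)
```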

Loss and training

Hi, I think there are a few mistakes in the simple and convolutional autoencoders:

  • The displayed loss is the loss on the last batch of the epoch instead of the loss over the whole epoch
  • The autoencoder is not tested on the test dataset
  • The autoencoder is never put in "train" or "test" mode

Is this expected?
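
A rough sketch of the first point, in plain Python with made-up numbers (the helper name is hypothetical). Each per-batch mean loss is weighted by its batch size and divided by the number of samples seen, so the reported figure covers the whole epoch rather than only the last batch.

```python
def epoch_average_loss(batch_losses, batch_sizes):
    """Sample-weighted mean of per-batch mean losses over one epoch."""
    total = 0.0
    count = 0
    for loss, size in zip(batch_losses, batch_sizes):
        total += loss * size  # undo the per-batch averaging
        count += size
    return total / count


# Made-up values: three batches, the last one smaller than the others.
batch_losses = [0.90, 0.50, 0.10]
batch_sizes = [128, 128, 44]

last_batch_loss = batch_losses[-1]
epoch_loss = epoch_average_loss(batch_losses, batch_sizes)

print(last_batch_loss, epoch_loss)
```

In a training loop the same accumulation would use loss.item() * batch_size for each batch.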

Zero accuracy in 02 logistic regression

In 02 Logistic Regression, all accuracies are zero. Need to convert torch tensor to float.

line 68: running_acc += num_correct.data[0] --> running_acc += num_correct.data[0].float()
line 98: eval_acc += num_correct.data[0] --> eval_acc += num_correct.data[0].float()

Two problems with 04-Convolutional-Neural-Network

1. The line from logger import Logger produces a warning.
2. Running the script fails:
D:\Anaconda3\python.exe G:/pytorch/pytorch-beginner-master/04-Convolutional-Neural-Network/convolution_network.py
epoch 1


Traceback (most recent call last):
File "G:/pytorch/pytorch-beginner-master/04-Convolutional-Neural-Network/convolution_network.py", line 78, in
running_loss += loss.data[0] * label.size(0)
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

Process finished with exit code 1

CNN

When I run the code for example 4, it raises an error:

AttributeError: 'Cnn' object has no attribute 'named_parameters'

autoencoder

In the README for chapter 08, how was the encoder figure produced? And what do the values from -1.5 to 1.5 in the figure represent? Thanks!

Zero accuracy in 03 Logistic Regression

Accuracies are all zeros. Need to convert torch tensor to float

line 68: running_acc += num_correct.data[0] -->running_acc += num_correct.data[0].float()

line 98: eval_acc += num_correct.data[0] --> eval_acc += num_correct.data[0].float()

Wrong code

There is something wrong in the normalization part of the dataset; Also, 'save_image' does not accept the ‘illegal’ input. Users should fix these bugs.

issues about GAN

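@L1aoXingyu Hello. When training the discriminator of a GAN, shouldn't the generator be held fixed? No step in the code freezes the generator's parameters, though. Won't the generator's parameters also receive gradient updates while the discriminator is being trained?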
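
One common way to handle this is sketched below. It is not necessarily what the repository does, and the tiny networks and tensor sizes are invented for illustration. Two things keep the generator fixed during the discriminator step: the discriminator's optimizer only holds the discriminator's parameters, and calling .detach() on the fake batch stops gradients from flowing back into the generator at all.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-ins for the real networks (sizes are arbitrary assumptions).
G = nn.Linear(4, 8)
D = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid())
criterion = nn.BCELoss()

# The optimizer is given only D's parameters, so step() cannot change G.
d_optimizer = torch.optim.SGD(D.parameters(), lr=0.1)

z = torch.randn(16, 4)     # noise batch
real = torch.randn(16, 8)  # stand-in for a batch of real data

g_weight_before = G.weight.detach().clone()

# detach() cuts the graph, so backward() never reaches G's parameters.
fake = G(z).detach()
d_loss = criterion(D(real), torch.ones(16, 1)) + criterion(D(fake), torch.zeros(16, 1))

d_optimizer.zero_grad()
d_loss.backward()
d_optimizer.step()

generator_has_grads = any(p.grad is not None for p in G.parameters())
generator_unchanged = torch.equal(g_weight_before, G.weight.detach())
```

Without the detach() call the generator would still not be updated by d_optimizer.step(), but gradients would be computed for it unnecessarily.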

Logistic Regression: the printed accuracy will always be 0 because of a change in torch 0.4.0

Chapter 2, logistic regression code:
line 65 reports a warning:

UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number

This causes a problem at line 74: the value of running_acc / (batch_size * i) will always be 0, because it is a tensor divided by a number, and this operation is no longer supported in the latest torch version.

One solution is to modify line 65 to use tensor.item():
running_acc += num_correct.item()
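
A minimal sketch of the .item() pattern, assuming PyTorch 0.4 or later. The predictions and labels are made up. The count of correct predictions is a 0-dim tensor, and .item() turns it into a plain Python number before it is added to a float accumulator.

```python
import torch

# Made-up predictions and labels for a batch of four samples.
pred = torch.tensor([1, 0, 2, 2])
label = torch.tensor([1, 1, 2, 0])

# (pred == label).sum() is a 0-dim integer tensor, not a Python number.
num_correct = (pred == label).sum()

running_acc = 0.0
running_acc += num_correct.item()  # convert to a Python int first

accuracy = running_acc / label.size(0)
print(accuracy)
```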

Error for MNIST autoEncoder

Your file produces this error:
RuntimeError: output with shape [1, 28, 28] doesn't match the broadcast shape [3, 28, 28]

It is probably due to the gray-scale images that are downloaded automatically.
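
A rough sketch of the likely cause, using only torch. The normalize helper below is a hypothetical stand-in that imitates an in-place, per-channel normalization. It is not torchvision's actual function. A single-channel image accepts a one-element mean and std, while a three-element mean and std cannot be broadcast into it in place.

```python
import torch


def normalize(img, mean, std):
    """Per-channel (img - mean) / std, done in place on a copy of img."""
    mean = torch.tensor(mean)
    std = torch.tensor(std)
    return img.clone().sub_(mean[:, None, None]).div_(std[:, None, None])


# Stand-in for one gray-scale MNIST image after ToTensor(): shape [1, 28, 28].
img = torch.rand(1, 28, 28)

# One value per channel works and maps [0, 1] to [-1, 1].
out = normalize(img, (0.5,), (0.5,))

# Three values per channel fail on a one-channel image.
error_message = None
try:
    normalize(img, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
except RuntimeError as exc:
    error_message = str(exc)
```

Under that assumption, passing one-element tuples such as (0.5,) and (0.5,) to transforms.Normalize should avoid the error for MNIST.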

Why can't I set a batch size larger than 12?

If I set batch_size=32, it returns "Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)",
but when I set batch_size=10 (<12) the model works normally.
Has anyone else had the same problem?

RuntimeError: CUDNN_STATUS_INTERNAL_ERROR_04-Convolutional Neural Network

PyTorch environment (Ubuntu 16.04)

PyTorch installed via Anaconda:
cudatoolkit: 8.0-3
cudnn: 7.0.5-cuda8.0_0
pytorch: 0.3.0-py35cuda8.0cudnn7.0_0

When running example 04, I hit a cuDNN problem. The error message is as follows:

epoch 1

Traceback (most recent call last):
File "convolution_network.py", line 76, in
out = model(img)
File "/home/yjx/.conda/envs/pytorch/lib/python3.5/site-packages/torch/nn/modules/module.py", line 325, in call
result = self.forward(*input, **kwargs)
File "convolution_network.py", line 48, in forward
out = self.conv(x)
File "/home/yjx/.conda/envs/pytorch/lib/python3.5/site-packages/torch/nn/modules/module.py", line 325, in call
result = self.forward(*input, **kwargs)
File "/home/yjx/.conda/envs/pytorch/lib/python3.5/site-packages/torch/nn/modules/container.py", line 67, in forward
input = module(input)
File "/home/yjx/.conda/envs/pytorch/lib/python3.5/site-packages/torch/nn/modules/module.py", line 325, in call
result = self.forward(*input, **kwargs)
File "/home/yjx/.conda/envs/pytorch/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 277, in forward
self.padding, self.dilation, self.groups)
File "/home/yjx/.conda/envs/pytorch/lib/python3.5/site-packages/torch/nn/functional.py", line 90, in conv2d
return f(input, weight, bias)
RuntimeError: CUDNN_STATUS_INTERNAL_ERROR

I tested my CUDA and cuDNN with the following code:

# CUDA TEST
import torch
x = torch.Tensor([1.0])
xx = x.cuda()
print(xx)
# CUDNN TEST
from torch.backends import cudnn
print(cudnn.is_acceptable(xx))

The output shows that both are working normally.

Recurrent network doesn't work

The recurrent network doesn't work. The information is as follows:
"D:\Program Files (x86)\Anaconda3\python.exe" F:/py/pytorch_LSTM.py
Traceback (most recent call last):
File "F:/py/pytorch_LSTM.py", line 47, in
model = model.cuda()
File "D:\Program Files (x86)\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 147, in cuda
return self._apply(lambda t: t.cuda(device_id))
File "D:\Program Files (x86)\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 118, in _apply
module._apply(fn)
File "D:\Program Files (x86)\Anaconda3\lib\site-packages\torch\nn\modules\rnn.py", line 116, in apply
self.flatten_parameters()
File "D:\Program Files (x86)\Anaconda3\lib\site-packages\torch\nn\modules\rnn.py", line 95, in flatten_parameters
fn.rnn_desc = rnn.init_rnn_descriptor(fn, handle)
File "D:\Program Files (x86)\Anaconda3\lib\site-packages\torch\backends\cudnn\rnn.py", line 54, in init_rnn_descriptor
fn.datatype
File "D:\Program Files (x86)\Anaconda3\lib\site-packages\torch\backends\cudnn\__init__.py", line 229, in __init__
if version() >= 6000:
TypeError: '>=' not supported between instances of 'NoneType' and 'int'

Accuracy calculation error

File: 05-Recurrent Neural Network/recurrent_network.py
For example, line 86:
if i % 300 == 0:
    print('[{}/{}] Loss: {:.6f}, Acc: {:.6f}'.format(
        epoch + 1, num_epoches, running_loss / (batch_size * i),
        running_acc / (batch_size * i)))
Wrong value: running_loss / (batch_size * i) equals zero
Cause: running_loss is of type int
Fix: running_loss.double()

RuntimeError: output with shape [1, 28, 28] doesn't match the broadcast shape [3, 28, 28]

RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.6/dist-packages/torchvision/datasets/mnist.py", line 95, in getitem
img = self.transform(img)
File "/usr/local/lib/python3.6/dist-packages/torchvision/transforms/transforms.py", line 70, in call
img = t(img)
File "/usr/local/lib/python3.6/dist-packages/torchvision/transforms/transforms.py", line 175, in call
return F.normalize(tensor, self.mean, self.std, self.inplace)
File "/usr/local/lib/python3.6/dist-packages/torchvision/transforms/functional.py", line 217, in normalize
tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
RuntimeError: output with shape [1, 28, 28] doesn't match the broadcast shape [3, 28, 28]

RuntimeError: cudnn RNN backward can only be called in training mode

When I run recurrent_network.py, located in pytorch-beginner-master\05-Recurrent Neural Network,

I get this error:
Traceback (most recent call last):
File "recurrent_network.py", line 83, in <module>
loss.backward()
File "C:\Users\yuanz\Miniconda3\envs\py36\lib\site-packages\torch\tensor.py", line 102, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "C:\Users\yuanz\Miniconda3\envs\py36\lib\site-packages\torch\autograd\__init__.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cudnn RNN backward can only be called in training mode

issue with pytorch-beginner/05-Recurrent Neural Network/recurrent_network.py

  1. need to change .data[0] => .item()

  2. add model.train() at beginning of the loop

Only the training loop code needs to be modified. Below is the fixed code that worked for me :)


for epoch in range(num_epoches):
    model.train()
    print('epoch {}'.format(epoch + 1))
    print('*' * 10)
    running_loss = 0.0
    running_acc = 0.0
    for i, data in enumerate(train_loader, 1):
        img, label = data
        b, c, h, w = img.size()
        assert c == 1, 'channel must be 1'
        img = img.squeeze(1)
        # img = img.view(b*h, w)
        # img = torch.transpose(img, 1, 0)
        # img = img.contiguous().view(w, b, -1)
        if use_gpu:
            img = Variable(img).cuda()
            label = Variable(label).cuda()
        else:
            img = Variable(img)
            label = Variable(label)
            
        
        # forward pass
        out = model(img)
        loss = criterion(out, label)
        running_loss += loss.item() * label.size(0)
        _, pred = torch.max(out, 1)
        num_correct = (pred == label).sum()
        running_acc += num_correct.item()
        # backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if i % 300 == 0:
            print('[{}/{}] Loss: {:.6f}, Acc: {:.6f}'.format(
                epoch + 1, num_epoches, running_loss / (batch_size * i),
                running_acc / (batch_size * i)))
    print('Finish {} epoch, Loss: {:.6f}, Acc: {:.6f}'.format(
        epoch + 1, running_loss / (len(train_dataset)), running_acc / (len(
            train_dataset))))
    model.eval()
    eval_loss = 0.
    eval_acc = 0.
    for data in test_loader:
        img, label = data
        b, c, h, w = img.size()
        assert c == 1, 'channel must be 1'
        img = img.squeeze(1)
        # img = img.view(b*h, w)
        # img = torch.transpose(img, 1, 0)
        # img = img.contiguous().view(w, b, h)
        if use_gpu:
            img = Variable(img, volatile=True).cuda()
            label = Variable(label, volatile=True).cuda()
        else:
            img = Variable(img, volatile=True)
            label = Variable(label, volatile=True)
        out = model(img)
        loss = criterion(out, label)
        eval_loss += loss.item() * label.size(0)
        _, pred = torch.max(out, 1)
        num_correct = (pred == label).sum()
        eval_acc += num_correct.item()
    print('Test Loss: {:.6f}, Acc: {:.6f}'.format(eval_loss / (len(
        test_dataset)), eval_acc / (len(test_dataset))))
    print()


Transform error in the AutoEncoder section

img_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

There is a problem here. I changed it to the following, based on the transform used in other people's code, and it runs:

img_transform= transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])

error calculation

In the autoencoder example, the value printed, i.e.
print('epoch [{}/{}], loss:{:.4f}' .format(epoch + 1, num_epochs, loss.data[0]))
is the error associated with the final batch of the data, right?
