cudnnConvolutionBackwardData failed - Error in CuDNN: CUDNN_STATUS_NOT_SUPPORTED (cudnnConvolutionBackwardData)

ProGamerGov commented on June 2, 2024
cudnnConvolutionBackwardData failed - Error in CuDNN: CUDNN_STATUS_NOT_SUPPORTED (cudnnConvolutionBackwardData)

from cudnn.torch.

Comments (8)

ProGamerGov commented on June 2, 2024

I have been trying to push things as far as they can go and may have hit a limit in Torch7 and/or cuDNN; search engines turn up almost nothing for this error.

I was running the latest version of Torch on Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-1038-aws x86_64), with CUDA 9.0 and cuDNN v7.

ProGamerGov commented on June 2, 2024

I assume this error comes from a limit on the maximum value allowed? If so, could that maximum be changed?

ProGamerGov commented on June 2, 2024

The error appears to come from these areas:

In SpatialConvolution.lua, on line 201: https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua#L201

In SpatialConvolution.lua, on line 209: https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua#L209

@soumith How do I fix this limitation?

ProGamerGov commented on June 2, 2024

Related Issues:

jzbontar/mc-cnn#16

allenai/XNOR-Net#22

soumith/dcgan.torch#67

facebookarchive/fb.resnet.torch#153

ProGamerGov commented on June 2, 2024

After setting cudnn.verbose = true, it seems that this may be an out-of-memory issue after all:

https://gist.github.com/ProGamerGov/9e5b367a90cd4be9cbd1ed023dafbb81

I thought I could go a lot higher in terms of image size in Neural-Style, but I did that on an install with an earlier version of Torch and CUDA/cuDNN. Either Torch7 or CUDA/cuDNN has become less efficient, and that is probably why I can't go any higher in image size: jcjohnson/neural-style#429

ngimel commented on June 2, 2024

Try limiting your workspace size by setting cudnn.maxWorkspaceGPUMemPercent (say, to 30 or 40)
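
A minimal sketch of that suggestion in a Torch7 script, assuming cudnn.torch is installed and the installed version exposes this option. The value 30 is simply one of the numbers suggested above, not a tuned setting, and the options are assumed to need setting before the network runs its first forward/backward pass.

```lua
require 'cudnn'

-- Verbose logging, the same flag used earlier in this thread to inspect
-- what cudnn.torch is doing when it picks convolution algorithms.
cudnn.verbose = true

-- Limit the cuDNN workspace, expressed as a percentage of GPU memory
-- (assumption: see find.lua in your install for the exact definition).
cudnn.maxWorkspaceGPUMemPercent = 30
```

If the error persists at 30, trying 40 (the other value mentioned) or a smaller input is a reasonable next step; this thread does not confirm which setting resolves it.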

Kevinpsk commented on June 2, 2024

Hi guys, I was wondering if any of you has made progress on this problem. I have a similar error with cudnnConvolutionBackwardFilter. The full error message is below:

cudnnConvolutionBackwardFilter failed: 9 convDesc=[mode : CUDNN_CROSS_CORRELATION datatype : CUDNN_DATA_FLOAT] hash=-dimA93700,3,20,9 -filtA10,3,9,9 93700,10,12,1 -padA0,0 -convStrideA1,1 CUDNN_DATA_FLOAT
/usr/local/mnt/vega_scratch/scratch/bio_vad/src/torch/install/bin/luajit: ...bio_vad/src/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
In 2 module of nn.Sequential:
...h/bio_vad/src/torch/install/share/lua/5.1/cudnn/find.lua:94: Error in CuDNN: CUDNN_STATUS_NOT_SUPPORTED (cudnnConvolutionBackwardFilter)
stack traceback:
    [C]: in function 'error'
    ...h/bio_vad/src/torch/install/share/lua/5.1/cudnn/find.lua:94: in function 'checkedCall'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:264: in function 'accGradParameters'
    ...ch/bio_vad/src/torch/install/share/lua/5.1/nn/Module.lua:32: in function <...ch/bio_vad/src/torch/install/share/lua/5.1/nn/Module.lua:29>
    [C]: in function 'xpcall'
    ...bio_vad/src/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    ...io_vad/src/torch/install/share/lua/5.1/nn/Sequential.lua:87: in function <...io_vad/src/torch/install/share/lua/5.1/nn/Sequential.lua:81>
    [C]: in function 'xpcall'
    ...bio_vad/src/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    ...io_vad/src/torch/install/share/lua/5.1/nn/Sequential.lua:91: in function 'backward'
    ...ai/code/CLVTtorch/CLVT_SSF_Trainer/train_noSequencer.lua:106: in function 'opfunc'
    ...o_vad/src/torch/install/share/lua/5.1/optim/adadelta.lua:31: in function 'optimMethod'
    ...ai/code/CLVTtorch/CLVT_SSF_Trainer/train_noSequencer.lua:212: in main chunk
    [C]: in function 'dofile'
    ...ode/CLVTtorch/CLVT_SSF_Trainer/trainCLVT_noSequencer.lua:124: in main chunk
    [C]: in function 'dofile'
    .../src/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x004064f0
Is this a memory issue?

Cheers
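
As a rough aid for the memory question, the shapes in the hash of that error message can be checked with plain arithmetic. The sketch below assumes the hash fields are ordered as input (N, C, H, W), filter (out, in, kH, kW), then output, and that CUDNN_DATA_FLOAT means 4 bytes per element; the helper name conv_out is made up for this illustration. It only sizes the input and output tensors and says nothing about the workspace cuDNN requests on top of them.

```lua
-- Shapes copied from the error hash (assumed order: N, C, H, W).
local input  = {n = 93700, c = 3, h = 20, w = 9}
-- Filter: output planes, input planes, kernel height, kernel width.
local filter = {out = 10, inp = 3, kh = 9, kw = 9}
local pad, stride = 0, 1  -- from -padA0,0 and -convStrideA1,1

-- Standard convolution output-size formula (hypothetical helper).
local function conv_out(size, k, p, s)
  return math.floor((size + 2 * p - k) / s) + 1
end

local out_h = conv_out(input.h, filter.kh, pad, stride)
local out_w = conv_out(input.w, filter.kw, pad, stride)

local bytes_per_float = 4
local input_bytes  = input.n * input.c * input.h * input.w * bytes_per_float
local output_bytes = input.n * filter.out * out_h * out_w * bytes_per_float

print(string.format("output: %d x %d x %d x %d", input.n, filter.out, out_h, out_w))
print(string.format("input tensor:  %.1f MiB", input_bytes / 2^20))
print(string.format("output tensor: %.1f MiB", output_bytes / 2^20))
```

The computed output height and width (12 and 1) match the 93700,10,12,1 entry in the hash, which supports the assumed field order.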

ChangshiFan commented on June 2, 2024

@ProGamerGov Have you solved this problem?
