program will crash because of line 1473 in nvmatrix/src/nvmatrix.cu (cuda-convnet2)

preddy5 commented on June 18, 2024

program will crash because of line 1473 in nvmatrix/src/nvmatrix.cu

from cuda-convnet2.

Comments (5)

GoogleCodeExporter commented on June 18, 2024

This confuses me. cudaTextureObject_t is an unsigned long long, so the 
comparison with zero should be fine. I'll need more details to reproduce this. 
I've never seen it myself.

Original comment by [email protected] on 4 Aug 2014 at 6:40

GoogleCodeExporter commented on June 18, 2024

I can't upload a screenshot, so I'll list the related code as follows:



1458 cudaTextureObject_t NVMatrix::getTextureObject() {
1459    if (_texObj == 0) {
1460        assert(isContiguous());
1461        //size_t memFree, memTotal;
1462
1463        struct cudaResourceDesc resDesc;
1464        memset(&resDesc, 0, sizeof(resDesc));
1465        resDesc.resType = cudaResourceTypeLinear;
1466        resDesc.res.linear.devPtr = getDevData();
1467        resDesc.res.linear.sizeInBytes = getNumDataBytes();
1468        resDesc.res.linear.desc = cudaCreateChannelDesc(32, 0, 0, 0, cudaChannelFormatKindFloat);
1469        struct cudaTextureDesc texDesc;
1470        memset(&texDesc, 0, sizeof(texDesc));
1471        checkCudaErrors(cudaCreateTextureObject(&_texObj, &resDesc, &texDesc, NULL));
1472    }
1473    assert(_texObj != 0);
1474    return _texObj;
1475 }



Zero is a valid value for the _texObj returned on line 1471, but when that 
happens the assertion on line 1473 fails.

Original comment by [email protected] on 5 Aug 2014 at 1:41

GoogleCodeExporter commented on June 18, 2024

Oh, so you're saying that 0 is a valid value for _texObj that might be set by 
cudaCreateTextureObject. I didn't realize this. I'll have to work around that 
somehow then. Thanks.

Original comment by [email protected] on 11 Aug 2014 at 6:30
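
One possible workaround for the issue described above, sketched here as an illustration and not as the fix that was actually adopted in cuda-convnet2: track validity in a separate boolean instead of treating the handle value 0 as "not created". The class `LinearTexture`, the member `_texObjValid`, and the helper `textureHandleIsCached` are hypothetical names introduced for this sketch; only the CUDA runtime calls are real.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical stand-in for the texture-related part of NVMatrix.
class LinearTexture {
public:
    LinearTexture(float* devData, size_t numBytes)
        : _devData(devData), _numBytes(numBytes), _texObj(0), _texObjValid(false) {}

    ~LinearTexture() {
        if (_texObjValid) cudaDestroyTextureObject(_texObj);
    }

    cudaTextureObject_t getTextureObject() {
        // Validity lives in a flag, so a handle whose value is 0 is accepted.
        if (!_texObjValid) {
            cudaResourceDesc resDesc;
            memset(&resDesc, 0, sizeof(resDesc));
            resDesc.resType = cudaResourceTypeLinear;
            resDesc.res.linear.devPtr = _devData;
            resDesc.res.linear.sizeInBytes = _numBytes;
            resDesc.res.linear.desc =
                cudaCreateChannelDesc(32, 0, 0, 0, cudaChannelFormatKindFloat);
            cudaTextureDesc texDesc;
            memset(&texDesc, 0, sizeof(texDesc));
            // Success is judged by the returned error code, not the handle value.
            cudaError_t err = cudaCreateTextureObject(&_texObj, &resDesc, &texDesc, NULL);
            if (err != cudaSuccess) {
                fprintf(stderr, "cudaCreateTextureObject: %s\n", cudaGetErrorString(err));
                abort();
            }
            _texObjValid = true;
        }
        return _texObj;
    }

private:
    LinearTexture(const LinearTexture&);             // non-copyable (declared, not defined)
    LinearTexture& operator=(const LinearTexture&);  // non-copyable (declared, not defined)

    float* _devData;
    size_t _numBytes;
    cudaTextureObject_t _texObj;
    bool _texObjValid;
};

// Hypothetical helper: creates a texture over a small buffer and checks that
// a second call returns the cached handle instead of creating a new one.
bool textureHandleIsCached() {
    float* devData = NULL;
    const size_t numBytes = 256 * sizeof(float);
    if (cudaMalloc(&devData, numBytes) != cudaSuccess) return false;
    bool same;
    {
        LinearTexture tex(devData, numBytes);
        cudaTextureObject_t first = tex.getTextureObject();
        cudaTextureObject_t second = tex.getTextureObject();
        same = (first == second);
    }  // the texture object is destroyed here, before the buffer is freed
    cudaFree(devData);
    return same;
}
```

With this arrangement the final `assert(_texObj != 0)` is no longer needed, because a failed creation is caught through the error code.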

GoogleCodeExporter commented on June 18, 2024

I think the reason you are using cudaTextureObject_t is that you want to 
utilize the read-only cache on GK110. Another way to use the read-only cache 
is a const __restrict__ pointer, such as const float* __restrict__ 
images. That would solve this bug, remove the size limitation on texture 
memory, and make the code look cleaner.

Hope this information helps.

BTW, I have worked at Baidu for six months. My boss is Ren Wu, and 
he says you are his friend :).

On Tuesday, 2014/8/12 at 2:30, [email protected] wrote:

Original comment by [email protected] on 12 Aug 2014 at 12:48
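
For readers unfamiliar with the suggestion above, here is a minimal sketch of a kernel that takes its input through a `const float* __restrict__` pointer. The kernel `scaleKernel` and the host helper `scaleOnDevice` are hypothetical and are not taken from cuda-convnet2. The qualifiers only make it more likely that the compiler routes the loads through the read-only data cache on devices that have one; they do not guarantee it.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: writes scale * images[i] into out[i].
// const + __restrict__ tells the compiler that `images` is read-only and
// not aliased by `out`, which allows (but does not force) read-only cache loads.
__global__ void scaleKernel(const float* __restrict__ images,
                            float* __restrict__ out,
                            float scale, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = scale * images[i];
    }
}

// Hypothetical host wrapper; most error checking is omitted for brevity.
std::vector<float> scaleOnDevice(const std::vector<float>& in, float scale) {
    const int n = static_cast<int>(in.size());
    std::vector<float> result(n);
    if (n == 0) return result;

    const size_t bytes = n * sizeof(float);
    float* dIn = NULL;
    float* dOut = NULL;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, in.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    scaleKernel<<<blocks, threads>>>(dIn, dOut, scale, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(result.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return result;
}
```

Unlike a texture object, this approach has no handle to create or destroy, so the zero-handle question does not arise. Whether it matches texture fetch performance depends on the access pattern and has to be measured.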

GoogleCodeExporter commented on June 18, 2024

Texture memory is (for mysterious reasons) still pretty noticeably faster than 
__restrict__ pointers in the cases where I use it, but I'll keep this in mind, 
thanks.

Original comment by [email protected] on 12 Aug 2014 at 6:27
