Comments (3)

sylyt62 commented on June 1, 2024

I downgraded my CUDA version to 11.1, but then hit other errors while running the tests:

test_backward (test_octree2col.Octree2ColTest) ... /media/yangtian/SATA3/PyEnvs/ubuntu/ocnn-py37-env/lib/python3.7/site-packages/torch/autograd/gradcheck.py:633: UserWarning: Input #0 requires gradient and is not a double precision floating point or complex. This check will likely fail if all the inputs are not of double precision floating point or complex.
f'Input #{idx} requires gradient and '
ok
test_forward (test_octree2col.Octree2ColTest) ... ok
test_forwardP1 (test_octree2col.Octree2ColTest) ... ok
test_octree2colP (test_octree2col.Octree2ColTest) ... ok
test_forward_backward1 (test_octree_align.OctreeAlignTest) ... ok
test_forward_backward2 (test_octree_align.OctreeAlignTest) ... ok
test_forward_backward3 (test_octree_align.OctreeAlignTest) ... ok
test_forward_and_backward (test_octree_conv.OctreeConvTest) ... ERROR
test_forward_and_backward (test_octree_deconv.OctreeDeconvTest) ... ERROR
test_decode_encode_key (test_octree_key.OctreeKeyTest) ... ERROR
test_search_key (test_octree_key.OctreeKeyTest) ... ERROR
test_xyz_key (test_octree_key.OctreeKeyTest) ... ERROR
test_xyz_key_64 (test_octree_key.OctreeKeyTest) ... ERROR
test_forward_and_backward_avg_pool (test_octree_pool.OctreePoolTest) ... ERROR
test_forward_and_backward_max_pool (test_octree_pool.OctreePoolTest) ... ERROR
test_forward_and_backward_max_unpool (test_octree_pool.OctreePoolTest) ... ERROR
test_octree_property (test_octree_property.OctreePropertyTest) ... ERROR
test_forward1 (test_octree_trilinear.OctreeTrilinearTest) ... ERROR
test_points_property (test_points_property.PointsPropertyTest) ... ok

======================================================================
ERROR: test_forward_and_backward (test_octree_conv.OctreeConvTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_conv.py", line 90, in test_forward_and_backward
self.forward_and_backward(kernel_size[j], stride[i])
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_conv.py", line 61, in forward_and_backward
out3.backward(pesudo_grad2)
File "/media/yangtian/SATA3/PyEnvs/ubuntu/ocnn-py37-env/lib/python3.7/site-packages/torch/_tensor.py", line 255, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/media/yangtian/SATA3/PyEnvs/ubuntu/ocnn-py37-env/lib/python3.7/site-packages/torch/autograd/__init__.py", line 149, in backward
allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered

======================================================================
ERROR: test_forward_and_backward (test_octree_deconv.OctreeDeconvTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_deconv.py", line 91, in test_forward_and_backward
self.forward_and_backward(kernel_size[j], stride[i])
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_deconv.py", line 31, in forward_and_backward
octree = octree.cuda()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

======================================================================
ERROR: test_decode_encode_key (test_octree_key.OctreeKeyTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_key.py", line 10, in test_decode_encode_key
octree = ocnn.octree_batch(samples).cuda()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

======================================================================
ERROR: test_search_key (test_octree_key.OctreeKeyTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_key.py", line 33, in test_search_key
octree = ocnn.octree_batch(samples).cuda()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

======================================================================
ERROR: test_xyz_key (test_octree_key.OctreeKeyTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_key.py", line 25, in test_xyz_key
octree = ocnn.octree_batch(samples).cuda()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

======================================================================
ERROR: test_xyz_key_64 (test_octree_key.OctreeKeyTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_key.py", line 47, in test_xyz_key_64
xyz = torch.cuda.ShortTensor([[2049, 4095, 8011, 1], [511, 4095, 8011, 0]])
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

======================================================================
ERROR: test_forward_and_backward_avg_pool (test_octree_pool.OctreePoolTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_pool.py", line 91, in test_forward_and_backward_avg_pool
octree = octree.to('cuda')
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

======================================================================
ERROR: test_forward_and_backward_max_pool (test_octree_pool.OctreePoolTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_pool.py", line 28, in test_forward_and_backward_max_pool
octree = octree.to('cuda')
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

======================================================================
ERROR: test_forward_and_backward_max_unpool (test_octree_pool.OctreePoolTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_pool.py", line 59, in test_forward_and_backward_max_unpool
octree = octree.to('cuda')
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

======================================================================
ERROR: test_octree_property (test_octree_property.OctreePropertyTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_property.py", line 60, in test_octree_property
self.octree_property(on_cuda=True)
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_property.py", line 13, in octree_property
octree = octree.cuda()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

======================================================================
ERROR: test_forward1 (test_octree_trilinear.OctreeTrilinearTest)

Traceback (most recent call last):
File "/media/yangtian/SATA3/Workspace/O-CNN-master/pytorch/test/test_octree_trilinear.py", line 11, in test_forward1
octree = ocnn.octree_batch(ocnn.octree_samples(['octree_1', 'octree_1'])).cuda()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.


Ran 19 tests in 2.602s

FAILED (errors=11)
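
The error text above suggests passing CUDA_LAUNCH_BLOCKING=1. An illegal memory access typically leaves the CUDA context in an error state, so the later failures in the same process may be follow-on effects of the first one rather than separate bugs. A minimal sketch of rerunning the suite with synchronous kernel launches follows. It assumes the tests are started through unittest discovery in the pytorch/test directory seen in the tracebacks; `blocking_env` and `run_tests` are illustrative names, not part of O-CNN.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Copy an environment and force synchronous CUDA kernel launches."""
    env = dict(os.environ if base is None else base)
    # Must be set before CUDA is initialized, hence a fresh child process.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_tests(test_dir):
    """Run unittest discovery in test_dir (assumed layout) and return the exit code."""
    cmd = [sys.executable, "-m", "unittest", "discover", "-s", test_dir, "-v"]
    return subprocess.run(cmd, env=blocking_env()).returncode
```

Calling `run_tests("pytorch/test")` from the repository root should then report the failing kernel at the call that actually launched it, which makes the first traceback more trustworthy.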

from o-cnn.

wang-ps commented on June 1, 2024

Thanks for your interest in our project.

This is a known issue that I have also run into. The error is caused by the cuBLAS version, so please try building the code with CUDA 10.1 or 10.2.
I currently have no bandwidth to fix this issue. If you are interested, a fix would be very welcome.

For your reference, the code has been tested with the following PyTorch versions:

conda install pytorch==1.6.0 torchvision==0.7.0  cudatoolkit=10.1 -c pytorch
conda install pytorch==1.7.0 torchvision==0.8.0  cudatoolkit=10.2 -c pytorch
conda install pytorch==1.7.1 torchvision==0.8.2  cudatoolkit=10.1 -c pytorch
conda install pytorch==1.9.0 torchvision==0.10.0 cudatoolkit=10.2 -c pytorch
conda install pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=10.2 -c pytorch
docker pull pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel
docker pull pytorch/pytorch:1.8.1-cuda10.2-cudnn7-devel
docker pull pytorch/pytorch:1.9.0-cuda10.2-cudnn7-devel

In my own experiments, the unit tests failed with the following PyTorch versions:

conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=11.1 -c pytorch
docker pull pytorch/pytorch:1.7.0-cuda11.0-cudnn8-devel
docker pull pytorch/pytorch:1.7.1-cuda11.0-cudnn8-devel
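
To check an existing environment against the two lists above, something like the following sketch could be used. The tables only encode the combinations named in this comment, and `classify` is a hypothetical helper, not part of O-CNN; any pair not listed is reported as unknown rather than guessed.

```python
# (PyTorch version, CUDA version) pairs reported in this thread only.
PASSING = {
    ("1.6.0", "10.1"), ("1.7.0", "10.2"), ("1.7.1", "10.1"),
    ("1.8.1", "10.2"), ("1.9.0", "10.2"), ("1.9.1", "10.2"),
}
FAILING = {("1.7.0", "11.0"), ("1.7.1", "11.0"), ("1.8.0", "11.1")}


def classify(torch_version, cuda_version):
    """Return 'passing', 'failing', or 'unknown' for a (PyTorch, CUDA) pair."""
    # torch.__version__ may carry a local suffix such as '+cu111'; drop it.
    key = (torch_version.split("+")[0], cuda_version)
    if key in PASSING:
        return "passing"
    if key in FAILING:
        return "failing"
    return "unknown"
```

In a live environment the inputs would come from `torch.__version__` and `torch.version.cuda`, for example `classify(torch.__version__, torch.version.cuda)`.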

sylyt62 commented on June 1, 2024

I appreciate the info :) I'll dig into it a bit when I have time.
