
Bus error (redner issue, closed)

bachili commented on July 22, 2024
Bus error

from redner.

Comments (4)

BachiLi commented on July 22, 2024

Can you backtrace and show the stacktrace?
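
For readers who cannot attach gdb, a lighter first step is Python's standard `faulthandler` module, which dumps the Python-level traceback when the process receives a fatal signal such as SIGBUS or SIGSEGV. This is only a sketch: the helper name `enable_crash_traces` and the log file name are made up for illustration, and the output shows Python frames only, not the C++ frames a gdb `bt` gives.

```python
import faulthandler
import os
import tempfile


def enable_crash_traces(log_path=None):
    """Write the Python traceback of all threads to a file on a fatal signal.

    The default file name is a hypothetical choice for this sketch.
    """
    if log_path is None:
        log_path = os.path.join(tempfile.gettempdir(), "redner_crash_trace.log")
    log_file = open(log_path, "w")
    # faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL.
    faulthandler.enable(file=log_file, all_threads=True)
    # The file must stay open for as long as faulthandler is enabled,
    # so return it and let the caller keep a reference.
    return log_file
```

Calling `enable_crash_traces()` at the top of the failing script, before rendering starts, should be enough to see which Python call triggered the crash; the gdb backtrace is still needed to locate the faulting C++ line.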


mguillau commented on July 22, 2024

Here's the output of backtrace:

(gdb) run test_shadow_light.py
Starting program: /home/ubuntu/miniconda3/bin/python test_shadow_light.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffa2ae4700 (LWP 1132)]
[New Thread 0x7fff894f7700 (LWP 1133)]
Scene construction, time: 0.07212 s
[New Thread 0x7fff88bf6700 (LWP 1134)]
[New Thread 0x7fff82818700 (LWP 1135)]
[New Thread 0x7fff80d02700 (LWP 1136)]

Thread 1 "python" received signal SIGBUS, Bus error.
ChannelInfo::ChannelInfo (this=0x7fffffffb890, channels=..., use_gpu=<optimized out>) at /home/ubuntu/src/redner/channels.cpp:25
25              this->channels[i] = channels[i];
(gdb) bt
#0  ChannelInfo::ChannelInfo (this=0x7fffffffb890, channels=..., use_gpu=<optimized out>) at /home/ubuntu/src/redner/channels.cpp:25
#1  0x00007fffa19f6a81 in render (scene=..., options=..., rendered_image=..., d_rendered_image=..., d_scene=..., debug_image=...)
    at /home/ubuntu/src/redner/pathtracer.cpp:390
#2  0x00007fffa1976e88 in pybind11::detail::argument_loader<Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float> >::call_impl<void, void (*&)(Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>), 0ul, 1ul, 2ul, 3ul, 4ul, 5ul, pybind11::detail::void_type>(void (*&)(Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>), std::integer_sequence<unsigned long, 0ul, 1ul, 2ul, 3ul, 4ul, 5ul>, pybind11::detail::void_type&&) (f=<optimized out>, this=0x7fffffffcd70)
    at /home/ubuntu/miniconda3/include/python3.7m/pybind11/cast.h:1874
#3  pybind11::detail::argument_loader<Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float> >::call<void, pybind11::detail::void_type, void (*&)(Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>)>(void (*&)(Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>)) && (f=<optimized out>, this=<optimized out>)
    at /home/ubuntu/miniconda3/include/python3.7m/pybind11/cast.h:1856
#4  void pybind11::cpp_function::initialize<void (*&)(Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>), void, Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>, pybind11::name, pybind11::scope, pybind11::sibling, char [1]>(void (*&)(Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>), void (*)(Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [1])::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const (call=..., 
    __closure=0x0) at /home/ubuntu/miniconda3/include/python3.7m/pybind11/pybind11.h:154
#5  void pybind11::cpp_function::initialize<void (*&)(Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>), void, Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>, pybind11::name, pybind11::scope, pybind11::sibling, char [1]>(void (*&)(Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>), void (*)(Scene const&, RenderOptions const&, ptr<float>, ptr<float>, std::shared_ptr<DScene>, ptr<float>), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [1])::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) ()
    at /home/ubuntu/miniconda3/include/python3.7m/pybind11/pybind11.h:132
#6  0x00007fffa193bfcc in pybind11::cpp_function::dispatcher (self=<optimized out>, args_in=0x7ffff6d7ed08, kwargs_in=0x0)
    at /home/ubuntu/miniconda3/include/python3.7m/pybind11/pybind11.h:627
#7  0x00005555556cd6e4 in _PyMethodDef_RawFastCallKeywords () at /tmp/build/80754af9/python_1553721932202/work/Objects/call.c:690
#8  0x00005555556cd801 in _PyCFunction_FastCallKeywords (func=0x7fffa3ece750, args=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>)
    at /tmp/build/80754af9/python_1553721932202/work/Objects/call.c:730
#9  0x00005555557292bc in call_function (kwnames=0x0, oparg=6, pp_stack=<synthetic pointer>)
    at /tmp/build/80754af9/python_1553721932202/work/Python/ceval.c:4568
#10 _PyEval_EvalFrameDefault () at /tmp/build/80754af9/python_1553721932202/work/Python/ceval.c:3093
#11 0x000055555566a4f9 in _PyEval_EvalCodeWithName () at /tmp/build/80754af9/python_1553721932202/work/Python/ceval.c:3930
#12 0x000055555566b5d5 in _PyFunction_FastCallDict () at /tmp/build/80754af9/python_1553721932202/work/Objects/call.c:376
#13 0x00007fffe8cc9ce9 in THPFunction_apply(_object*, _object*) ()
   from /home/ubuntu/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so
#14 0x0000555555690be7 in cfunction_call_varargs (kwargs=<optimized out>, args=<optimized out>, func=0x7fff8a2f41b0)
    at /tmp/build/80754af9/python_1553721932202/work/Objects/call.c:768
#15 PyCFunction_Call () at /tmp/build/80754af9/python_1553721932202/work/Objects/call.c:784
#16 0x000055555572a151 in do_call_core (kwdict=0x0, callargs=0x555557e38468, func=0x7fff8a2f41b0)
    at /tmp/build/80754af9/python_1553721932202/work/Python/ceval.c:4641
#17 _PyEval_EvalFrameDefault () at /tmp/build/80754af9/python_1553721932202/work/Python/ceval.c:3191
#18 0x000055555566a4f9 in _PyEval_EvalCodeWithName () at /tmp/build/80754af9/python_1553721932202/work/Python/ceval.c:3930
#19 0x000055555566b3c4 in PyEval_EvalCodeEx () at /tmp/build/80754af9/python_1553721932202/work/Python/ceval.c:3959
#20 0x000055555566b3ec in PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>)
    at /tmp/build/80754af9/python_1553721932202/work/Python/ceval.c:524
#21 0x0000555555783874 in run_mod () at /tmp/build/80754af9/python_1553721932202/work/Python/pythonrun.c:1035
#22 0x000055555578db81 in PyRun_FileExFlags () at /tmp/build/80754af9/python_1553721932202/work/Python/pythonrun.c:988
#23 0x000055555578dd73 in PyRun_SimpleFileExFlags () at /tmp/build/80754af9/python_1553721932202/work/Python/pythonrun.c:429
#24 0x000055555578ee5f in pymain_run_file (p_cf=0x7fffffffd9e0, filename=0x5555558c63e0 L"test_shadow_light.py", fp=0x555555948360)
    at /tmp/build/80754af9/python_1553721932202/work/Modules/main.c:427
#25 pymain_run_filename (cf=0x7fffffffd9e0, pymain=0x7fffffffdaf0) at /tmp/build/80754af9/python_1553721932202/work/Modules/main.c:1627
#26 pymain_run_python (pymain=0x7fffffffdaf0) at /tmp/build/80754af9/python_1553721932202/work/Modules/main.c:2877
#27 pymain_main () at /tmp/build/80754af9/python_1553721932202/work/Modules/main.c:3038
#28 0x000055555578ef7c in _Py_UnixMain () at /tmp/build/80754af9/python_1553721932202/work/Modules/main.c:3073
#29 0x00007ffff7810830 in __libc_start_main (main=0x55555564aed0 <main>, argc=2, argv=0x7fffffffdc48, init=<optimized out>, fini=<optimized out>, 
    rtld_fini=<optimized out>, stack_end=0x7fffffffdc38) at ../csu/libc-start.c:291
#30 0x0000555555734122 in _start () at ../sysdeps/x86_64/elf/start.S:103

I then tried setting CUDA_LAUNCH_BLOCKING=1, which circumvents the issue. Is that an acceptable solution, or does it come with compromises (e.g. performance)?


BachiLi commented on July 22, 2024

It's indeed a synchronization issue. Most likely we access unified memory on the CPU while a GPU kernel is still executing. This only results in a segmentation fault or bus error on pre-Pascal devices, so I didn't notice it. I pushed a fix; does the latest commit fix your problem?

Using CUDA_LAUNCH_BLOCKING=1 indeed compromises performance since redner launches a lot of kernels during rendering. It is good for debugging though.
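
As a debugging aid, the workaround can be scripted so the variable is set before the Python process (and therefore the CUDA runtime) starts. The sketch below is not part of redner; the helper names `blocking_launch_env` and `run_script_blocking` are hypothetical, and only the `CUDA_LAUNCH_BLOCKING` variable itself comes from the thread.

```python
import os
import subprocess
import sys


def blocking_launch_env(base_env=None):
    """Return a copy of the environment with synchronous kernel launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # With this set, each CUDA kernel launch blocks the host until the kernel finishes.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_script_blocking(script_path):
    """Run a Python script in a child process with CUDA_LAUNCH_BLOCKING=1."""
    result = subprocess.run([sys.executable, script_path], env=blocking_launch_env())
    return result.returncode
```

For example, `run_script_blocking("test_shadow_light.py")` would rerun the failing test with blocking launches. Because every launch then waits for completion, this is best kept for reproducing or ruling out synchronization bugs rather than for regular rendering.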


mguillau commented on July 22, 2024

Yes, it works. Thanks for the swift fix!
