Gpu API call (invalid device function) in call about dense_flow HOT 18 CLOSED

wanglimin commented on September 23, 2024

Gpu API call (invalid device function) in call

from dense_flow.

Comments (18)

wanglimin commented on September 23, 2024

I have not encountered this issue before either. It seems that there is a problem with your GPU architecture and driver. I found a similar problem report on the following website:

https://bitbucket.org/rodrigob/doppia/issues/85/opencv-error-gpu-api-call-invalid-device

Perhaps you could try this.

geekvc commented on September 23, 2024

Thank you very much! I tried the code and got this:

CUDA Device Query...
There are 4 CUDA devices.

CUDA Device #0
Major revision number:         3
Minor revision number:         5
Name:                          Tesla K40c
Total global memory:           4294770688
Total shared memory per block: 49152
Total registers per block:     65536
Warp size:                     32
Maximum memory pitch:          2147483647
Maximum threads per block:     1024
Maximum dimension 0 of block:  1024
Maximum dimension 1 of block:  1024
Maximum dimension 2 of block:  64
Maximum dimension 0 of grid:   2147483647
Maximum dimension 1 of grid:   65535
Maximum dimension 2 of grid:   65535
Clock rate:                    875500
Total constant memory:         65536
Texture alignment:             512
Concurrent copy and execution: Yes
Number of multiprocessors:     15
Kernel execution timeout:      No

CUDA Device #1
Major revision number:         3
Minor revision number:         5
Name:                          Tesla K40c
Total global memory:           4294770688
Total shared memory per block: 49152
Total registers per block:     65536
Warp size:                     32
Maximum memory pitch:          2147483647
Maximum threads per block:     1024
Maximum dimension 0 of block:  1024
Maximum dimension 1 of block:  1024
Maximum dimension 2 of block:  64
Maximum dimension 0 of grid:   2147483647
Maximum dimension 1 of grid:   65535
Maximum dimension 2 of grid:   65535
Clock rate:                    875500
Total constant memory:         65536
Texture alignment:             512
Concurrent copy and execution: Yes
Number of multiprocessors:     15
Kernel execution timeout:      No

CUDA Device #2
Major revision number:         3
Minor revision number:         5
Name:                          Tesla K40c
Total global memory:           4294770688
Total shared memory per block: 49152
Total registers per block:     65536
Warp size:                     32
Maximum memory pitch:          2147483647
Maximum threads per block:     1024
Maximum dimension 0 of block:  1024
Maximum dimension 1 of block:  1024
Maximum dimension 2 of block:  64
Maximum dimension 0 of grid:   2147483647
Maximum dimension 1 of grid:   65535
Maximum dimension 2 of grid:   65535
Clock rate:                    875500
Total constant memory:         65536
Texture alignment:             512
Concurrent copy and execution: Yes
Number of multiprocessors:     15
Kernel execution timeout:      No

CUDA Device #3
Major revision number:         3
Minor revision number:         5
Name:                          Tesla K40c
Total global memory:           4294770688
Total shared memory per block: 49152
Total registers per block:     65536
Warp size:                     32
Maximum memory pitch:          2147483647
Maximum threads per block:     1024
Maximum dimension 0 of block:  1024
Maximum dimension 1 of block:  1024
Maximum dimension 2 of block:  64
Maximum dimension 0 of grid:   2147483647
Maximum dimension 1 of grid:   65535
Maximum dimension 2 of grid:   65535
Clock rate:                    875500
Total constant memory:         65536
Texture alignment:             512
Concurrent copy and execution: Yes
Number of multiprocessors:     15
Kernel execution timeout:      No

Then I added this line to CMakeLists.txt:
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch compute_35 -code sm_35)
and ran
make clean
make
but the error is still the same as above.
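
For readers repeating this step, here is a minimal Python sketch of how the compute capability printed by the device query maps onto the flag line above. The function names are hypothetical, the parser assumes the exact field labels shown in the output above, and the -arch/-code spelling is simply copied from this thread.

```python
import re
from typing import List, Tuple


def parse_device_query(text: str) -> List[Tuple[str, int, int]]:
    """Return (name, major, minor) for every device block in the query output.

    Assumes each block lists the major revision, then the minor revision,
    then the name, as in the output pasted in this thread.
    """
    pattern = re.compile(
        r"Major revision number:\s*(\d+)\s*"
        r"Minor revision number:\s*(\d+)\s*"
        r"Name:\s*(.+)"
    )
    return [
        (name.strip(), int(major), int(minor))
        for major, minor, name in pattern.findall(text)
    ]


def nvcc_flags_line(major: int, minor: int) -> str:
    """Build the CMakeLists.txt line used in this thread for one capability."""
    cc = f"{major}{minor}"
    return f"set(CUDA_NVCC_FLAGS ${{CUDA_NVCC_FLAGS}} -arch compute_{cc} -code sm_{cc})"


sample = """CUDA Device #0
Major revision number:         3
Minor revision number:         5
Name:                          Tesla K40c
"""

for name, major, minor in parse_device_query(sample):
    print(name, "->", nvcc_flags_line(major, minor))
```

For the Tesla K40c block above this yields the same compute_35 / sm_35 line that was added to CMakeLists.txt, so the flag value itself matches the reported capability.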

wanglimin commented on September 23, 2024

Maybe you could find other solutions on Google. I did not encounter this problem before.

geekvc commented on September 23, 2024

Thank you all the same!
I am trying it on a different type of GPU.

geekvc commented on September 23, 2024

On the Tesla K20, the error disappeared. Thank you.

KnightOfTheMoonlight commented on September 23, 2024

I solved this problem by reinstalling CUDA with the xxx.deb file.

fanser commented on September 23, 2024

@KnightOfTheMoonlight
I have run into the same issue:
OpenCV Error: Gpu API call (invalid device function) in call, file /home/fzy/install/opencv-2.4.10/modules/gpu/include/opencv2/gpu/device/detail/transform_detail.hpp, line 361
terminate called after throwing an instance of 'cv::Exception'
what(): /home/fzy/install/opencv-2.4.10/modules/gpu/include/opencv2/gpu/device/detail/transform_detail.hpp:361: error: (-217) invalid device function in function call
So how can I solve the problem without changing the GPU?
I use CUDA 8.0 + OpenCV 2.4.10.

fanser commented on September 23, 2024

I solved this problem along the lines @geekvc described.
Add this line to the CMakeLists.txt file:
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch compute_52 -code sm_52)
because my GPU's compute capability is 5.2.
Then delete the build files by running
rm -r ./build
to ensure no CMake cache file remains (it didn't work for @geekvc, maybe because he didn't delete all of the CMake cache files), and rebuild:
make
sudo make install
Then it works!
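
The steps above can be sketched as a dry-run plan in Python. This is only an illustration under assumptions: the build directory is named build and sits directly under the source tree, and the mkdir and `cmake ..` steps are not stated in the comment; they are added here because a deleted build directory has to be regenerated before make can run. The function names are hypothetical.

```python
import subprocess
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (working directory, command)


def rebuild_plan(build_dir: str = "build", install: bool = True) -> List[Step]:
    """List the commands for a rebuild that cannot reuse a stale CMake cache."""
    plan: List[Step] = [
        (".", ["rm", "-r", build_dir]),   # from the comment: drop all cached files
        (".", ["mkdir", build_dir]),      # assumption: recreate the directory
        (build_dir, ["cmake", ".."]),     # assumption: reconfigure with the new flags
        (build_dir, ["make"]),
    ]
    if install:
        plan.append((build_dir, ["sudo", "make", "install"]))
    return plan


def run_plan(plan: List[Step], dry_run: bool = True) -> None:
    """Print each step; execute it only when dry_run is False."""
    for cwd, argv in plan:
        print(f"[{cwd}] {' '.join(argv)}")
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


# Dry run by default, so nothing is deleted when trying the sketch out.
run_plan(rebuild_plan())
```

The point of removing the whole directory rather than running make clean is that make clean leaves CMakeCache.txt in place, so an edited flag line may not be picked up.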

shamoqianting commented on September 23, 2024

I encountered a similar problem.
OpenCV Error: Gpu API call (unknown error) in mallocPitch, file /data1/temporal-segment-networks-master/3rd-party/opencv-2.4.13/modules/dynamicuda/include/opencv2/dynamicuda/dynamicuda.hpp, line 1134 terminate called after throwing an instance of 'cv::Exception' what(): /data1/temporal-segment-networks-master/3rd-party/opencv-2.4.13/modules/dynamicuda/include/opencv2/dynamicuda/dynamicuda.hpp:1134: error: (-217) unknown error in function mallocPitch

Although I deleted the build folder and added the line
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch compute_37 -code sm_37)
to the CMakeLists.txt file, the problem is still there.

Has anyone solved this without reinstalling CUDA?
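
Note that the message in this report ("unknown error" in mallocPitch) is not the same as the "invalid device function" in call reported earlier in the thread, and the thread does not establish that both have the same cause. A minimal sketch, with a hypothetical helper name, for pulling the two distinguishing fields out of a log so that reports can be compared:

```python
import re
from typing import Optional, Tuple

# Matches the first line of the OpenCV errors quoted in this thread.
ERROR_RE = re.compile(
    r"OpenCV Error: Gpu API call \((?P<message>[^)]*)\) in (?P<function>\w+)"
)


def parse_gpu_error(log: str) -> Optional[Tuple[str, str]]:
    """Return (message, function) from an OpenCV GPU error log, or None."""
    match = ERROR_RE.search(log)
    if match is None:
        return None
    return match.group("message"), match.group("function")


print(parse_gpu_error("OpenCV Error: Gpu API call (unknown error) in mallocPitch, file x.hpp, line 1134"))
print(parse_gpu_error("OpenCV Error: Gpu API call (invalid device function) in call, file y.hpp, line 361"))
```

The file names x.hpp and y.hpp in the usage lines are placeholders, not the real paths from the logs.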

shamoqianting commented on September 23, 2024

@KnightOfTheMoonlight Which .deb file did you use? Could you describe it in more detail? Thank you very much.

pengxiaoxiao commented on September 23, 2024

I used the tool to extract test images from my test.avi, following the usage:

./denseFlow_gpu -f test.avi -x tmp/flow_x -y tmp/flow_x -i tmp/image -b 20 -t 1 -d 0 -s 1

and got the error

OpenCV Error: Gpu API call (invalid device function) in call, file /home/uuz/Downloads/opencv/Install-OpenCV-master/Ubuntu/2.4/OpenCV/opencv-2.4.10/modules/gpu/include/opencv2/gpu/device/detail/transform_detail.hpp, line 361
terminate called after throwing an instance of 'cv::Exception'
  what():  /home/uuz/Downloads/opencv/Install-OpenCV-master/Ubuntu/2.4/OpenCV/opencv-2.4.10/modules/gpu/include/opencv2/gpu/device/detail/transform_detail.hpp:361: error: (-217) invalid device function in function call

[1]    16872 abort (core dumped)  ./denseFlow_gpu -f test.avi -x tmp/flow_x -y tmp/flow_x -i tmp/image -b 20 -t

After that I added

set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch compute_35 -code sm_35)

to CMakeLists.txt and ran make again, but the error is the same as above. I do not know how to solve it.
Thank you in advance!

I have run into the same problem. How can I solve it?
My device is: GeForce GTX 1080 Ti/PCIe/SSE2
CUDA 9.0

pengxiaoxiao commented on September 23, 2024

I put "set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch compute_61 -code sm_61)" in CMakeLists.txt.

sucaohan commented on September 23, 2024

@pengxiaoxiao Did you manage to solve this? I am running into the same problem now, with CUDA 9.

pengxiaoxiao commented on September 23, 2024

Reinstall CUDA 8.

sucaohan commented on September 23, 2024

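@pengxiaoxiao But can two CUDA versions be installed on one Ubuntu system? A lot of software was installed on this machine earlier, and CUDA 9 cannot be uninstalled.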

pengxiaoxiao commented on September 23, 2024

I don't think so!

sucaohan commented on September 23, 2024

@pengxiaoxiao OK, thanks.

ZWJ-here commented on September 23, 2024

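This is a compute capability mismatch problem.
Before the setting is changed, OpenCV's CMake output looks like this, with compute capabilities 30 35 37:
NVIDIA CUDA
Use CUFFT: YES
Use CUBLAS: YES
USE NVCUVID: NO
NVIDIA GPU arch: 30 35 37
NVIDIA PTX archs:
Use fast math: NO
# Note: 6.1 is the compute capability of the GTX 1080; for other GPUs, change the value to match their own compute capability.
# To find a GPU's compute capability, run deviceQuery from the CUDA samples.
# (Build the samples under the NVIDIA_CUDA-*_Samples folder, where * is the version number.)
If the setting takes effect, the CMake output shows the following (my GPU is a 1080 Ti):
NVIDIA CUDA
Use CUFFT: YES
Use CUBLAS: YES
USE NVCUVID: NO
NVIDIA GPU arch: 61
NVIDIA PTX archs: 61
Use fast math: NO

Both GPU arch and PTX archs are set to 6.1.
But if you are unlucky, adding the compile options does not solve the problem.
In that case you need to modify OpenCV's configuration file for CUDA compute capability, ./cmake/OpenCVDetectCUDA.cmake.
Before the lines
set(CUDA_ARCH_BIN ${__cuda_arch_bin} CACHE STRING "Specify 'real' GPU architectures to build binaries for, BIN(PTX) format is supported")
set(CUDA_ARCH_PTX ${__cuda_arch_ptx} CACHE STRING "Specify 'virtual' PTX architectures to build PTX intermediate code for")
add
set(__cuda_arch_bin "6.1")
set(__cuda_arch_ptx "6.1")
After saving, rerun cmake, make, and make install for OpenCV; the CMake section shown above then reports the correct compute capability, 61.

Finally, rebuild dense_flow.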
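
To check a pasted OpenCV CMake summary against a given GPU, a small Python sketch along these lines may help. It is not part of OpenCV or dense_flow, and the function names are hypothetical. It assumes the summary uses the "NVIDIA GPU arch" and "NVIDIA PTX archs" labels shown above, and it applies the usual CUDA compatibility rules as I understand them: binary code built for X.y runs on devices X.z with z >= y, and PTX built for a capability can be JIT-compiled on devices of equal or newer capability.

```python
import re
from typing import Dict, List, Tuple


def parse_cuda_summary(summary: str) -> Dict[str, List[str]]:
    """Extract the binary and PTX architecture lists from the CMake summary."""
    archs: Dict[str, List[str]] = {"bin": [], "ptx": []}
    for line in summary.splitlines():
        # Accept an ASCII or a full-width colon after the label.
        m = re.search(r"NVIDIA (GPU arch|PTX archs)\s*[:：]\s*(.*)", line)
        if m:
            key = "bin" if m.group(1) == "GPU arch" else "ptx"
            archs[key] = m.group(2).split()
    return archs


def split_cc(cc: str) -> Tuple[int, int]:
    """Turn '61' or '6.1' into (6, 1)."""
    digits = cc.replace(".", "")
    return int(digits[:-1]), int(digits[-1])


def covers(archs: Dict[str, List[str]], device_cc: str) -> bool:
    """True if the build contains code that the given device can run."""
    dev = split_cc(device_cc)
    binary_ok = any(
        split_cc(a)[0] == dev[0] and split_cc(a)[1] <= dev[1]
        for a in archs["bin"]
    )
    ptx_ok = any(split_cc(a) <= dev for a in archs["ptx"])
    return binary_ok or ptx_ok


before = "NVIDIA GPU arch: 30 35 37\nNVIDIA PTX archs:\n"
after = "NVIDIA GPU arch: 61\nNVIDIA PTX archs: 61\n"

print(covers(parse_cuda_summary(before), "6.1"))  # build from the first summary
print(covers(parse_cuda_summary(after), "6.1"))   # build from the second summary
```

Under these assumptions the first summary (30 35 37, no PTX) contains nothing a 6.1 device can execute, which is consistent with the "invalid device function" error, while the second summary does.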
