train ERROR about df-net HOT 4 CLOSED

boyob commented on May 23, 2024
train ERROR

from df-net.

Comments (4)

Yuliang-Zou commented on May 23, 2024

It seems that you did not specify your CUDA path. If you run which nvcc, it will probably output nothing in your case.
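
For anyone who prefers to run the same check from Python, here is a minimal sketch using the standard library's shutil.which. The helper name find_nvcc is made up for illustration and is not part of DF-Net, and the directory named in the message is only an example of where CUDA is often installed.

```python
import shutil


def find_nvcc():
    """Return the absolute path of nvcc if it is on PATH, otherwise None."""
    return shutil.which("nvcc")


nvcc_path = find_nvcc()
if nvcc_path is None:
    # Example location only; the real CUDA install directory may differ.
    print("nvcc not found on PATH; try adding the CUDA bin directory, "
          "e.g. /usr/local/cuda/bin")
else:
    print("nvcc found at", nvcc_path)
```

If this reports that nvcc is missing, the custom ops cannot be compiled until the CUDA bin directory is added to PATH.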

ReekiLee commented on May 23, 2024

Hi @boyob @Yuliang-Zou
Have you solved this error?
I'm running into a similar problem. When I run python test_kitti_depth.py --dataset_dir=./dataset --output_dir=./prediction --ckpt_file=./pretrained --split="test"
I get an error like this:

backward_warp_op.cu.cc:5:54: fatal error: tensorflow/core/framework/register_types.h: No such file or directory
compilation terminated.
Traceback (most recent call last):
File "/root/DF-Net/core/UnFlow/src/e2eflow/ops.py", line 53, in <module>
op_lib = tf.load_op_library(lib_path)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/tensorflow_core/python/framework/load_library.py", line 61, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: ./backward_warp_op.so: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "test_kitti_depth.py", line 6, in <module>
from core import DFLearner
File "/root/DF-Net/core/__init__.py", line 2, in <module>
from .DFLearner import DFLearner
File "/root/DF-Net/core/DFLearner.py", line 14, in <module>
from .UnFlow import flownet
File "/root/DF-Net/core/UnFlow/__init__.py", line 1, in <module>
from .src import flownet
File "/root/DF-Net/core/UnFlow/src/__init__.py", line 1, in <module>
from .e2eflow import flownet
File "/root/DF-Net/core/UnFlow/src/e2eflow/__init__.py", line 1, in <module>
from .core import flownet
File "/root/DF-Net/core/UnFlow/src/e2eflow/core/__init__.py", line 1, in <module>
from .flownet import flownet
File "/root/DF-Net/core/UnFlow/src/e2eflow/core/flownet.py", line 5, in <module>
from ..ops import correlation
File "/root/DF-Net/core/UnFlow/src/e2eflow/ops.py", line 55, in <module>
compile(n)
File "/root/DF-Net/core/UnFlow/src/e2eflow/ops.py", line 35, in compile
subprocess.check_output(nvcc_cmd, shell=True)
File "/usr/local/python3.7.5/lib/python3.7/subprocess.py", line 411, in check_output
**kwargs).stdout
File "/usr/local/python3.7.5/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'nvcc -std=c++11 -c -gencode=arch=compute_30,code=sm_30 -o backward_warp_op.cu.o backward_warp_op.cu.cc -I /usr/local/cuda-10.0/include -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC' returned non-zero exit status 1.

My environment is TF 1.15.0 + CUDA 10.0 + g++ 5.4.0.
Here are lines 31-41 of my ops.py:

    cuda_lib64_path_arg = "-L /usr/local/cuda-10.0/lib64"
    nvcc_cmd = "nvcc -std=c++11 -c -gencode=arch=compute_30,code=sm_30 -o {} -I /usr/local/cuda-10.0/include -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC"
    nvcc_cmd = nvcc_cmd.format(" ".join([fn_cu_o, fn_cu_cc]),
                               tf_inc)
    subprocess.check_output(nvcc_cmd, shell=True)

    gcc_cmd = "{} -std=c++11 -shared -o {} -I {} -fPIC -lcudart -D GOOGLE_CUDA=1 {}"
    gcc_cmd = gcc_cmd.format('g++ 5.4.0',
                            " ".join([fn_so, fn_cu_o, fn_cc]),
                             tf_inc,
                             cuda_lib64_path_arg)

I've been stuck on this problem for several days and need to solve it urgently. Could you please give me some advice?
Thank you in advance.
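
A note on the traceback above: the failing nvcc command passes only -I /usr/local/cuda-10.0/include, and the quoted template has a single {} placeholder, so the tf_inc argument given to format() is silently dropped. That would explain why tensorflow/core/framework/register_types.h cannot be found. Separately, 'g++ 5.4.0' is not an executable name; the shell would run g++ with 5.4.0 as an extra argument. Below is a minimal sketch of a command builder that keeps both include paths. The name build_nvcc_cmd and its parameters are hypothetical, and tf_inc is assumed to be the directory returned by tf.sysconfig.get_include().

```python
import shlex


def build_nvcc_cmd(fn_cu_o, fn_cu_cc, tf_inc,
                   cuda_inc="/usr/local/cuda-10.0/include"):
    """Build the nvcc compile command with both include directories.

    tf_inc is assumed to hold the TensorFlow header directory, for
    example the value of tf.sysconfig.get_include().
    """
    template = (
        "nvcc -std=c++11 -c -gencode=arch=compute_30,code=sm_30 "
        "-o {out} {src} -I {tf_inc} -I {cuda_inc} "
        "-D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC"
    )
    # Named placeholders make a dropped argument fail loudly with a
    # KeyError instead of being ignored like an extra positional one.
    return template.format(
        out=shlex.quote(fn_cu_o),
        src=shlex.quote(fn_cu_cc),
        tf_inc=shlex.quote(tf_inc),
        cuda_inc=shlex.quote(cuda_inc),
    )


# Placeholder path for illustration; use the real TensorFlow include dir.
cmd = build_nvcc_cmd("backward_warp_op.cu.o", "backward_warp_op.cu.cc",
                     "/path/to/tensorflow/include")
print(cmd)
```

This only addresses the missing header. TF 1.15 may need further compile and link flags, which tf.sysconfig.get_compile_flags() and tf.sysconfig.get_link_flags() report, so treat this as a starting point rather than a complete fix.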

Yuliang-Zou commented on May 23, 2024

The code was developed using tf-1.2.0 version. Please switch to the same version to use it.

ReekiLee commented on May 23, 2024

The code was developed using tf-1.2.0 version. Please switch to the same version to use it.

Thanks for your reply, but my project requires me to train this model with TF 1.15 + CUDA 10. The error seems to come from compiling UnFlow's ops, so I'm going to try running UnFlow on its own first.
