Comments (8)
Hi @lhoangan, can you tell me what system configuration you are using (e.g., Python version, GPU, etc.)? Also, did you modify the input arguments?
from df-net.
Hi Yuliang-Zou, thanks for the quick reply. I'm using Linux Mint 17.2, CUDA 8, and one Titan X, and I installed TensorFlow with this command:
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.0-cp27-none-linux_x86_64.whl
I have changed num_gpus from 4 to 1.
from df-net.
Hi @lhoangan, it seems that you are using Python 2. Could you try Python 3 instead? Also, I ran the code on Ubuntu, so I am not sure whether the OS matters.
BTW, you might also need to change the batch size to 1; otherwise, the model cannot fit into GPU memory.
from df-net.
Yes, I can try that. Is there a preferred version of Python 3, like 3.4 or 3.5?
BTW, it seems that the PIL package requires Python 2.7; I think I got an error about not having the PIL package before.
from df-net.
I used Python 3.6. I think you can install pypng with pip and it should be fine.
from df-net.
I've changed to Python 3.6:
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.0-cp36-cp36m-linux_x86_64.whl
but the problem still persists.
Here's the full error message, in case it helps:
{'alpha_image_loss': 0.85,
'batch_size': 1,
'beta1': 0.9,
'checkpoint_dir': './checkpoint',
'ckpt_dp': 'pretrained/cs_5frame_pre',
'ckpt_flow': 'pretrained/unflowc_pre',
'ckpt_pose': None,
'continue_train': True,
'cross_consistency': 0.5,
'dataset_dir': 'raw',
'depth_consistency': 0.2,
'fix_pose': False,
'flow_consistency': 0.2,
'flow_smooth_weight': 3.0,
'img_height': 320,
'img_width': 1152,
'learning_rate': 0.0001,
'max_steps': 100000,
'num_gpus': 1,
'save_latest_freq': 5000,
'scale_normalize': False,
'seq_length': 5,
'smooth_weight': 3.0,
'summary_freq': 100}
WARNING:tensorflow:From /home/hale/TrimBot/projects/DF-Net/core/DFLearner.py:475: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
2018-09-16 00:02:46.514853: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-09-16 00:02:46.514890: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-09-16 00:02:46.514899: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-09-16 00:02:46.514907: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-09-16 00:02:46.514915: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-09-16 00:02:46.928058: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:03:00.0
Total memory: 11.91GiB
Free memory: 10.53GiB
2018-09-16 00:02:46.928149: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2018-09-16 00:02:46.928167: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2018-09-16 00:02:46.928188: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
Traceback (most recent call last):
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
return fn(*args)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1117, in _run_fn
self._extend_graph()
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1166, in _extend_graph
self._session, graph_def.SerializeToString(), status)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/contextlib.py", line 88, in __exit__
next(self.gen)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'Correlation' with these attrs. Registered devices: [CPU,GPU], Registered kernels:
<no registered kernels>
[[Node: flow_prediction/flownet_c_3/Correlation_1 = Correlation[kernel_size=1, max_displacement=20, pad=20, stride_1=1, stride_2=2, _device="/device:GPU:0"](flow_prediction/flownet_c_features_3/conv3_1/leaky_relu/Maximum, flow_prediction/flownet_c_features_3/conv3/leaky_relu/Maximum)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train_df.py", line 53, in <module>
tf.app.run()
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train_df.py", line 50, in main
learner.train(FLAGS)
File "/home/hale/TrimBot/projects/DF-Net/core/DFLearner.py", line 489, in train
with sv.managed_session(config=config) as sess:
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
self.stop(close_summary_writer=close_summary_writer)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 792, in stop
stop_grace_period_secs=self._stop_grace_secs)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 953, in managed_session
start_standard_services=start_standard_services)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 708, in prepare_or_wait_for_session
init_feed_dict=self._init_feed_dict, init_fn=self._init_fn)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 279, in prepare_session
sess.run(init_op, feed_dict=init_feed_dict)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'Correlation' with these attrs. Registered devices: [CPU,GPU], Registered kernels:
<no registered kernels>
[[Node: flow_prediction/flownet_c_3/Correlation_1 = Correlation[kernel_size=1, max_displacement=20, pad=20, stride_1=1, stride_2=2, _device="/device:GPU:0"](flow_prediction/flownet_c_features_3/conv3_1/leaky_relu/Maximum, flow_prediction/flownet_c_features_3/conv3/leaky_relu/Maximum)]]
Caused by op 'flow_prediction/flownet_c_3/Correlation_1', defined at:
File "train_df.py", line 53, in <module>
tf.app.run()
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train_df.py", line 50, in main
learner.train(FLAGS)
File "/home/hale/TrimBot/projects/DF-Net/core/DFLearner.py", line 467, in train
self.build_train_graph()
File "/home/hale/TrimBot/projects/DF-Net/core/DFLearner.py", line 81, in build_train_graph
losses, grads = self.single_tower_operation(optim, tgt_image_dp_splits[i], src_image_stack_dp_splits[i], tgt_image_flow_splits[i], src_image_stack_flow_splits[i], tgt_image_splits[i], src_image_stack_splits[i], intrinsics_splits[i], model_idx=i)
File "/home/hale/TrimBot/projects/DF-Net/core/DFLearner.py", line 148, in single_tower_operation
tgt2src, src2tgt = flownet(tgt_image_flow, src_image_stack_flow[:,:,:,3*i:3*(i+1)], flownet_spec='C', backward_flow=True, reuse=reuse_variables)
File "/home/hale/TrimBot/projects/DF-Net/core/UnFlow/src/e2eflow/core/flownet.py", line 86, in flownet
scoped_block(reuse)
File "/home/hale/TrimBot/projects/DF-Net/core/UnFlow/src/e2eflow/core/flownet.py", line 49, in scoped_block
channel_mult=channel_mult)
File "/home/hale/TrimBot/projects/DF-Net/core/UnFlow/src/e2eflow/core/flownet.py", line 231, in flownet_c
pad=20, kernel_size=1, max_displacement=20, stride_1=1, stride_2=2)
File "/home/hale/TrimBot/projects/DF-Net/core/UnFlow/src/e2eflow/ops.py", line 64, in correlation
return _correlation_module.correlation(first, second, **kwargs)[0]
File "<string>", line 49, in correlation
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/hale/anaconda2/envs/tf1.2p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'Correlation' with these attrs. Registered devices: [CPU,GPU], Registered kernels:
<no registered kernels>
[[Node: flow_prediction/flownet_c_3/Correlation_1 = Correlation[kernel_size=1, max_displacement=20, pad=20, stride_1=1, stride_2=2, _device="/device:GPU:0"](flow_prediction/flownet_c_features_3/conv3_1/leaky_relu/Maximum, flow_prediction/flownet_c_features_3/conv3/leaky_relu/Maximum)]]
from df-net.
Try modifying this line: change the GPU architecture code from sm_30 to sm_52. Then delete all the previously generated .so files and re-run the code.
If possible, please also display the message for the compilation.
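The two steps above (picking the architecture code and clearing old build output) can be sketched in Python. This is only an illustration: the helper names `arch_flag_from_log` and `remove_compiled_ops` are hypothetical and not part of DF-Net or UnFlow, and `ops_dir` stands for whatever directory the build writes its compiled libraries to. The sketch assumes the device line format shown in the log above ("major: 5 minor: 2"), which corresponds to sm_52.

```python
import re
from pathlib import Path


def arch_flag_from_log(log_text):
    """Derive an architecture code such as 'sm_52' from a TensorFlow 1.x
    device line like 'major: 5 minor: 2 memoryClockRate (GHz) 1.076'.
    Returns None when no such line is found."""
    match = re.search(r"major:\s*(\d+)\s+minor:\s*(\d+)", log_text)
    if match is None:
        return None
    return "sm_{}{}".format(match.group(1), match.group(2))


def remove_compiled_ops(ops_dir, dry_run=True):
    """Return the *.so files directly inside ops_dir (a hypothetical
    build directory). They are deleted only when dry_run is False."""
    stale = sorted(Path(ops_dir).glob("*.so"))
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

Calling `remove_compiled_ops(ops_dir)` with the default `dry_run=True` only lists the files, which makes it easy to confirm that nothing unexpected would be deleted before forcing a rebuild.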
from df-net.
Hi @Yuliang-Zou, I found the error. I had previously removed -D GOOGLE_CUDA=1 from the same line while trying to solve another error, but that turned out to be the wrong call. I put it back, and it gets past that error now (even with sm_30 or Python 2). My bad, sorry for wasting your time.
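A likely explanation, not confirmed in this thread: TensorFlow custom-op sources conventionally guard their GPU code with `#if GOOGLE_CUDA`, so compiling without the macro can yield a library that loads fine but registers no GPU kernel, which would match the "<no registered kernels>" message. The sketch below builds an nvcc argument list that always carries the macro. The function name, file names, and argument order are hypothetical and are not the actual command used in the repository; only the individual nvcc options are standard.

```python
def nvcc_command(source, output, include_dir, arch="sm_52", google_cuda=True):
    """Build an nvcc argument list for compiling one CUDA source file.

    include_dir is expected to be the TensorFlow header directory
    (for example the value returned by tf.sysconfig.get_include()).
    This is an illustrative command, not the one DF-Net actually runs.
    """
    cmd = [
        "nvcc", "-std=c++11", "-c",
        "-o", output, source,
        "-I", include_dir,
        "-x", "cu",
        "-Xcompiler", "-fPIC",
        "-arch=" + arch,
    ]
    if google_cuda:
        # Defines the macro that enables the GPU code paths in the op source.
        cmd += ["-D", "GOOGLE_CUDA=1"]
    return cmd


def has_google_cuda(cmd):
    """Return True if the argument list defines GOOGLE_CUDA=1."""
    return any(
        arg == "-D" and value == "GOOGLE_CUDA=1"
        for arg, value in zip(cmd, cmd[1:])
    )
```

A check like `has_google_cuda(cmd)` before launching the build is one way to catch the situation described above, where the flag was removed while chasing a different error.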
from df-net.