Comments (10)
It looks like I have a temporary problem with CUDA on this machine. I will reinstall the driver and try again.
But here are three points to consider for now to make the program more robust to such failures:
- If the benchmark can only be executed with a CUDA version of TF, the CPU versions shouldn't be shown. If so, such a restriction should be easy to add (I can help here).
- The benchmark should be updated, as otherwise it could stop working at any moment:
WARNING:tensorflow:From /home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py:204:
initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Use `tf.global_variables_initializer` instead.
(By the way, today is the 8th of March - The International Women's Day:).)
- There's probably nothing we can do when downloading prebuilt libraries, but when we compile from source we should enable the vector instructions that the target CPU supports:
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
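For the second point, a minimal sketch of how the benchmark could pick the initialiser without depending on the deprecated name. The helper `make_init_op` is hypothetical (it is not part of ck-tensorflow or TensorFlow); it only assumes that the module passed in exposes `global_variables_initializer` on newer releases and `initialize_all_variables` on older ones, as the warning above suggests.

```python
def make_init_op(tf_module):
    """Return the variable-initialisation op for the given TensorFlow module.

    Prefers tf.global_variables_initializer (the replacement named in the
    deprecation warning) and falls back to tf.initialize_all_variables on
    older releases that lack the new name.
    """
    factory = getattr(tf_module, "global_variables_initializer", None)
    if factory is None:
        # Older TensorFlow: only the deprecated name is available.
        factory = tf_module.initialize_all_variables
    return factory()


# Intended use inside the benchmark script (assumes TensorFlow is installed):
#   import tensorflow as tf
#   init = make_init_op(tf)
#   sess.run(init)
```

Passing the module in as an argument keeps the helper importable, and testable, on machines where TensorFlow is not installed.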
from ck-tensorflow.
With the CUDA driver back in action, the benchmark fails with a different error:
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:02:00.0)
2017-03-08 13:05:59.217026: step 10, duration = 0.007
2017-03-08 13:05:59.284678: step 20, duration = 0.007
2017-03-08 13:05:59.311790: Forward across 25 steps, 0.006 +/- 0.001 sec / batch
Traceback (most recent call last):
File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 241, in <module>
tf.app.run()
File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 237, in main
run_benchmark()
File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 226, in run_benchmark
objective = loss(last_layer, labels)
File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 105, in loss
concated = tf.concat(1, [indices, labels])
File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/ops/array_ops.py", line 1048, in concat
).assert_is_compatible_with(tensor_shape.scalar())
File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/framework/tensor_shape.py", line 756, in assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (2, 32, 1) and () are incompatible
Execution time: 86.313 sec.
Any ideas?
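One likely cause, judging from the traceback: TensorFlow 1.0 changed the argument order of `tf.concat` from `(concat_dim, values)` to `(values, axis)`. The old-style call `tf.concat(1, [indices, labels])` therefore passes the list of tensors as the axis, and the check that the axis is a scalar fails with the shape `(2, 32, 1)`. Below is a minimal sketch of a wrapper that works with both orders. The name `concat_compat` is hypothetical, and deciding on the major version number alone is a simplifying assumption.

```python
def concat_compat(tf_module, values, axis):
    """Concatenate `values` along `axis` across the TF 1.0 signature change.

    TensorFlow >= 1.0 expects tf.concat(values, axis);
    earlier releases expect tf.concat(concat_dim, values).
    """
    major = int(tf_module.__version__.split(".")[0])
    if major >= 1:
        return tf_module.concat(values, axis)
    # Pre-1.0 order: the dimension comes first.
    return tf_module.concat(axis, values)


# Intended use in loss() of the benchmark (assumes TensorFlow is installed):
#   import tensorflow as tf
#   concated = concat_compat(tf, [indices, labels], 1)
```

If only TF 1.0.0+ needs to be supported, swapping the two arguments at the call site, i.e. `tf.concat([indices, labels], 1)`, should be enough.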
from ck-tensorflow.
@psyhtest, after 5616900 this error should no longer occur with TF 1.0.0+. Please check.
from ck-tensorflow.
@fanranGit, thanks!
After your update, benchmark-overfeat behaves similarly to benchmark-googlenet (issue #4):
- fails on the CPU due to:
InvalidArgumentError (see above for traceback): CPU BiasOp only supports NHWC.
[[Node: conv1/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NCHW", _device="/job:localhost/replica:0/task:0/cpu:0"](conv1/Conv2D, conv1/biases/read)]]
- works on the GPU but only if launched with LD_LIBRARY_PATH pointing to the CUDA RT and cuDNN:
$ ck run program:tensorflow --env.LD_LIBRARY_PATH=/usr/local/cuda-8.0.61/lib64:/usr/local/cudnn-5.1/lib64
...
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:02:00.0
Total memory: 7.92GiB
Free memory: 7.81GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:02:00.0)
2017-03-16 10:07:43.316209: step 10, duration = 0.006
2017-03-16 10:07:43.380252: step 20, duration = 0.006
2017-03-16 10:07:43.405881: Forward across 25 steps, 0.006 +/- 0.001 sec / batch
2017-03-16 10:07:44.139324: step 10, duration = 0.019
2017-03-16 10:07:44.329957: step 20, duration = 0.019
2017-03-16 10:07:44.405089: Forward-backward across 25 steps, 0.018 +/- 0.004 sec / batch
Execution time: 5.079 sec.
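The CPU failure above comes from requesting `data_format="NCHW"` for an op placed on the CPU. A minimal sketch of choosing the layout per device follows. The helpers `pick_data_format` and `input_shape` are hypothetical and only illustrate the idea; this is not necessarily how the benchmark was eventually fixed.

```python
def pick_data_format(use_gpu):
    """Return a layout string the target device can run.

    The CPU BiasOp in the log above only accepts NHWC, so NCHW is
    selected only when the graph is placed on the GPU.
    """
    return "NCHW" if use_gpu else "NHWC"


def input_shape(batch, channels, height, width, data_format):
    """Order the input dimensions to match the chosen layout."""
    if data_format == "NCHW":
        return [batch, channels, height, width]
    return [batch, height, width, channels]


# Intended use (assumes TensorFlow is installed and a `use_gpu` setting exists):
#   fmt = pick_data_format(use_gpu)
#   bias = tf.nn.bias_add(conv, biases, data_format=fmt)
```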
from ck-tensorflow.
The changes that resolved issue #4 have also resolved this one.
I've opened a new issue #8 to build TF with CPU vector instruction support.
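As a starting point for that issue, here is a rough sketch of deriving compiler options from the CPU features reported by the machine. It assumes an x86 Linux host where `/proc/cpuinfo` has a `flags` line, and GCC/Clang-style `-m` options passed to Bazel through `--copt`. The function names are hypothetical and not part of CK.

```python
# Map /proc/cpuinfo flag names to GCC/Clang options, covering the
# instruction sets named in the cpu_feature_guard warnings.
CPUINFO_TO_COPT = [
    ("pni", "-msse3"),       # Linux reports SSE3 as "pni"
    ("sse4_1", "-msse4.1"),
    ("sse4_2", "-msse4.2"),
    ("avx", "-mavx"),
    ("avx2", "-mavx2"),
    ("fma", "-mfma"),
]


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the set of CPU flags from the first 'flags' line, if any."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


def copts_for(flags):
    """Return the compiler options for the features present in `flags`."""
    return [opt for name, opt in CPUINFO_TO_COPT if name in flags]


def bazel_args(flags):
    """Wrap each compiler option in a Bazel --copt argument."""
    return ["--copt=" + opt for opt in copts_for(flags)]


# Intended use on the build machine:
#   print(" ".join(bazel_args(read_cpu_flags())))
```

This only makes sense when the build machine is also the target. For cross-compilation the flag set would have to come from the target platform description instead.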
from ck-tensorflow.
Can someone tell me why I am getting this error?
File "C:\Users\Prabir Sinha\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 756, in assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (2, 1) and () are incompatible
Please help, I am stuck here.
from ck-tensorflow.
What is the fix for this issue?
File "C:\Users\Prabir Sinha\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1001, in concat
).assert_is_compatible_with(tensor_shape.scalar())
File "C:\Users\Prabir Sinha\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 756, in assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (2, 1) and () are incompatible
Please provide some hint on how to fix this problem.
from ck-tensorflow.
@prabirsinha Could you explain what you are trying to do, on which system and with which version of TensorFlow?
from ck-tensorflow.
@prabirsinha This is the CK-TensorFlow repository, which manages TensorFlow with the Collective Knowledge framework for AI/SW/HW co-design and optimisation.
I'm afraid we cannot provide any guidance regarding TensorFlow itself, as we are not its developers. However, if you have any questions about Collective Knowledge, we will be happy to help.
from ck-tensorflow.