ctuning / ck-tensorflow

Collective Knowledge components for TensorFlow (code, data sets, models, packages, workflows):

License: BSD 3-Clause "New" or "Revised" License

Shell 8.21% Python 60.24% Makefile 0.01% C++ 28.34% TeX 0.01% Batchfile 1.75% Roff 0.01% Arc 0.43% Slim 0.01% Pascal 1.02%

collective-knowledge automation workflow-automation pipelines package-management apis metadata knowledge-management

ck-tensorflow's Introduction

Collective Knowledge components for TensorFlow

All CK components can be found at cKnowledge.io and in one GitHub repository!

This project is hosted by the cTuning foundation.

Introduction

The CK-TensorFlow repository provides automation components in the CK format for tedious and repetitive tasks such as detecting and installing different TensorFlow versions, models and data sets across diverse platforms, and running AI/ML workflows in a unified way.

Note that if some third-party automation fails or misses some functionality (software detection, package installation, benchmarking and autotuning workflow, etc.), the CK concept is to continuously and collaboratively improve such reusable components! Please provide your feedback and report bugs via GitHub issues or get in touch with the community using this public CK mailing list!

Installation

Prerequisites for Ubuntu

Python 2.x:

$ sudo apt-get install python-dev python-pip python-setuptools python-opencv git

Python 3.x:

$ sudo apt-get install python3-dev python3-pip python3-setuptools

Note that CK will automatically install the following dependencies into CK TF virtual space: protobuf easydict joblib image wheel numpy scipy absl-py

Optional dependencies depending on your use cases:

CUDA/cuDNN if you have a CUDA-enabled GPU
Android NDK if you want to compile and run TF for Android devices

CK installation

Follow these instructions to install CK.

Installation of ck-tensorflow repository

$ ck pull repo:ck-tensorflow

Basic usage

Example of a unified TensorFlow installation on Ubuntu or Windows via CK (pre-built versions)

$ ck install package:lib-tensorflow-1.8.0-cpu
 and/or (CK enables easy co-existence of different versions of tools):
$ ck install package:lib-tensorflow-1.8.0-cuda

Check that TF is installed locally and registered in the CK:

$ ck show env --tags=lib,tensorflow

Use CK virtual environment to test it (similar to Python virtual env but for any binary package installed via CK):

$ ck virtual env --tags=lib,tensorflow

Install other TF versions available in the CK:

$ ck install package --tags=lib,tensorflow

Test the unified image classification workflow via CK using the TF installed above

$ ck run program:tensorflow --cmd_key=classify

Note that you will be asked to select a JPEG image from the available CK data sets. We added standard demo images (cat.jpg, catgrey.jpg, fish-bike.jpg, computer_mouse.jpg) to the 'ctuning-datasets-min' repository. You can list them via:

 $ ck pull repo:ctuning-datasets-min
 $ ck search dataset --tags=dnn

Customize builds for different platforms

You can find more details about customized TensorFlow builds via CK for Android, Linux, Windows, Raspberry Pi, Odroid, etc. here.

Benchmarking

 $ ck run program:tensorflow (--env.BATCH_SIZE=10) (--env.NUM_BATCHES=5)

Select either the test_cpu or the test_cuda command. If prompted (that is, if more than one choice is available), select a TensorFlow version and a benchmark, and then select a TensorFlow model.
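
CK passes each `--env.NAME=value` option to the program as an environment variable. Below is a minimal sketch of how a benchmark script could read the two settings above. The function name and the default values are illustrative assumptions, not taken from the repository's scripts.

```python
import os


def read_benchmark_settings(environ=None, default_batch_size=1, default_num_batches=1):
    """Return (batch_size, num_batches) parsed from environment variables."""
    # Fall back to the real process environment when no mapping is supplied.
    environ = os.environ if environ is None else environ
    batch_size = int(environ.get("BATCH_SIZE", default_batch_size))
    num_batches = int(environ.get("NUM_BATCHES", default_num_batches))
    if batch_size < 1 or num_batches < 1:
        raise ValueError("BATCH_SIZE and NUM_BATCHES must be positive integers")
    return batch_size, num_batches
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to check without launching the whole workflow.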

Crowd-benchmarking

It is now possible to participate in crowd-benchmarking of TensorFlow (early prototype):

$ ck crowdbench tensorflow --user={your email or ID to acknowledge contributions} (--env.BATCH_SIZE=128 --env.NUM_BATCHES=100)

You can see continuously aggregated results in the public Collective Knowledge repository under 'crowd-benchmark TensorFlow library' scenario.

Note that this is an ongoing, heavily evolving and long-term project to enable collaborative and systematic benchmarking and tuning of realistic workloads across diverse hardware (ARM TechCon'16 talk and demo, DATE'16, CPC'15). We also plan to add crowd-benchmarking and crowd-tuning of Caffe, TensorFlow and other DNN frameworks to our Android application soon - please stay tuned!

Unified, multi-dimensional and multi-objective autotuning

It is now possible to take advantage of our universal multi-objective CK autotuner to optimize TensorFlow. As a first simple example, we added batch size tuning via CK. You can invoke it as follows:

$ ck autotune tensorflow

All results will be recorded in the local CK repository and you will be given command lines to plot graphs or replay experiments such as:

$ ck plot graph:{experiment UID}
$ ck replay experiment:{experiment UID} --point={specific optimization point}

Collaborative and unified DNN optimization

We are now working to extend the above autotuner and crowdsource optimization of the whole SW/HW/model/data set stack (paper 1, paper 2).

We would like to thank the community for their interest and feedback about this collaborative AI optimization approach powered by CK at ARM TechCon'16 and the Embedded Vision Summit'17 - so please stay tuned ;) !

Using other DNN via unified CK API

CK allows us to unify AI interfaces while collaboratively optimizing the underlying engines. For example, we added similar support to install, use and evaluate Caffe/Caffe2, CK-PyTorch and MXNet via CK:

$ ck pull repo:ck-caffe2
$ ck pull repo --url=https://github.com/dividiti/ck-caffe
$ ck pull repo:ck-mxnet

$ ck install package:lib-caffe-bvlc-master-cpu-universal --env.CAFFE_BUILD_PYTHON=ON
$ ck install package:lib-caffe2-master-eigen-cpu-universal --env.CAFFE_BUILD_PYTHON=ON
$ ck install package --tags=mxnet

$ ck run program:caffe --cmd_key=classify
$ ck run program:caffe2 --cmd_key=classify
$ ck run program:mxnet --cmd_key=classify

$ ck crowdbench caffe --env.BATCH_SIZE=5
$ ck crowdbench caffe2 --env.BATCH_SIZE=5 --user=i_want_to_ack_my_contribution

$ ck autotune caffe
$ ck autotune caffe2

Realistic/representative training sets

We provided an option in all our AI crowd-tuning tools to let the community report and share mispredictions (images, the correct label and the wrong prediction) to gradually and collaboratively build realistic data/training sets.

Online demo of a unified CK-AI API

Simple demo to classify images with continuous optimization of DNN engines underneath, sharing of mispredictions and creation of a community training set; and to predict compiler optimizations based on program features.

Open R&D challenges

We use crowd-benchmarking and crowd-tuning of such realistic workloads across diverse hardware for open academic and industrial R&D challenges - join this community effort!

Publications

CK publications

Troubleshooting

The SqueezeDet demo currently works well with Python 3.5 and package:squeezedetmodel-squeezedet, so install it first:

$ ck install package:squeezedetmodel-squeezedet
$ ck run program:squeezedet

Feedback

Get in touch with ck-tensorflow developers via CK mailing list: http://groups.google.com/group/collective-knowledge !

ck-tensorflow's Issues

building bazel failed on

Steps to reproduce

Ubuntu
ck pull repo:ck-tensorflow
ck install package:lib-tensorflow-cpu

Actual result
Got error

*** Installation path used: /home/daniil/CK-TOOLS/tool-bazel-master-linux-64

-----------------------------------
Resolving software dependencies ...

*** Dependency 1 = java-compiler (Java compiler):

    Resolved. CK environment UID = 4bfb0ae868ccbb6e (detected version 1.8.0_101)
-----------------------------------

Cloning from 'https://github.com/bazelbuild/bazel.git' ...
fatal: destination path 'src' already exists and is not an empty directory.

Compiling bazel ...

Building bazel ...
/home/daniil/CK/ck-tensorflow/package/tool-bazel/install.sh: line 38: cd: output: No such file or directory
Error: building bazel failed!
CK error: [package] package installation failed!

Expected result

Installation without errors

No need to upgrade SqueezeDet

During SqueezeDet installation, the TensorFlow upgrade script made the following changes (extracted from the generated report.txt):
1.

'/home/anton/CK_TOOLS/squeezeDet-master-linux-64/squeezeDet/src/nn_skeleton.py' Line 287
--------------------------------------------------------------------------------

Renamed keyword argument from 'reduction_indices' to 'axis'

    Old:               reduction_indices=[1]
                       ~~~~~~~~~~~~~~~~~~
    New:               axis=[1]
                       ~~~~~

This change is probably not strictly necessary as the documentation still lists reduction_indices (albeit as deprecated).

--------------------------------------------------------------------------------
Processing file '/home/anton/CK_TOOLS/squeezeDet-master-linux-64/squeezeDet/src/nets/squeezeDet.py'
 outputting to '/home/anton/CK_TOOLS/squeezeDet-master-linux-64/squeezeDet_updated/src/nets/squeezeDet.py'
--------------------------------------------------------------------------------

'/home/anton/CK_TOOLS/squeezeDet-master-linux-64/squeezeDet/src/nets/squeezeDet.py' Line 105
--------------------------------------------------------------------------------

Added keyword 'concat_dim' to reordered function 'tf.concat'
Added keyword 'values' to reordered function 'tf.concat'

    Old:     return tf.concat([ex1x1, ex3x3], 3, name=layer_name+'/concat')

    New:     return tf.concat(axis=[ex1x1, ex3x3], values=3, name=layer_name+'/concat')
                              ~~~~~                ~~~~~~~

This is the reason why I had to create this issue. The "Old" form is already correct, as this was upgraded in the SqueezeDet repository 3 weeks ago.

--------------------------------------------------------------------------------
Processing file '/home/anton/CK_TOOLS/squeezeDet-master-linux-64/squeezeDet/src/nets/squeezeDetPlus.py'
 outputting to '/home/anton/CK_TOOLS/squeezeDet-master-linux-64/squeezeDet_updated/src/nets/squeezeDetPlus.py'
--------------------------------------------------------------------------------

'/home/anton/CK_TOOLS/squeezeDet-master-linux-64/squeezeDet/src/nets/squeezeDetPlus.py' Line 105
--------------------------------------------------------------------------------

Added keyword 'concat_dim' to reordered function 'tf.concat'
Added keyword 'values' to reordered function 'tf.concat'

    Old:     return tf.concat([ex1x1, ex3x3], 3, name=layer_name+'/concat')

    New:     return tf.concat(axis=[ex1x1, ex3x3], values=3, name=layer_name+'/concat')
                              ~~~~~                ~~~~~~~

Similarly to change 2, this was upgraded three weeks ago.

Therefore, I suggest that we abstain from running the upgrade script, unless (until?) change 1 bites.
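
Since TensorFlow 1.0, tf.concat takes the list of tensors first and the axis second, so a call such as the "Old" lines in changes 2 and 3 is already in the new form. The following is a minimal sketch of a pre-check that could be run before applying the upgrade script. It uses only Python's standard `ast` module; the function name and the returned labels are hypothetical, and this is not part of the upgrade script itself.

```python
import ast


def concat_argument_order(call_source):
    """Classify one tf.concat(...) call as 'values_first', 'axis_first' or 'unknown'."""
    node = ast.parse(call_source.strip(), mode="eval").body
    if not isinstance(node, ast.Call) or len(node.args) < 2:
        return "unknown"
    first, second = node.args[0], node.args[1]
    sequence = (ast.List, ast.Tuple)
    # New convention: a list/tuple of tensors comes first, the axis second.
    if isinstance(first, sequence) and not isinstance(second, sequence):
        return "values_first"
    # Old convention: the axis comes first, the list/tuple of tensors second.
    if isinstance(second, sequence) and not isinstance(first, sequence):
        return "axis_first"
    return "unknown"
```

A call classified as 'values_first' does not need its positional arguments reordered. The check only recognizes list or tuple literals, so calls that pass the tensors through a variable come back as 'unknown' and need a manual look.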

To prevent this from happening, we should inform @BichenWuUCB ASAP. (Have I just done that?)

Tensorflow 1.10.1 issue with python2

Tensorflow 1.10.1 seems to have a conflict with preexisting python2 packages in CK framework. @psyhtest

$ ck virtual env --tags=lib,tensorflow
$ python -c "import numpy"

failed with error message

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zzdq0r/CK-TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.5.2-linux-64/lib/numpy/__init__.py", line 142, in <module>
    from . import add_newdocs
  File "/home/zzdq0r/CK-TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.5.2-linux-64/lib/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/home/zzdq0r/CK-TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.5.2-linux-64/lib/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/home/zzdq0r/CK-TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.5.2-linux-64/lib/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/home/zzdq0r/CK-TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.5.2-linux-64/lib/numpy/core/__init__.py", line 26, in <module>
    raise ImportError(msg)
ImportError: 
Importing the multiarray numpy extension module failed.  Most
likely you are trying to import a failed build of numpy.
If you're working with a numpy git repo, try `git clean -xdf` (removes all
files not under version control).  Otherwise reinstall numpy.

Original error was: cannot import name multiarray

The trace seems to suggest that the Python 2.7 interpreter is picking up the prebuilt TensorFlow 1.10.1 package that was installed for Python 3.5.

In a clean installation, the TensorFlow detection API does not have any issues.
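
The installation paths in the trace embed the Python version the package was built for (`compiler.python-3.5.2`). Here is a minimal sketch of a sanity check that compares that version with the running interpreter. It assumes the directory naming seen in the trace above; both function names are hypothetical.

```python
import re
import sys


def python_version_in_path(path):
    """Return (major, minor) from a 'compiler.python-X.Y.Z' path component, or None."""
    match = re.search(r"compiler\.python-(\d+)\.(\d+)", path)
    return (int(match.group(1)), int(match.group(2))) if match else None


def built_for_this_interpreter(path, interpreter=None):
    """True if the path names no Python version, or names the given/running one."""
    interpreter = tuple(sys.version_info[:2]) if interpreter is None else tuple(interpreter)
    built_for = python_version_in_path(path)
    return built_for is None or built_for == interpreter
```

Running such a check over the PYTHONPATH entries before importing numpy or tensorflow would surface the mismatch directly instead of through the multiarray import error.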

Patch TensorFlow_CC 1.8.0 similar to TensorFlow 1.8.0 for aarch64 platforms

We have a patch to fix an issue with building TensorFlow 1.8.0 on aarch64 platforms. The same should be applied to TensorFlow_CC 1.8.0.

However, simply placing this patch into package/lib-tensorflow_cc-shared-1.8.0/patch.linux doesn't work (which is why I have now removed it).

The existing patches are for TensorFlow_CC itself. However, the new patch should be applied to the TensorFlow sources that get checked out after the initial patches are applied to TensorFlow_CC.

NB: While the original issue was fixed in 1.9.0, another one was introduced. So we need to think of a generic mechanism for TensorFlow_CC patches.

Compiling program:image-classification-tflite for Android with Clang fails

I have a version of TFLite v1.13.1 built for Android:

$ ck install package --tags=lib,tflite,v1.13.1 --target_os=android24-arm64
...
$ ck show env --tags=lib,tflite,target-os-android24-arm64,compiled-by-llvm-android-ndk-3.8.256229
Env UID:         Target OS:      Bits: Name:                                      Version: Tags:

8caa447331219f22 android24-arm64    64 TensorFlow Lite API (from sources, static) 1.13.1   64bits,channel-stable,compiled-by-llvm-android-ndk,compiled-by-llvm-android-ndk-3.8.256229,host-os-linux-64,lib,lite,target-os-android24-arm64,tensorflow,tensorflow-lite,tensorflow-static,tflite,v1,v1.13,v1.13.1,vsrc,vstatic

As indicated by the tags, it's compiled with Clang 3.8 in Android NDK (r13b):

$ ck show env --tags=compiler,lang-cpp,llvm,target-os-android24-arm64,v3.8.256229
Env UID:         Target OS:      Bits: Name:                     Version:   Tags:

9d67d461bb5e069d android24-arm64    64 Android NDK LLVM compiler 3.8.256229 64bits,android,compiler,host-os-linux-64,lang-c,lang-cpp,llvm,ndk,target-os-android24-arm64,v3,v3.8,v3.8.256229

Cross-compiling program:image-classification-tflite for Android fails:

$ ck compile program:image-classification-tflite --target_os=android24-arm64 --deps.compiler=9d67d461bb5e069d
...
clang++ -c  -fPIE -pie -target aarch64-linux-android -gcc-toolchain /home/anton/data/android-ndk-r13b/toolchains/aarch64-linux-android-4.9/prebuilt/linux-x86_64 --sysroot=/home/anton/data/android-ndk-r13b
/platforms/android-24/arch-arm64   -I../ -DCK_HOST_OS_NAME2_LINUX=1 -DCK_HOST_OS_NAME_LINUX=1 -DCK_TARGET_OS_NAME2_ANDROID=1 -DCK_TARGET_OS_NAME_LINUX=1 -std=c++11 -DTF_LITE_1_13 -Wall -Wno-sign-compare -
I/home/anton/data/android-ndk-r13b/toolchains/llvm/prebuilt/linux-x86_64/include -I/home/anton/CK_TOOLS/lib-rtl-xopenme-0.3-android-ndk-4.9.x-android24-arm64/include -I/home/anton/data/android-ndk-r13b/so
urces/cxx-stl/gnu-libstdc++/include -I/home/anton/data/android-ndk-r13b/sources/cxx-stl/gnu-libstdc++/libs/arm64-v8a/include -I/home/anton/CK_TOOLS/lib-tflite-src-static-1.13.1-llvm-android-ndk-3.8.256229
-android24-arm64/src -I/home/anton/CK_TOOLS/lib-tflite-src-static-1.13.1-llvm-android-ndk-3.8.256229-android24-arm64/src/tensorflow/lite/tools/make/downloads/flatbuffers/include  ../classification.cpp  -o
 classification.o
clang++: warning: argument unused during compilation: '-pie'
In file included from ../classification.cpp:9:
../benchmark.h:14:10: fatal error: 'chrono' file not found
#include <chrono>
         ^
1 error generated.

Inspecting the environment reveals what the problem is:

$ ck cat env --tags=compiler,lang-cpp,llvm,target-os-android24-arm64,v3.8.256229 | grep INCLUDE
export CK_ENV_COMPILER_LLVM_INCLUDE=/home/anton/data/android-ndk-r13b/toolchains/llvm/prebuilt/linux-x86_64/include
export CK_ENV_LIB_STDCPP_INCLUDE=/home/anton/data/android-ndk-r13b/sources/cxx-stl/gnu-libstdc++/include
export CK_ENV_LIB_STDCPP_INCLUDE_EXTRA=/home/anton/data/android-ndk-r13b/sources/cxx-stl/gnu-libstdc++/libs/arm64-v8a/include
export CK_FLAG_PREFIX_INCLUDE=-I

The include paths should be prefixed by /home/anton/data/android-ndk-r13b/sources/cxx-stl/gnu-libstdc++/4.9. It looks like the ver variable here is now empty after some changes to support NDK >= 17:

       ndk_path=p5
       ver=ndk_gcc.get('ver', '')[:-2]
       abi=target_d.get('abi','')

       env[ep]=pi
       env[ep+'_BIN']=p1

       if ndk_iver>=17:
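
To illustrate the suspected failure mode: slicing with `[:-2]` turns a version string such as "4.9.x" into "4.9", but leaves an empty string empty, which would drop the version directory from the include paths. The sketch below is a guarded path construction under that assumption. The function name and the "4.9.x" version format are assumptions, not taken from the CK sources.

```python
import posixpath


def libstdcpp_include_dirs(ndk_path, gcc_version, abi):
    """Build the two gnu-libstdc++ include directories for a given NDK GCC version."""
    ver = gcc_version[:-2]  # "4.9.x" -> "4.9"; an empty string stays empty
    if not ver:
        # Fail loudly instead of silently producing paths without the version part.
        raise ValueError("GCC version is empty; cannot locate gnu-libstdc++ headers")
    base = posixpath.join(ndk_path, "sources/cxx-stl/gnu-libstdc++", ver)
    return [
        posixpath.join(base, "include"),
        posixpath.join(base, "libs", abi, "include"),
    ]
```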

No response while installing bazel on Firefly RK3399

Hi, I am trying to install TensorFlow 1.7.0 on a Firefly RK3399 via CK. After building CK, I ran the command 'ck install ck-env:package:tool-bazel-0.11.1-linux'. The terminal downloads and fetches bazel, but during compilation I waited almost 2 hours and there was no response. How can I solve the problem?
I tested on the Firefly-RK3399-ubuntu16.04-20180416112819 platform.
Thanks!

There is no object-detection for TFLite

Processing an image with a *.tflite model yields only partially processed raw data, as mentioned here: tensorflow/tensorflow#15633 (comment). The data consists of 91*1917 (scores) + 4*1917 (raw coordinates) float numbers (class_count*box_count + coordinate_points*box_count). The scores aren't normalized, but almost every box contains a high probability for some class (after normalizing over the "box" row). The "raw" coordinates look inconsistent ([-0.03030161 1.4208953 -0.01799502 1.3256061 ]) after processing by any of the scripts from https://github.com/tensorflow/models/research/object_detection/box_coders, as mentioned in tf-coreml/tf-coreml#107 (comment).

image-classification-tf-py imports the natively installed TensorFlow module instead of the one provided by the package

See image-classification-tf-py program, classify.py:

import tensorflow as tf

It is expected that TensorFlow from the selected TF package will be used, because PYTHONPATH is set in the package's env, e.g.:

export PYTHONPATH=/home/nchunosov/CK-TOOLS/lib-tensorflow-sycl-opencl-master-compiler.python-3.5.2-lib.opencl-linux-64/lib:/home/nchunosov/CK-TOOLS/lib-tensorflow-sycl-opencl-master-compiler.python-3.5.2-lib.opencl-linux-64/lib/external/protobuf_archive/python:${PYTHONPATH}

BUT, as was revealed in #77, if a natively installed TensorFlow library is present, its module is used instead.

We have to force Python to use the package's module.
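
One possible way to do this is to put the package's lib directory at the very front of `sys.path` before the import and then verify where the module was actually loaded from. The function name below is hypothetical, and this is a sketch rather than the fix adopted in the repository.

```python
import importlib
import os
import sys


def import_from(path, module_name):
    """Import module_name, giving `path` priority over every other sys.path entry."""
    path = os.path.abspath(path)
    if path in sys.path:
        sys.path.remove(path)
    sys.path.insert(0, path)
    # Drop a copy that may already have been imported from elsewhere.
    sys.modules.pop(module_name, None)
    importlib.invalidate_caches()
    module = importlib.import_module(module_name)
    origin = os.path.abspath(getattr(module, "__file__", "") or "")
    if not origin.startswith(path + os.sep):
        raise ImportError("%s resolved to %s, not under %s" % (module_name, origin, path))
    return module
```

In classify.py this would replace the bare `import tensorflow as tf` with something like `tf = import_from(package_lib_dir, "tensorflow")`, where `package_lib_dir` is a placeholder for the lib path that the package's env puts on PYTHONPATH.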

Installation of package lib-tensorflow-cpu fails

Steps to reproduce

$ ck pull repo:ck-tensorflow - ok
$ ck install package:lib-tensorflow-cpu
fails with errors:

Cloning from 'https://github.com/bazelbuild/bazel.git' ...
fatal: destination path 'src' already exists and is not an empty directory.

Compiling bazel ...

Building bazel ...
/home/daniil/CK/ck-tensorflow/package/tool-bazel/install.sh: line 38: cd: output: No such file or directory
Error: building bazel failed!
CK error: [package] package installation failed!

TF-Lite GPU benchmark results?

Are any TF-Lite GPU benchmark results for mobile phones available?

Tensorflow Lite and/or cross compilation

I am new to CK, and have been using Tensorflow Lite on an embedded ARM-based device. I have two questions:

Does this package also build Tensorflow Lite, or just the whole of Tensorflow?
Does it also support cross-compilation? My device is quite resource limited and slow, and I would prefer to avoid a native build on it, as I do now.

Thanks a lot.

tf lite armeabi version is much slower than the nightly aar

At first I was using the 'org.tensorflow:tensorflow-lite:0.0.0-nightly' aar in my own project, where my TF Lite model takes 900 ms. In my company's project, however, the NDK abiFilters setting is armeabi. I changed abiFilters from armeabi-v7a to armeabi, but found that the aar does not support armeabi.

I compiled the armeabi version with the following command and imported the resulting so and jar files, but now my model takes 4000 ms. Is there something wrong with the command?
bazel build --cxxopt='--std=c++11' //tensorflow/lite/java:tensorflowlite \
--crosstool_top=//external:android/crosstool \
--host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
--cpu=armeabi

SqueezeDet/use_continuous fails with "cannot import name _draw_box"

When using TensorFlow (installed from the 1.0.1 or 1.1.0 prebuilt packages):

$ ck run program:squeezedet --cmd_key=use_continuous

fails with the following message printed to stderr (tmp/tmp-output2.tmp):

Traceback (most recent call last):
  File "../continuous.py", line 24, in <module>
  from train import _draw_box
ImportError: cannot import name _draw_box

At the same time:

$ ck run program:squeezedet --cmd_key=default

works just fine.

This is on an x86_64 development board with CentOS 7 (Linux kernel 3.10).

Rename ck-tensorflow:classification-* programs

classification-tensorflow -> benchmark-tensorflow-python
classification-tensorflow-cpp -> benchmark-tensorflow-cpp
classification-tflite-cpp -> benchmark-tflite-cpp

This is because we already have a set of tensorflow-classification* programs with similar names but a different purpose (they use a single image only and target the mobile demo).

However, this should be done after #66 is merged, to avoid conflicts.

benchmark-googlenet fails

Running program:tensorflow and selecting benchmark-googlenet results in:

executing code ...
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
    run_benchmark()
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-googlenet/benchmark-googlenet.py", line 224, in run_benchmark
    last_layer = inference(images)
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-googlenet/benchmark-googlenet.py", line 154, in inference
    incept3a = _inception(pool3,    192, 64, 96, 128, 16, 32, 3, 32)
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-googlenet/benchmark-googlenet.py", line 131, in _inception
    incept = tf.concat(channel_dim, [conv1, conv3, conv5, pool])
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/ops/array_ops.py", line 1047, in concat
    dtype=dtypes.int32).get_shape(
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/framework/ops.py", line 651, in convert_to_tensor
    as_ref=False)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/framework/ops.py", line 716, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/framework/constant_op.py", line 165, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

Tested with 3 freshly installed versions:

$ ck show env --tags=tensorflow
Env UID:         Target OS: Bits: Name:                               Version:       Tags:

4b0c4fb7a608c8cf   linux-64    64 TensorFlow library (prebuilt, cuda) 1.0.0          64bits,cuda,host-os-linux-64,lib,prebuilt,target-os-linux-64,tensorflow,tensorflow-cuda,v1,v1.0,v1.0.0
6dca1226a214393b   linux-64    64 TensorFlow library (prebuilt, cpu)  1.0.0          64bits,cpu,host-os-linux-64,lib,prebuilt,target-os-linux-64,tensorflow,tensorflow-cpu,v1,v1.0,v1.0.0
bdcf77dc8e5843fb   linux-64    64 TensorFlow library (cpu)            master-2e8cf80 64bits,host-os-linux-64,lib,target-os-linux-64,tensorflow,tensorflow-cpu,v0,v0.0

Full log: googlenet.txt
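
The TypeError is consistent with the argument order of tf.concat: the benchmark calls tf.concat(channel_dim, [...]) with the axis first, whereas TensorFlow 1.0 and later expect the list of tensors first and the axis second. The direct fix would be to swap the two arguments in benchmark-googlenet.py. As an illustration only, the hypothetical helper below normalizes either calling convention to (values, axis) in plain Python.

```python
def normalize_concat_args(first, second):
    """Return (values, axis) regardless of which calling convention was used."""
    axis_first = (
        isinstance(first, int)
        and not isinstance(first, bool)
        and isinstance(second, (list, tuple))
    )
    if axis_first:
        # Old convention: concat(axis, values) -> swap into the new order.
        return second, first
    return first, second
```

With such a helper, a script could call `tf.concat(*normalize_concat_args(channel_dim, tensors))` and run against both conventions, provided the axis is a plain Python integer.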

Create Docker image with stable CK for CK+TF+MLPerf

Let's create a Docker image with stable CK repositories for TensorFlow and our reference MLPerf workflows. It shouldn't be very difficult, and it would allow the community to use:
a) the latest CK workflows for the reference MLPerf implementation (which may sometimes fail in the latest environment);
b) a stable implementation which should always work but may not use the latest frameworks and environments.

Building package:lib-tensorflow-1.11.0-src-cuda-xla fails.

Inside "ctuning/ck-ubuntu-18.04" docker container "ck install package:lib-tensorflow-1.11.0-src-cuda-xla" fails with:

err.txt

Trying again gives different errors (due to race conditions?)

err2.txt

OpenCL version of CK-TensorFlow failed

I added OpenCL wrapper for TensorFlow to CK and it compiles:

$ ck pull repo:ck-tensorflow
$ ck install package:lib-tensorflow-opencl

(Note that you need Codeplay's ComputeCpp.)
But then the basic benchmark doesn't start:
$ ck run program:tensorflow

Traceback (most recent call last):
  File "../alexnet_benchmark.py", line 43, in <module>
    import tensorflow as tf
  File "/home/fursin/CK-TOOLS/tensorflow-opencl-master-lib.opencl-8.0-linux-64/lib/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/home/fursin/CK-TOOLS/tensorflow-opencl-master-lib.opencl-8.0-linux-64/lib/tensorflow/python/__init__.py", line 60, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/home/fursin/CK-TOOLS/tensorflow-opencl-master-lib.opencl-8.0-linux-64/lib/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/home/fursin/CK-TOOLS/tensorflow-opencl-master-lib.opencl-8.0-linux-64/lib/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
    _pywrap_tensorflow = swig_import_helper()
  File "/home/fursin/CK-TOOLS/tensorflow-opencl-master-lib.opencl-8.0-linux-64/lib/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
ImportError: /home/fursin/CK-TOOLS/tensorflow-opencl-master-lib.opencl-8.0-linux-64/lib/tensorflow/python/_pywrap_tensorflow.so: undefined symbol: _ZN2cl4sycl7program30create_program_for_kernel_implENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPKhiPPKcSt10shared_ptrINS0_6detail7contextEE

Any help is appreciated.

Create a TensorFlow 1.7 package with TensorRT support

TensorFlow 1.7 with CUDA optionally supports TensorRT, so it asks the following during installation:

$ ck install package:lib-tensorflow-1.7.0-src-cuda-xla
...
Do you wish to build TensorFlow with Apache Kafka Platform support? [y/N]: 
No Apache Kafka Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: 
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with TensorRT support? [y/N]: y
TensorRT support will be enabled for TensorFlow.

Please specify the location where TensorRT is installed. [Default is /usr/lib/x86_64-linux-gnu]:/usr/local/TensorRT-3.0.4

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:

We could probably automate this with a special CK package that forces the y answer and supplies the path to TensorRT.
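
One way such a package could answer the prompts non-interactively is through environment variables, since TensorFlow's configure script generally skips a question when the matching variable is already set. The sketch below is a minimal version of that idea. The variable names TF_NEED_TENSORRT and TENSORRT_INSTALL_PATH are assumptions that should be checked against configure.py in the 1.7.0 sources, and the function name is hypothetical.

```python
import os


def tensorrt_configure_env(tensorrt_path, base_env=None):
    """Return a copy of the environment with the TensorRT answers pre-set."""
    env = dict(os.environ if base_env is None else base_env)
    env["TF_NEED_TENSORRT"] = "1"                 # assumed name: answers "y"
    env["TENSORRT_INSTALL_PATH"] = tensorrt_path  # assumed name: install location
    return env
```

The resulting dictionary could be passed as `env=` to `subprocess.run` when the package's install script launches the configure step.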

CK-TF support on Tegra TX1 (GPU)

CK-TF should support installation on Tegra TX1 (GPU), in addition to #6. Unfortunately, people have had issues with installing TF-CUDA on TX1 (e.g. see this internal driver error). Somebody from NVIDIA, however, has reassured the crowd that these issues should be resolved in the next public release. Fingers crossed, by the next release he means JetPack 3.0 (to become available on 14 March 2017). We should give it a try when it's out.

Building package:lib-tensorflow-1.x.y-src-static for Android fails for x > 10

Late in the build process, an archiving command fails due to an extra long argument list for x=13:

$ ck install package:lib-tensorflow-1.13.1-src-static --target_os=android24-arm64
...
make: execvp: /bin/bash: Argument list too long
make: *** [/home/anton/CK_TOOLS/lib-tensorflow-src-static-1.13.1-android-ndk-4.9.x-android24-arm64/src/tensorflow/contrib/makefile/gen/lib/android_arm64-v8a/libtensorflow-core.a] Error 127
cp: cannot stat '/home/anton/CK_TOOLS/lib-tensorflow-src-static-1.13.1-android-ndk-4.9.x-android24-arm64/src/tensorflow/contrib/makefile/gen/lib/android_arm64-v8a/libtensorflow-core.a': No such file or directory

Same for x=11:

$ ck install package:lib-tensorflow-1.11.0-src-static --target_os=android24-arm64

although x=10 is fine:

$ ck install package:lib-tensorflow-1.10.1-src-static --target_os=android24-arm64
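
"Argument list too long" is the error the kernel returns when the combined size of a command's arguments and environment exceeds its limit, which is plausible if the number of object files passed to the archiver grew between 1.10 and 1.11. A common workaround is to split the file list into batches and run the command once per batch. The sketch below shows only the batching step. The function name and the byte budget are hypothetical, and this is not what the TensorFlow makefile does.

```python
def batch_by_size(args, limit):
    """Split args into consecutive batches whose encoded size stays within limit bytes."""
    batches, current, size = [], [], 0
    for arg in args:
        cost = len(arg.encode("utf-8")) + 1  # +1 for the terminating NUL byte
        if current and size + cost > limit:
            # Adding this argument would exceed the budget: start a new batch.
            batches.append(current)
            current, size = [], 0
        current.append(arg)
        size += cost
    if current:
        batches.append(current)
    return batches
```

Each batch could then be handed to a separate archiver invocation that appends to the same library, keeping every individual command line well below the system limit.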

TensorFlow dependencies

I installed TensorFlow 1.10.1 on a clean Ubuntu 18.04 system:

$ ck install package:lib-tensorflow-1.10.1-cpu

On the first run of program:image-classification-tf-py, however, SciPy got detected and corrupted the same environment entry:

$ ck show env --tags=tensorflow
Env UID:         Target OS: Bits: Name:                                Version: Tags:

5eaeb467192af25c   linux-64    64 SciPy Python library (prebuilt, cpu) 1.10.1   64bits,channel-stable,host-os-linux-64,lib,needs-python,needs-python-3.6.5,python-package,scipy,target-os-linux-64,tensorflow,tensorflow-cpu,v1,v1.10,v1.10.1,vcpu,vprebuilt

(see the tags containing vcpu,vprebuilt on the one hand and scipy on the other.)

What's weird is that even after removing this env entry:

$ ck rm env:5eaeb467192af25c
Are you sure to delete CK entry "SciPy Python library (prebuilt, cpu)"
    env:5eaeb467192af25c (9b9b3208ac44b891:5eaeb467192af25c) ? (y/N): y
   Entry "SciPy Python library (prebuilt, cpu)"
    env:5eaeb467192af25c (9b9b3208ac44b891:5eaeb467192af25c) was successfully deleted!

CK still detects it and assigns the same env entry:

$ ck detect soft:lib.python.scipy
...
  Detecting and sorting versions (ignore some work output) ...

    * /home/anton/CK_TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.6.5-linux-64/lib/scipy/__init__.py   (Version 1.10.1)

  Found pre-recorded CK installation info ...

  Registering in the CK (/home/anton/CK_TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.6.5-linux-64/lib/scipy/__init__.py) ...

  Software entry found: lib.python.scipy (4460bdb0ade2a3df)

Moreover, detecting NumPy overwrites it:

$ ck detect soft:lib.python.numpy
...
  Detecting and sorting versions (ignore some work output) ...

    * /home/anton/CK_TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.6.5-linux-64/lib/numpy/__init__.py   (Version 1.10.1)

  Found pre-recorded CK installation info ...

  Registering in the CK (/home/anton/CK_TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.6.5-linux-64/lib/numpy/__init__.py) ...

  Software entry found: lib.python.numpy (6a10047b1bcd16fc)

Environment entry updated (5eaeb467192af25c)!
  Successfully registered with UID: 5eaeb467192af25c

It seems that this program needs all three dependencies resolved (TensorFlow, NumPy and SciPy), but only one of them can be registered at a time!

This probably happens because of ck-install.json containing:

  "env_data_uoa": "5eaeb467192af25c",

and all three dependencies being located under lib/.
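To check this hypothesis, here is a minimal sketch, assuming that CK finds the "pre-recorded CK installation info" by looking for a ck-install.json in a parent directory of the detected file (that lookup rule is my assumption, not confirmed CK behaviour). The helper name recorded_env_uid is hypothetical. If the three __init__.py paths above all return the same UID, that would explain why each detection updates the same env entry.

```python
import json
import os


def recorded_env_uid(detected_file):
    """Walk up from a detected file to the nearest ck-install.json.

    Returns the value of "env_data_uoa" from that file, or None if no
    ck-install.json is found up to the filesystem root.
    """
    current = os.path.dirname(os.path.abspath(detected_file))
    while True:
        candidate = os.path.join(current, "ck-install.json")
        if os.path.isfile(candidate):
            with open(candidate) as f:
                return json.load(f).get("env_data_uoa")
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent
```

Calling it on lib/tensorflow/__init__.py, lib/numpy/__init__.py and lib/scipy/__init__.py of the same installation should show whether they share one recorded UID.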

Clean extra files for package:tensorflowmodel-mobilenet-v1-0.50-*-2018_02_22-py

I noticed that all the new (2018_02_22) 0.50 packages (package:tensorflowmodel-mobilenet-v1-0.50-*-2018_02_22-py) had problems downloading the weights:

$ ck install package --tags=mobilenet-v1-all,tensorflowmodel,2018_02_22
$ cd $CK_TOOLS && du -hsc tensorflowmodel-mobilenet-v1-*-2018_02_22-py
11M     tensorflowmodel-mobilenet-v1-0.25-128-2018_02_22-py
11M     tensorflowmodel-mobilenet-v1-0.25-160-2018_02_22-py
11M     tensorflowmodel-mobilenet-v1-0.25-192-2018_02_22-py
11M     tensorflowmodel-mobilenet-v1-0.25-224-2018_02_22-py
36K     tensorflowmodel-mobilenet-v1-0.50-128-2018_02_22-py
36K     tensorflowmodel-mobilenet-v1-0.50-160-2018_02_22-py
36K     tensorflowmodel-mobilenet-v1-0.50-192-2018_02_22-py
36K     tensorflowmodel-mobilenet-v1-0.50-224-2018_02_22-py
43M     tensorflowmodel-mobilenet-v1-0.75-128-2018_02_22-py
43M     tensorflowmodel-mobilenet-v1-0.75-160-2018_02_22-py
43M     tensorflowmodel-mobilenet-v1-0.75-192-2018_02_22-py
43M     tensorflowmodel-mobilenet-v1-0.75-224-2018_02_22-py
69M     tensorflowmodel-mobilenet-v1-1.0-128-2018_02_22-py
69M     tensorflowmodel-mobilenet-v1-1.0-160-2018_02_22-py
69M     tensorflowmodel-mobilenet-v1-1.0-192-2018_02_22-py
69M     tensorflowmodel-mobilenet-v1-1.0-224-2018_02_22-py
486M    total

I manually corrected the 0.50 packages along the following lines:

       "MODULE_FILE": "mobilenet-model.py",
-      "PACKAGE_NAME": "mobilenet_v1_0.50_128.tgz",
+      "PACKAGE_NAME": "mobilenet_v1_0.5_128.tgz",
       "PACKAGE_URL": "http://download.tensorflow.org/models/mobilenet_v1_2018_02_22",
-      "WEIGHTS_FILE": "mobilenet_v1_0.50_128.ckpt"
+      "WEIGHTS_FILE": "mobilenet_v1_0.5_128.ckpt"
     },

Now the weights are downloaded properly, but, compared to the non-0.50 packages, the 0.50 packages contain extra files:

$ ls -la ~/CK_TOOLS/tensorflowmodel-mobilenet-v1-0.50-224-2018_02_22-py/
total 35280
drwxr-x---  2 anton anton     4096 May  2 12:21 .
drwxrwxr-x 64 anton anton     4096 May  2 12:23 ..
-rw-rw-r--  1 anton anton     1896 May  2 12:21 ck-install.json
-rw-rw-r--  1 anton anton     4810 May  2 12:21 mobilenet-model.py
-rw-rw-r--  1 anton anton    20309 May  2 12:21 mobilenet_v1.py
-rw-r-----  1 anton anton 21401248 Feb 23 00:51 mobilenet_v1_0.5_224.ckpt.data-00000-of-00001
-rw-r-----  1 anton anton    19775 Feb 23 00:51 mobilenet_v1_0.5_224.ckpt.index
-rw-r-----  1 anton anton  3363922 Feb 23 00:51 mobilenet_v1_0.5_224.ckpt.meta
-rw-r-----  1 anton anton  5319064 Feb 23 01:15 mobilenet_v1_0.5_224.tflite
-rw-r-----  1 anton anton   532344 Feb 23 01:15 mobilenet_v1_0.5_224_eval.pbtxt
-rw-r-----  1 anton anton  5437736 Feb 23 01:15 mobilenet_v1_0.5_224_frozen.pb
-rw-r-----  1 anton anton       83 Feb 23 01:15 mobilenet_v1_0.5_224_info.txt
$ ls -la ~/CK_TOOLS/tensorflowmodel-mobilenet-v1-0.75-224-2018_02_22-py/              
total 43896
drwxr-x---  2 anton anton     4096 May  2 12:22 .
drwxrwxr-x 64 anton anton     4096 May  2 12:23 ..
-rw-rw-r--  1 anton anton     1898 May  2 12:22 ck-install.json
-rw-rw-r--  1 anton anton     4810 May  2 12:22 mobilenet-model.py
-rw-rw-r--  1 anton anton    20309 May  2 12:22 mobilenet_v1.py
-rw-r-----  1 anton anton 41512608 Feb 23 00:51 mobilenet_v1_0.75_224.ckpt.data-00000-of-00001
-rw-r-----  1 anton anton    19824 Feb 23 00:51 mobilenet_v1_0.75_224.ckpt.index
-rw-r-----  1 anton anton  3363922 Feb 23 00:51 mobilenet_v1_0.75_224.ckpt.meta
-rw-r-----  1 anton anton       84 Feb 23 01:16 mobilenet_v1_0.75_224_info.txt

We should correct two things:

In the script for creating and updating MobileNets packages, check that the file names for the new 0.50 packages contain 0.5, not 0.50 (e.g. mobilenet_v1_0.5_128.ckpt, not mobilenet_v1_0.50_128.ckpt).
In the MobileNets installation script, check that the extra 0.5 files also get deleted.
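The two checks could be sketched roughly as follows. This is not the actual package-generation or installation script; the function names are hypothetical, and the list of "extra" suffixes is simply read off the two directory listings above (the files present for 0.5 but absent for 0.75).

```python
def upstream_multiplier(multiplier):
    """Spell the width multiplier the way the upstream file names do."""
    # Assumption: only "0.50" differs from the package naming;
    # any other spelling is passed through unchanged.
    return "0.5" if multiplier == "0.50" else multiplier


def upstream_names(multiplier, resolution):
    """Return PACKAGE_NAME and WEIGHTS_FILE for one MobileNet-v1 variant."""
    stem = "mobilenet_v1_{}_{}".format(upstream_multiplier(multiplier), resolution)
    return {"PACKAGE_NAME": stem + ".tgz", "WEIGHTS_FILE": stem + ".ckpt"}


# Suffixes of the files that appear only in the 0.5 listing above.
EXTRA_SUFFIXES = (".tflite", "_eval.pbtxt", "_frozen.pb")


def extra_files(file_names):
    """Select the files an installation script might want to delete."""
    return sorted(name for name in file_names if name.endswith(EXTRA_SUFFIXES))
```

For example, upstream_names("0.50", "128") yields the corrected values from the diff above, and extra_files() applied to the 0.50-224 listing selects the .tflite, _eval.pbtxt and _frozen.pb files.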

Improving TF installation with versions < 1.4.0 without sudo

We unified the installation of TF for versions >= 1.4.0: it no longer requires sudo and installs most Python dependencies into local user space (CK virtual environments). However, the CK installer for TF < 1.4.0 is still the old one. We should try to update it, since there is still legacy code that may require such packages!

CC @ens-lg4 @psyhtest @gpekhimenko @SerailHydra

Build TF with CPU vector support

When running TF benchmarks on an Intel Xeon, warnings appear:

W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.

There's probably nothing we can do when using prebuilt TF libraries, but when we build TF from sources we should enable the vector instructions that the target CPU supports.
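As a rough illustration for native Linux x86 builds only (not cross-compilation), the compiler options could be derived from the flags line of /proc/cpuinfo. The function names are hypothetical. The table covers exactly the six instruction sets named in the warnings, and the options are the usual GCC/Clang ones, which Bazel accepts via --copt.

```python
# (name in the TF warning, flag in /proc/cpuinfo, GCC/Clang option)
FEATURES = [
    ("SSE3", "pni", "-msse3"),  # Linux reports SSE3 as "pni"
    ("SSE4.1", "sse4_1", "-msse4.1"),
    ("SSE4.2", "sse4_2", "-msse4.2"),
    ("AVX", "avx", "-mavx"),
    ("AVX2", "avx2", "-mavx2"),
    ("FMA", "fma", "-mfma"),
]


def copts_for_flags(cpu_flags):
    """Return compiler options for the features present in a flags string."""
    flags = set(cpu_flags.split())
    return [opt for _name, flag, opt in FEATURES if flag in flags]


def host_cpu_flags(path="/proc/cpuinfo"):
    """Read the first 'flags' line of /proc/cpuinfo (Linux x86 only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return line.split(":", 1)[1].strip()
    return ""
```

When the build machine is also the target, passing -march=native to the compiler is a simpler alternative; the explicit table is mainly useful when the options have to be recorded in the CK environment.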

Problem compiling TensorFlow with Clang via CK

I tried to compile TensorFlow with the detected Clang 3.8 on Ubuntu 16.04 with an Intel processor, but it failed at the linking stage when linking protobuf. GCC works fine. If anyone has time to reproduce and fix this, I would appreciate it!

Problem compiling TensorFlow for Android using Clang via CK

I didn't manage to compile TensorFlow for Android using Clang (from the Android NDK): Clang is detected, but the full path that the TensorFlow make then uses is wrong. I need to check it later; maybe there is some obvious mistake in registering Clang in the CK env ...

Set up path to bundled protobuf

Since TensorFlow depends on protobuf, it usually bundles it under lib/external/protobuf_archive/python. However, CK-TensorFlow sets PYTHONPATH in env.sh as follows:

PYTHONPATH=/home/anton/CK_TOOLS/lib-tensorflow-src-cpu-xla-1.7-linux-64/lib:${PYTHONPATH}

so the bundled protobuf is not visible.

This means that if protobuf is not installed system-wide (e.g. via pip), the following error occurs on importing TensorFlow:

$ ck virtual env --tags=lib,tensorflow,vcpu,v1.7

Warning: you are in a new shell with a pre-set CK environment. Enter "exit" to return to the original one!
$ python -c "import tensorflow as tf; print(tf.__version__)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/anton/CK_TOOLS/lib-tensorflow-src-cpu-xla-1.7-linux-64/lib/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *  # pylint: disable=redefined-builtin
  File "/home/anton/CK_TOOLS/lib-tensorflow-src-cpu-xla-1.7-linux-64/lib/tensorflow/python/__init__.py", line 52, in <module>
    from tensorflow.core.framework.graph_pb2 import *
  File "/home/anton/CK_TOOLS/lib-tensorflow-src-cpu-xla-1.7-linux-64/lib/tensorflow/core/framework/graph_pb2.py", line 6, in <module>
    from google.protobuf import descriptor as _descriptor
ImportError: No module named google.protobuf

We should extend the environment to set up PYTHONPATH e.g. as follows:

export PYTHONPATH=/home/anton/CK_TOOLS/lib-tensorflow-src-cuda-xla-1.7-linux-64/lib:/home/anton/CK_TOOLS/lib-tensorflow-src-cuda-xla-1.7-linux-64/lib/external/protobuf_archive/python:${PYTHONPATH}
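A minimal sketch of how the value could be composed, assuming the bundled protobuf always lives under lib/external/protobuf_archive/python as described above. The helper name tf_pythonpath is hypothetical and is not part of CK.

```python
import os


def tf_pythonpath(install_dir, existing=""):
    """Compose PYTHONPATH so that both TF and its bundled protobuf are visible."""
    lib_dir = os.path.join(install_dir, "lib")
    protobuf_dir = os.path.join(lib_dir, "external", "protobuf_archive", "python")
    parts = [lib_dir, protobuf_dir]
    if existing:  # keep whatever the user already had, at the end
        parts.append(existing)
    return os.pathsep.join(parts)
```

The TF library directory comes first so that the installed TensorFlow wins over any other copy, and the previous PYTHONPATH is kept last, mirroring the export line above.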

TensorFlow CPU and CPU-XLA map to the same environment on AArch64 platforms

When installing TensorFlow 1.7.0 on two 64-bit Arm platforms (including Jetson TX1) with and without XLA, I noticed that CK registered both installations under the same environment UID (more precisely, the second installation updated the environment created by the first one):

anton@tegra-ubuntu:~$ ck install package:lib-tensorflow-1.7.0-src-cpu-xla --env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=1
...
Installation time: 35946.103354 sec.
anton@tegra-ubuntu:~$ ck install package:lib-tensorflow-1.7.0-src-cpu --env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=1 && date
...
Installation time: 21755.3534448 sec.
$ ck show env --tags=lib,tensorflow
Env UID:         Target OS: Bits: Name:                                  Version: Tags:

f17bebf9b6a4e759   linux-64    64 TensorFlow library (from sources, cpu) 1.7      64bits,bazel,channel-stable,host-os-linux-64,lib,needs-bazel,needs-bazel-0.11.1,target-os-linux-64,tensorflow,tensorflow-cpu,v1,v1.7,v1.7.0,vcpu,vsrc

$CK_TOOLS does contain both versions (disregard __init__.py being both under src and lib):

$ ck detect soft:lib.tensorflow

  Searching for TensorFlow library (tensorflow/__init__.py) to automatically register in the CK - it may take some time, please wait ...

    * Searching in /usr ...
    * Searching in /opt ...
    * Searching in /home/anton/CK_TOOLS ...
    * Searching in /home/anton ...
...

  Registering software installations found on your machine in the CK:

    (HINT: enter -1 to force CK package installation)

    0) Version 1.7 - /home/anton/CK_TOOLS/lib-tensorflow-src-cpu-xla-1.7-linux-64/src/bazel-src/tensorflow/__init__.py
    1) Version 1.7 - /home/anton/CK_TOOLS/lib-tensorflow-src-cpu-xla-1.7-linux-64/lib/tensorflow/__init__.py
    2) Version 1.7 - /home/anton/CK_TOOLS/lib-tensorflow-src-cpu-1.7-linux-64/src/bazel-src/tensorflow/__init__.py
    3) Version 1.7 - /home/anton/CK_TOOLS/lib-tensorflow-src-cpu-1.7-linux-64/lib/tensorflow/__init__.py

but selecting the other version overwrites the same environment:

...
Environment entry updated (f17bebf9b6a4e759)!
  Successfully registered with UID: f17bebf9b6a4e759

anton@tegra-ubuntu:~$ ck show env --tags=lib,tensorflow
Env UID:         Target OS: Bits: Name:                                       Version: Tags:

f17bebf9b6a4e759   linux-64    64 TensorFlow library (from sources, cpu, xla) 1.7      64bits,bazel,channel-stable,host-os-linux-64,lib,needs-bazel,needs-bazel-0.11.1,target-os-linux-64,tensorflow,tensorflow-cpu,v1,v1.7,v1.7.0,vcpu,vsrc,vxla

Prebuilt TensorFlow requires system protobuf

$ sudo python3 -m pip uninstall protobuf
$ ck install package:lib-tensorflow-1.10.1-cpu
$ ck run program:image-classification-tf-py
...
    * stderr.log

      Traceback (most recent call last):
        File "../classify.py", line 15, in <module>
          import tensorflow as tf
        File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.5.2-linux-64/lib/tensorflow/__init__.py", line 22, in <module>
          from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
        File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.5.2-linux-64/lib/tensorflow/python/__init__.py", line 52, in <module>
          from tensorflow.core.framework.graph_pb2 import *
        File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cpu-1.10.1-compiler.python-3.5.2-linux-64/lib/tensorflow/core/framework/graph_pb2.py", line 6, in <module>
          from google.protobuf import descriptor as _descriptor
      ImportError: No module named 'google.protobuf'

scipy, numpy and other modules in pre/post processing scripts

Hi all!
Here are my thoughts about the TensorFlow problem with post-processing scripts that use scipy/numpy ...

Once again, it's a tricky situation, since we call pre/post-processing scripts from the CK Python installation, not from TensorFlow, etc. And we can't just mix the CK Python with other Pythons from packages, since they can be different (!).

My general view on CK modules and pre/post-processing scripts within CK workflow space is that they should be very simple and portable without many dependencies on external modules, i.e. mainly just APIs.

However, if this is strictly necessary (we already have a few cases in stats), then these modules should be installed together with CK, i.e. where we describe how to install python and python-pip, we should also ask users to install the extra dependencies needed for CK modules: scipy, numpy.

ON THE OTHER HAND, if we require complex functionality from installed packages in the pre/post-processing scripts, we can again use the pre/post-processing scripts just as simple and portable wrappers (we usually have deps resolved there), in which we set up the required deps and call a new, complex sub-script with all modules from an installed package via a system call.

IN FACT, we can create a simple API in ck-env:module:os which will do it (and maybe pass input/output via some tmp JSON file) ...

AS A CONCLUSION:

for now, we can just install the missing scipy/numpy in Travis at the same time as we install python; we should also note in the ReadMe that users need to install them!
when someone has time, please try to create a function in ck-env:module:os (or env) to set up deps and call a python module. In that case, we can move scipy and numpy from the pre/post-processing TF modules to new ones and call them from the pre/post-processing modules via the above API: it will set up the env for the installed TF, call the complex sub-module, and collect all the necessary info ...
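To make the proposal more concrete, here is a minimal sketch of such a wrapper. It is not an existing ck-env function: the name call_subscript, its arguments and the file names are all hypothetical. It runs a sub-script under a given Python interpreter with extra environment variables and exchanges data through temporary JSON files, as suggested above.

```python
import json
import os
import subprocess
import tempfile


def call_subscript(python_exe, script_path, input_data, extra_env=None):
    """Run script_path under python_exe, passing data via temporary JSON files.

    The sub-script receives two arguments: the input JSON path and the
    output JSON path. The result loosely follows the CK habit of
    returning a dictionary with a 'return' code.
    """
    env = dict(os.environ)
    env.update(extra_env or {})  # e.g. PYTHONPATH of the installed package
    with tempfile.TemporaryDirectory() as tmp:
        in_path = os.path.join(tmp, "input.json")
        out_path = os.path.join(tmp, "output.json")
        with open(in_path, "w") as f:
            json.dump(input_data, f)
        result = subprocess.run([python_exe, script_path, in_path, out_path], env=env)
        if result.returncode != 0:
            return {"return": result.returncode, "error": "sub-script failed"}
        with open(out_path) as f:
            return {"return": 0, "output": json.load(f)}
```

With this shape, the wrapper itself needs only the standard library, while scipy/numpy are imported solely by the sub-script running under the Python that belongs to the installed package.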

If it's not clear, please tell me and we can discuss it further! Thanks!

HTTPError: 404 Client Error: Not Found for tensorflow-1.12.0-cp37-cp37m-linux_x86_64.whl

I got an error while installing TensorFlow using CK.
python3 version: 3.7.3
ck version: 1.9.9

Successfully installed django easydict enum-compat image joblib numpy pillow protobuf pytz scipy setuptools six sqlparse

Conditionally cleaning up the 'enum34' package...

Usage:
  /usr/bin/python3.7 -m pip uninstall [options] <package> ...
  /usr/bin/python3.7 -m pip uninstall [options] -r <requirements file> ...

no such option: --system

Downloading and installing TensorFlow prebuilt binaries (https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.12.0-cp37-cp37m-linux_x86_64.whl) ...

Collecting tensorflow==1.12.0 from https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.12.0-cp37-cp37m-linux_x86_64.whl
/usr/share/python-wheels/urllib3-1.22-py2.py3-none-any.whl/urllib3/connectionpool.py:860: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
Exception:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/lib/python3/dist-packages/pip/commands/install.py", line 353, in run
    wb.build(autobuilding=True)
  File "/usr/lib/python3/dist-packages/pip/wheel.py", line 749, in build
    self.requirement_set.prepare_files(self.finder)
  File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 380, in prepare_files
    ignore_dependencies=self.ignore_dependencies))
  File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 620, in _prepare_file
    session=self.session, hashes=hashes)
  File "/usr/lib/python3/dist-packages/pip/download.py", line 821, in unpack_url
    hashes=hashes
  File "/usr/lib/python3/dist-packages/pip/download.py", line 659, in unpack_http_url
    hashes)
  File "/usr/lib/python3/dist-packages/pip/download.py", line 855, in _download_http_url
    resp.raise_for_status()
  File "/usr/share/python-wheels/requests-2.18.4-py2.py3-none-any.whl/requests/models.py", line 935, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
**pip._vendor.requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.12.0-cp37-cp37m-linux_x86_64.whl**
Error: installation failed!

   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   CK detected a PROBLEM in the third-party CK package:

   CK package:           lib-tensorflow-1.12.0-cpu
   CK repo:              ck-tensorflow
   CK repo URL:          https://github.com/ctuning/ck-tensorflow
   CK package URL:       https://github.com/ctuning/ck-tensorflow/tree/master/package/lib-tensorflow-1.12.0-cpu
   Issues URL:           https://github.com/ctuning/ck-tensorflow/issues
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   Please, submit the log to the authors of this external CK package at "https://github.com/ctuning/ck-tensorflow/issues" to collaboratively fix this problem!

CK error: [package] package installation failed!
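A possible pre-flight check, sketched under the assumption that the wheel URL always follows the pattern of the single URL in the log above (including the trailing "m" in the ABI tag). The function names are hypothetical. Checking the URL before calling pip would let the installer report clearly that no prebuilt wheel exists for this TF/Python combination.

```python
import urllib.error
import urllib.request

# Base URL taken from the log above.
BASE_URL = "https://storage.googleapis.com/tensorflow/linux/cpu"


def wheel_url(tf_version, py_major, py_minor):
    """Compose the wheel URL for a CPython version, following the log's pattern."""
    tag = "cp{}{}".format(py_major, py_minor)
    # Assumption: the ABI tag is the Python tag plus "m", as in cp37-cp37m.
    return "{}/tensorflow-{}-{}-{}m-linux_x86_64.whl".format(BASE_URL, tf_version, tag, tag)


def wheel_exists(url, timeout=10):
    """Return True if a HEAD request for the wheel succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.URLError:  # also covers HTTPError such as 404
        return False
```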

benchmark-alexnet warns

benchmark-alexnet should be updated similarly to benchmark-overfeat and benchmark-googlenet to silence this warning:

WARNING:tensorflow:From /home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-alexnet/benchmark-alexnet.py:212: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.

     # Build an initialization operation.
 -    init = tf.initialize_all_variables()
 +    init = tf.global_variables_initializer()

Common information for all image-recognition packages.

This problem is not specific to TF packages, but we seem to have plenty of them, so I put it here. If needed, we can move it to a more common repo.

While working on common preprocessing scripts for benchmarking programs (#66), I realized that we lack some general information about image recognition packages. I have to use variables like CK_ENV_TENSORFLOW_MODEL_IMAGE_WIDTH, which makes these scripts less common than they should be.

My suggestion is that each image recognition package, whether for TF, Caffe or something else, should provide a set of common variables, at least:

input image size
number of recognizable classes (it can differ even for ImageNet, e.g. 1001 for Mobilenet)
weights file name and path (not CK_ENV_TENSORFLOW_MODEL_WEIGHTS but something more library independent)
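A sketch of how a common preprocessing script could read such variables. The generic names (CK_ENV_MODEL_IMAGE_WIDTH, CK_ENV_MODEL_IMAGE_HEIGHT, CK_ENV_MODEL_NUM_CLASSES, CK_ENV_MODEL_WEIGHTS) are hypothetical placeholders for whatever we agree on; only the two TF-specific names come from the existing packages and are used here as a fallback.

```python
def model_info(env):
    """Collect library-independent model properties from an environment mapping."""

    def pick(generic, specific=None):
        # Prefer the generic (hypothetical) variable, then the TF-specific one.
        value = env.get(generic)
        if value is None and specific is not None:
            value = env.get(specific)
        return value

    width = pick("CK_ENV_MODEL_IMAGE_WIDTH", "CK_ENV_TENSORFLOW_MODEL_IMAGE_WIDTH")
    height = pick("CK_ENV_MODEL_IMAGE_HEIGHT") or width  # assume square input if unset
    classes = pick("CK_ENV_MODEL_NUM_CLASSES")
    return {
        "image_width": int(width) if width else None,
        "image_height": int(height) if height else None,
        "num_classes": int(classes) if classes else None,
        "weights_file": pick("CK_ENV_MODEL_WEIGHTS", "CK_ENV_TENSORFLOW_MODEL_WEIGHTS"),
    }
```

A script would call model_info(os.environ) and would then not need to know which framework the model package belongs to.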

TF 1.2 manual build via CMake

Hi,

In the tf-cmake-1.2 branch, I created the lib-tensorflow-cmake-1.2 package for building TF 1.2 via CMake, just like lib-tensorflow-cmake builds TF 1.1. In fact, it's an exact copy of the 1.1 package; I've only changed the tag being built from 1.1 to 1.2.

However, this package fails to build with a bunch of errors like the ones below (at least when built on Tegra with the aarch64 architecture). I don't have time to fix it right now, so I'm creating this ticket as a reminder to fix it later and merge it into master:

/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:376:25: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:377:36: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
         pstore (py+PacketSize, psub(pcj.pmul(pc,yi1),pm.pmul(ps,xi1)));
                                    ^
/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:377:36: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:385:35: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
         pstoreu(x+peelingEnd, padd(pm.pmul(pc,xi),pcj.pmul(ps,yi)));
                                   ^
/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:385:35: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:386:35: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
         pstore (y+peelingEnd, psub(pcj.pmul(pc,yi),pm.pmul(ps,xi)));
                                   ^
/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:386:35: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:415:22: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
       pstore(px, padd(pm.pmul(pc,xi),pcj.pmul(ps,yi)));
                      ^
/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:415:22: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:416:22: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
       pstore(py, psub(pcj.pmul(pc,yi),pm.pmul(ps,xi)));
                      ^
/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/contrib/cmake/build/external/eigen_archive/Eigen/src/Jacobi/Jacobi.h:416:22: error: ‘struct Eigen::internal::conj_helper<__vector(2) double, Eigen::internal::Packet1cd, false, false>’ has no member named ‘pmul’
CMakeFiles/tf_core_kernels.dir/build.make:3398: recipe for target 'CMakeFiles/tf_core_kernels.dir/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/core/kernels/svd_op_complex128.cc.o' failed
make[3]: *** [CMakeFiles/tf_core_kernels.dir/home/daniil/CK-TOOLS/lib-tensorflow-cmake-1.2.0-gcc-5.4.0-linux-64/src/tensorflow/core/kernels/svd_op_complex128.cc.o] Error 1
CMakeFiles/Makefile2:1591: recipe for target 'CMakeFiles/tf_core_kernels.dir/all' failed
make[2]: *** [CMakeFiles/tf_core_kernels.dir/all] Error 2
CMakeFiles/Makefile2:79: recipe for target 'CMakeFiles/tf_python_build_pip_package.dir/rule' failed
make[1]: *** [CMakeFiles/tf_python_build_pip_package.dir/rule] Error 2
Makefile:118: recipe for target 'tf_python_build_pip_package' failed
make: *** [tf_python_build_pip_package] Error 2
CK error: [package] package installation failed!

Universal install script for shared_cc packages

Currently we have a universal install script for src packages: scripts/install-lib-tensorflow-src. In the same way, a universal install script for shared_cc packages could be implemented (scripts/install-lib-tensorflow-shared_cc), based on their original build_tensorflow.sh script.

The same script should be applicable to both package flavours, cpu as well as cuda (#46).

Create package:lib-tflite

We've accumulated a number of packages building TFLite from source e.g. package:lib-tflite-1.15.0-src-static. We should create a single package:lib-tflite-src-static (or simply package:lib-tflite) to keep versioning and patching in one place via the mechanism of variations.

The package:lib-armnn package provides good working examples of:

source tag and version control;
generic and revision-specific patching;
target-specific augmentation (e.g. need to pass EXTRA_CXXFLAGS="-march=armv7-a+neon+vfpv4 -mfpu=neon-vfpv4" when building on Raspberry Pi 4).

The immediate versions of interest are 1.15.3, 1.13.2, and 2.0.2.

The goal is to be able to install TFLite from the new package e.g. as follows:

$ ck install package --tags=lib,tflite,v2.0.2
$ ck install package --tags=lib,tflite,v1.15.3,rpi4

ck-tensorflow/program/image-classification-tf-py/classify.py is not executable.

$ ck run program:tensorflow --cmd_key=classify

tensorflow (62de6ce26934e3eb)

OS CK UOA: linux-64 (4258b5fe54828a50)

OS name: Ubuntu 16.04.4 LTS
Short OS name: Linux 4.15.0
Long OS name: Linux-4.15.0-29-generic-x86_64-with-debian-stretch-sid
OS bits: 64
OS ABI: x86_64

Platform init UOA: -

Current directory: /home/hello/CK/ck-tensorflow/program/tensorflow/tmp

Resolving software dependencies ...

*** Dependency 1 = lib-tensorflow (TensorFlow library):

Resolved. CK environment UID = 28433f8d33a6f186

*** Dependency 2 = tensorflow-model (TensorFlow model (net and weights)):

Resolved. CK environment UID = c8f9d4473da5995d (version 20151205)

More than one dataset entry is found for this program:

image-jpeg-0001 (1aaaa23c44e588f9)
image-jpeg-dnn-cat (e496fdf046e6ac13)
image-jpeg-dnn-cat-gray (b6edb4314645efa0)
image-jpeg-dnn-computer-mouse (04df4936ba03c285)
image-jpeg-dnn-cropped-panda (0a4f26aa98034fd2)
image-jpeg-dnn-fish-bike (ebd96e7048ba3f7f)
image-jpeg-dnn-snake-224 (08d57fd1a6443ee9)
image-jpeg-dnn-surfers (b04c3400ba69fdc6)
image-jpeg-fgg (fgg, 21500da03e4ccfa3)

Select UOA (or press Enter for 0): 0

Cleaning output files and directories:
stderr.log
stderr2.log
stderr.log
stderr2.log

Prepared script:

#! /bin/bash

. /home/hello/CK/local/env/28433f8d33a6f186/env.sh
. /home/hello/CK/local/env/c8f9d4473da5995d/env.sh

export CK_DATASET_PATH=/home/hello/CK/ctuning-datasets-min/dataset/image-jpeg-0001/

export BATCH_SIZE=5
export CK_DATASET_FILENAME=data.jpg
export NUM_BATCHES=5

echo executing code ...
${CK_ENV_COMPILER_PYTHON_FILE} ../classify.py --model_dir=${CK_ENV_MODEL_TENSORFLOW} --image_file=/home/hello/CK/ctuning-datasets-min/dataset/image-jpeg-0001/data.jpg > stderr.log 2> stderr2.log

(bash -c "chmod 755 ./tmp-iIf2Zt.sh; . ./tmp-iIf2Zt.sh")

(sleep 0.5 sec ...)

(run ...)
executing code ...

(printing output files)

* stderr2.log

  ./tmp-iIf2Zt.sh: line 15: ../classify.py: Permission denied
  

* stderr.log

Execution time: 0.011 sec.

Program execution likely failed (return code 126 !=0 )!

We've checked this file:
$ ls ck-tensorflow/program/image-classification-tf-py/ -lh
total 36K
-rwxrwxr-x 1 hello hello 8.9K Jun 29 21:24 benchmark.nvidia-gtx1080.py
-rwxrwxr-x 1 hello hello 8.9K Jun 29 21:24 benchmark.nvidia-tx1.py
-rw-rw-r-- 1 hello hello 6.5K Jun 29 21:24 classify.py
-rw-rw-r-- 1 hello hello 3.7K Jun 29 21:24 README.md

classify.py is not executable.
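Note that the prepared script prefixes ../classify.py with ${CK_ENV_COMPILER_PYTHON_FILE}. The "Permission denied" message suggests (this is a guess from the log, not verified) that the variable was empty, so the shell tried to execute the file directly. Independently of that, the missing execute bits can be added as in the sketch below; the helper name is hypothetical.

```python
import os
import stat


def make_executable(path):
    """Add execute permission wherever read permission is already granted."""
    mode = os.stat(path).st_mode
    read_bits = mode & (stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    # Each execute bit sits two positions below its read bit (r=4, x=1).
    os.chmod(path, mode | (read_bits >> 2))
    return os.access(path, os.X_OK)
```

Applied to classify.py, this would turn -rw-rw-r-- into -rwxrwxr-x, matching the two benchmark scripts in the same directory.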

TensorFlow package with support of OpenCL (SYCL)

TFLite 1.13.1 installation fails on Ubuntu 18.04

On Ubuntu 18.04, I get:

$ ck install package:lib-tflite-1.13.1-src-static [--target_os=android23-arm64]
...
downloading https://mirror.bazel.build/github.com/google/farmhash/archive/816a4ae622e964763ca0862d9dbd19324a1eaf45.tar.gz
downloading https://github.com/google/flatbuffers/archive/1f5eae5d6a135ff6811724f6c57f911d1f46bb15.tar.gz

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

This is weird 'cause:

$ ck install package:lib-tensorflow-1.13.1-src-static [--target_os=android23-arm64]

works just fine.

Any ideas?
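One way to narrow this down is to look at what was actually downloaded. The "not in gzip format" message means the stream did not start with the gzip magic bytes, which often happens when a server returns an error page instead of the archive (that cause is only a guess here). A small sketch with hypothetical helper names:

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def describe_download(path, preview_bytes=200):
    """Say whether a downloaded file is gzip, or show how it starts if not."""
    if looks_like_gzip(path):
        return "gzip archive"
    with open(path, "rb") as f:
        head = f.read(preview_bytes)
    return "not gzip; starts with: " + head.decode("utf-8", errors="replace")
```

Running describe_download() on the saved farmhash and flatbuffers downloads would show which of the two URLs returns something other than a tarball.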

Split shared_cc packages into cpu and cuda

lib-tensorflow_cc-shared-1.3.0
lib-tensorflow_cc-shared-1.5.0
lib-tensorflow_cc-shared-1.6.0
lib-tensorflow_cc-shared-1.7.0
lib-tensorflow_cc-shared-1.8.0
lib-tensorflow_cc-shared-1.9.0 (TBD)

These packages should be split into two variants: lib-tensorflow_cc-shared-*-cpu and lib-tensorflow_cc-shared-*-cuda. See the packages lib-tensorflow_cc-shared-1.4.0-cpu and lib-tensorflow_cc-shared-1.4.0-cuda for an example.

The reason is that the build_tensorflow.sh script searches for cuda and builds the cuda-related package if it finds it. So we can't build a cpu-only package on a system that has cuda installed.

All the packages should be built with a single universal script (#47).

Building TFLite 1.13.1 fails due to dlsym linking error

Building TFLite 1.13.1 fails on Ubuntu 18.04 with Android NDK r13b (installed via apt install google-android-ndk-installer) as follows:

$ ck install package:lib-tflite-1.13.1-src-static --target_os=android24-arm64
...
         /home/anton/CK_TOOLS/lib-tflite-src-static-1.13.1-android-ndk-4.9.x-android24-arm64/src/tensorflow/lite/tools/make/gen/ANDROID_x86_64/lib/libtensorflow-lite.a  -lstdc++ -lpthread -lm -lz
/home/anton/CK_TOOLS/lib-tflite-src-static-1.13.1-android-ndk-4.9.x-android24-arm64/src/tensorflow/lite/tools/make/gen/ANDROID_x86_64/lib/libtensorflow-lite.a(nnapi_delegate.o): In function `tflite::NNAPIAllocation::~NNAPIAllocation()':
nnapi_delegate.cc:(.text+0x93): undefined reference to `dlsym'
/home/anton/CK_TOOLS/lib-tflite-src-static-1.13.1-android-ndk-4.9.x-android24-arm64/src/tensorflow/lite/tools/make/gen/ANDROID_x86_64/lib/libtensorflow-lite.a(nnapi_delegate.o): In function `tflite::NNAPIAllocation::~NNAPIAllocation()':
nnapi_delegate.cc:(.text+0x1f3): undefined reference to `dlsym'
/home/anton/CK_TOOLS/lib-tflite-src-static-1.13.1-android-ndk-4.9.x-android24-arm64/src/tensorflow/lite/tools/make/gen/ANDROID_x86_64/lib/libtensorflow-lite.a(nnapi_delegate.o): In function `tflite::NNAPIAllocation::NNAPIAllocation(char const*, tflite::ErrorReporter*)':
nnapi_delegate.cc:(.text+0x46b): undefined reference to `dlsym'
/home/anton/CK_TOOLS/lib-tflite-src-static-1.13.1-android-ndk-4.9.x-android24-arm64/src/tensorflow/lite/tools/make/gen/ANDROID_x86_64/lib/libtensorflow-lite.a(nnapi_delegate.o): In function `tflite::NNAPIDelegate::~NNAPIDelegate()':
nnapi_delegate.cc:(.text+0x64b): undefined reference to `dlsym'
nnapi_delegate.cc:(.text+0x6bf): undefined reference to `dlsym'
/home/anton/CK_TOOLS/lib-tflite-src-static-1.13.1-android-ndk-4.9.x-android24-arm64/src/tensorflow/lite/tools/make/gen/ANDROID_x86_64/lib/libtensorflow-lite.a(nnapi_delegate.o):nnapi_delegate.cc:(.text+0xaef): more undefined references to `dlsym' follow
collect2: error: ld returned 1 exit status

It looks like we need to add -ldl to the linking command.

If that fails, we can try -Wl,--no-as-needed -ldl.
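With GNU ld, a library has to appear after the static archives that reference its symbols, so -ldl should come after libtensorflow-lite.a on the command line. A small sketch of that rule, using a hypothetical helper that is not part of the TFLite makefiles:

```python
def with_libdl(link_args):
    """Return link arguments with -ldl placed after the last static archive."""
    args = list(link_args)
    # Index of the last ".a" argument, or -1 if there is none.
    last_archive = max((i for i, a in enumerate(args) if a.endswith(".a")), default=-1)
    if "-ldl" not in args[last_archive + 1:]:
        args.append("-ldl")
    return args
```

For the failing command above, this simply appends -ldl after "-lstdc++ -lpthread -lm -lz".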

Improve the handling of SqueezeDet

Thanks to @fanranGit, CK-TensorFlow now supports the SqueezeDet artefact by @BichenWuUCB et al.

I suggest making several improvements to this artefact to bring it into a more canonical CK form.

To allow for other artefacts that use other parts of the KITTI dataset, the object detection evaluation dataset used by SqueezeDet can be called dataset-kitti-object-image-2 ("left color images of object data set") with the labels called dataset-kitti-object-label-2. The images and labels can be related to each other similarly to the imagenet-val and imagenet-aux packages of CK-Caffe. Due to their size, the user should be offered a choice of installation path.
Splitting of the object data set should be handled similarly to LMDB conversion for CK-Caffe. Specifically, the raw dataset shouldn't be modified, as there may be several different random splittings into the training and validation parts. (Ideally, the splitting process should be governed by a fixed random seed, so that a splitting can be reproduced by giving the same seed.)
Currently, ck run program:squeezedet prompts for several datasets e.g. squeezedet-eval-train. Each such dataset (except squeezedet-demo) prompts the user to select one of the classification models (classifiers) e.g. vgg16. Note that such prompting is implemented as a bash script, not via CK.
I think it would be cleaner to use CK commands, rather than datasets, to choose between the demo mode, testing on the val dataset, testing on the train dataset and training proper. The dataset functionality can then be used to select between the classifiers. Furthermore, it would be cleaner to install classifiers separately, not via the SqueezeDet installation script, with packages called something like squeezedet-classifier-vgg16.
The SqueezeDet artefact should be modified (either via a patch or downloaded from a fork) to update the hardcoded paths to the ones set via CK environment variables (e.g. for KITTI). This is also relevant for the input and output images, checkpoints, temporary directories, etc. (e.g. see squeezedet_eval_val.sh).
Some performance metrics should be exposed.
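
The seeded splitting suggested above could look roughly like the sketch below. It is only an illustration: the function name split_dataset, the val_fraction parameter and the use of image identifiers as plain strings are assumptions, not part of CK or SqueezeDet. The raw list is never modified, and the same seed gives the same split regardless of the order in which the identifiers are listed.

```python
import random


def split_dataset(ids, val_fraction=0.5, seed=42):
    """Return (train_ids, val_ids) for the given sample identifiers.

    Hypothetical helper: the input list is left untouched, and the split
    is fully determined by the set of ids, val_fraction and seed.
    """
    if not 0.0 <= val_fraction <= 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    # Sort a copy first so the result does not depend on directory listing order.
    shuffled = sorted(ids)
    # A private Random instance avoids touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


if __name__ == "__main__":
    sample_ids = ["%06d" % i for i in range(10)]
    train_ids, val_ids = split_dataset(sample_ids, val_fraction=0.3, seed=1)
    print("train:", train_ids)
    print("val:  ", val_ids)
```

Recording the seed alongside the generated split would then be enough to reproduce it later from the unmodified raw dataset.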

Does it make sense? Anything I missed?

Clean up and test package:lib-tensorflow_cc-shared-1.4.0 on Android

Android scripts for package:lib-tensorflow_cc-shared-1.4.0 (e.g. install.sh) include code for setting up OpenCL, CLTune and CLBLAS which they shouldn't have:

export CK_CMAKE_EXTRA="${CK_CMAKE_EXTRA} \
 -DOPENCL_ROOT:PATH=${CK_ENV_LIB_OPENCL} \
 -DOPENCL_LIBRARIES:FILEPATH=${CK_ENV_LIB_OPENCL_LIB}/${CK_ENV_LIB_OPENCL_DYNAMIC_NAME} \
 -DOPENCL_INCLUDE_DIRS:PATH=${CK_ENV_LIB_OPENCL_INCLUDE} \
 -DTUNERS=ON \
 -DCLTUNE_ROOT:PATH=${CK_ENV_TOOL_CLTUNE} \
 -DCLIENTS=ON \
 -DCBLAS_INCLUDE_DIRS:PATH=${CK_ENV_LIB_OPENBLAS_INCLUDE} \
 -DCBLAS_LIBRARIES:FILEPATH=${CK_ENV_LIB_OPENBLAS_LIB}/${CK_ENV_LIB_OPENBLAS_STATIC_NAME} -lgomp \
 -DSAMPLES=ON \
 -DCLIENTS=ON \
 -DCK_REF_LIBRARIES=${CK_REF_LIBRARIES} \
 -DANDROID=ON"

Passing -lgomp is another "dirty hack" that I don't think is needed.

I guess that the Android install.sh should look more like the Linux one:

export CK_CMAKE_EXTRA="${CK_CMAKE_EXTRA} \
  -DTENSORFLOW_SHARED=ON \
  -DTENSORFLOW_STATIC=OFF \
  -DCMAKE_CXX_COMPILER=${CK_CXX_PATH_FOR_CMAKE} \
  -DCMAKE_CXX_FLAGS=${CK_CXX_FLAGS_FOR_CMAKE} \
  -DCMAKE_CC_COMPILER=${CK_CC_PATH_FOR_CMAKE} \
  -DCMAKE_CC_FLAGS=${CK_CC_FLAGS_FOR_CMAKE}"

only perhaps additionally with:

 -DCK_REF_LIBRARIES=${CK_REF_LIBRARIES}

benchmark-overfeat fails

Running program:tensorflow and selecting benchmark-overfeat results in:

executing code ...
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
WARNING:tensorflow:From /home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py:204: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Use `tf.global_variables_initializer` instead.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
[...]er.cc:509] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:145] kernel driver does not appear to be running on this host (velociti): /proc/driver/nvidia/version does not exist
[...]mon_runtime/executor.cc:594] Executor failed to create kernel. Invalid argument: CPU BiasOp only supports NHWC.
         [[Node: conv1/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NCHW", _device="/job:localhost/replica:0/task:0/cpu:0"](conv1/Conv2D, conv1/biases/read)]]
Traceback (most recent call last):
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 241, in <module>
    tf.app.run()
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/platform/app.py", line 44, in runin(_sys.argv[:1] + flags_passthrough))
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 237, in main
    run_benchmark()
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 222, in run_benchmark
    timing_entries.append(time_tensorflow_run(sess, last_layer, "Forward"))
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 156, in time_tensorflow_run
    _ = session.run(target_op)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/client/session.py", line 767, in runr)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/client/session.py", line 965, in _run, options, run_metadata)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/client/session.py", line 1015, in _do_run, run_metadata)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/client/session.py", line 1035, in _do_callf, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: CPU BiasOp only supports NHWC.
         [[Node: conv1/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NCHW", _device="/job:localhost/replica:0/task:0/cpu:0"](conv1/Conv2D, conv1/biases/read)]]

Caused by op u'conv1/BiasAdd', defined at:
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 241, in <module>
    tf.app.run()
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/platform/app.py", line 44, in runin(_sys.argv[:1] + flags_passthrough))
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 237, in main
    run_benchmark()
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 201, in run_benchmark
    last_layer = inference(images)
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 114, in inference
    conv1 = _conv (images, 3, 64, 11, 11, 4, 4, 'VALID')
  File "/home/anton/CK_REPOS/ck-tensorflow/dataset/benchmark-overfeat/benchmark-overfeat.py", line 61, in _conv
    data_format=FLAGS.data_format),
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/ops/nn_ops.py", line 1316, in bias_add._bias_add(value, bias, data_format=data_format, name=name)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/ops/gen_nn_ops.py", line 281, in _bias_addat, name=name)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/framework/ops.py", line 2395, in create_opault_original_op, op_def=op_def)
  File "/home/anton/CK_TOOLS/tensorflow-prebuilt-cuda-1.0.0-compiler.cuda-8.0.61-lib.cudnn-api-5.1.5-linux-64/lib/tensorflow/python/framework/ops.py", line 1264, in __init__xtract_stack()

InvalidArgumentError (see above for traceback): CPU BiasOp only supports NHWC.
         [[Node: conv1/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NCHW", _device="/job:localhost/replica:0/task:0/cpu:0"](conv1/Conv2D, conv1/biases/read)]]

Execution time: 1.465 sec.

Program execution likely failed (return code 1 !=0 )!

Tested with 3 freshly installed versions:

$ ck show env --tags=tensorflow
Env UID:         Target OS: Bits: Name:                               Version:       Tags:

4b0c4fb7a608c8cf   linux-64    64 TensorFlow library (prebuilt, cuda) 1.0.0          64bits,cuda,host-os-linux-64,lib,prebuilt,target-os-linux-64,tensorflow,tensorflow-cuda,v1,v1.0,v1.0.0
6dca1226a214393b   linux-64    64 TensorFlow library (prebuilt, cpu)  1.0.0          64bits,cpu,host-os-linux-64,lib,prebuilt,target-os-linux-64,tensorflow,tensorflow-cpu,v1,v1.0,v1.0.0
bdcf77dc8e5843fb   linux-64    64 TensorFlow library (cpu)            master-2e8cf80 64bits,host-os-linux-64,lib,target-os-linux-64,tensorflow,tensorflow-cpu,v0,v0.0

Full log: overfeat.txt
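
In the log above, cuInit fails, the BiasAdd node is placed on cpu:0, and the CPU kernel rejects data_format="NCHW". One possible workaround, assuming the benchmark script can be given a different data_format value (the traceback shows it reads FLAGS.data_format), is to fall back to NHWC whenever no usable GPU is present. The sketch below only illustrates that choice; pick_data_format and image_shape are hypothetical helpers, not functions from the benchmark or from TensorFlow, and how GPU availability is detected is left to the caller.

```python
def pick_data_format(gpu_available):
    """Choose a tensor layout: NCHW on GPU, NHWC when falling back to CPU.

    Hypothetical helper. The error in the log says the CPU BiasAdd kernel
    only supports NHWC, so NHWC is the safe choice without a GPU.
    """
    return "NCHW" if gpu_available else "NHWC"


def image_shape(batch, height, width, channels, data_format):
    """Return the input tensor shape for the given layout."""
    if data_format == "NHWC":
        return [batch, height, width, channels]
    if data_format == "NCHW":
        return [batch, channels, height, width]
    raise ValueError("unsupported data_format: %r" % (data_format,))


if __name__ == "__main__":
    # Example values only; the real benchmark defines its own sizes.
    fmt = pick_data_format(gpu_available=False)
    print(fmt, image_shape(128, 231, 231, 3, fmt))
```

The same value would have to be passed consistently to every layer that takes a data_format argument, otherwise the shapes of adjacent ops will not match.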

Build TensorFlow for Android

Build scripts for src and shared_cc packages could be modified to support building for Android.
A good starting point is https://www.tensorflow.org/mobile/android_build:

In your copy of the TensorFlow source, update the WORKSPACE file with the location of your SDK and NDK, where it says <PATH_TO_NDK> and <PATH_TO_SDK>.

In general, when compiling for Android with Bazel, you need --config=android on the Bazel command line.

Support fp16

We should investigate how to enable fp16 e.g. in benchmarks. This TensorFlow issue may provide some clues.

CK-TF support on Tegra TX1 (CPU)

The Tegra TX1 CPU is aarch64, while Bazel can currently only be installed from a prebuilt package for x86_64. It seems that Bazel packages exist that are not architecture-specific (e.g. bazel-0.4.4-dist.zip). Can we use them for installing TF on Tegra TX1?