intel / clDNN
Compute Library for Deep Neural Networks (clDNN)
Home Page: https://01.org/cldnn
Hi,
This unit test worked fine with Drop 3.0 but doesn't work with Drop 5.0.
If I remove the optimize_data build option it works on Drop 5.0 too.
I don't know if this is a problem with my test or with clDNN.
I am running this test on a Core i3-6100.
I get an error when I run "make":
cmake -E make_directory build && cd build && cmake -DCMAKE_BUILD_TYPE=Release .. && make
Any help? I'm running on Ubuntu 16.04.
In file included from /home/ae/Documents/clDNN/src/fully_connected.cpp:18:0:
/home/ae/Documents/clDNN/src/include/fully_connected_inst.h:50:28: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]
const bool bias_term() const { return !argument.bias.empty(); }
^
cc1plus: all warnings being treated as errors
src/CMakeFiles/clDNN_shlib.dir/build.make:498: recipe for target 'src/CMakeFiles/clDNN_shlib.dir/fully_connected.cpp.o' failed
make[2]: *** [src/CMakeFiles/clDNN_shlib.dir/fully_connected.cpp.o] Error 1
CMakeFiles/Makefile2:85: recipe for target 'src/CMakeFiles/clDNN_shlib.dir/all' failed
make[1]: *** [src/CMakeFiles/clDNN_shlib.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2
ae@hped800nuc2:~/Documents/clDNN/build$ cmake --version
cmake version 3.8.0
The detailed description of the convolution class refers to an HTML file that seems to be missing:
"Look into docs/size_offset_stride_padding.html for description how size, offsets, stride & padding parameters work."
hi,
I could not find how to run it on CPU only. Is this possible?
I only have a high-end CPU and do not have a GPU.
Thanks for your support
Best Regards
Mazda
0.14 is out. We should tag it in git.
Hi, I noticed that you have implemented several depthwise-convolution-related CL kernels (see below), but I didn't see any case that uses these kernels.
Also, you mentioned the validated topologies include ": AlexNet*, VGG(16,19), GoogleNet(v1,v2,v3), ResNet(50,101,152)* Faster R-CNN*, Squeezenet*, SSD_googlenet*, SSD_VGG*, PVANET*, PVANET_REID*, age_gender*, FCN* and yolo*.", but they do not include MobileNet.
So does this mean the depthwise convolution CL kernels haven't been verified yet? If they have, do you have any samples to share?
depthwise convolution related cl kernels:
https://github.com/intel/clDNN/blob/master/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_f16_depthwise.cl
https://github.com/intel/clDNN/blob/master/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_depthwise_weights_lwg.cl
https://github.com/intel/clDNN/blob/master/kernel_selector/core/cl_kernels/convolution_gpu_byxf_af32_depthwise.cl
Is this OpenCL implementation supported on Intel FPGAs?
A few lines (38, 47, and a few others) in the source file 'convolution.cpp' are indented using the tab character (assuming a tab width of 4). Viewed with a tab width of 8 they look incorrectly indented and trigger errors like this one in gcc:
/home/franco/cldnn/src/convolution.cpp:44:5: error: this ‘if’ clause does not guard... [-Werror=misleading-indentation] if (kernel_xy.size() != 2)
Replacing all the tabs with 4 spaces (with something like s/\t/    /g) fixes the problem and the code compiles successfully.
Hi,
now that optimized inference DNN libraries such as NVIDIA TensorRT, Windows WinML, and the Qualcomm Snapdragon Neural Processing Engine (NPE) SDK support loading ONNX models (of the available interchange formats, ONNX seems the most broadly supported), it would be nice for simplicity if clDNN supported that as well.
It seems like a simple MNIST sample should be much shorter than:
#include <api/CPP/memory.hpp>
#include <api/CPP/topology.hpp>
#include <api/CPP/reorder.hpp>
#include <api/CPP/input_layout.hpp>
#include <api/CPP/convolution.hpp>
#include <api/CPP/data.hpp>
#include <api/CPP/pooling.hpp>
#include <api/CPP/fully_connected.hpp>
#include <api/CPP/softmax.hpp>
#include <api/CPP/engine.hpp>
#include <api/CPP/network.hpp>
#include <iostream>
using namespace cldnn;
using namespace std;
const tensor::value_type
input_channels = 1,
input_size = 28,
conv1_out_channels = 20,
conv2_out_channels = 50,
conv_krnl_size = 5,
fc1_num_outs = 500,
fc2_num_outs = 10;
// Create layout with same sizes but new format.
layout create_reordering_layout(format new_format, const layout& src_layout)
{
return { src_layout.data_type, new_format, src_layout.size };
}
// Create MNIST topology
topology create_topology(const layout& in_layout, const memory& conv1_weights_mem, const memory& conv1_bias_mem )
{
auto data_type = in_layout.data_type;
// Create input_layout description
// "input" - is the primitive id inside topology
input_layout input("input", in_layout);
// Create topology object with 2 primitives
cldnn::topology topology(
// 1. input layout primitive.
input,
// 2. reorder primitive with id "reorder_input"
reorder("reorder_input",
// input primitive for reorder (implicitly converted to primitive_id)
input,
// output layout for reorder
create_reordering_layout(format::yxfb, in_layout))
);
// Create data primitive - its content should be set already.
cldnn::data conv1_weights( "conv1_weights", conv1_weights_mem );
// Add primitive to topology
topology.add(conv1_weights);
// Emplace new primitive to topology
topology.add<cldnn::data>({ "conv1_bias", conv1_bias_mem });
// Emplace 2 primitives
topology.add(
// Convolution primitive with id "conv1"
convolution("conv1",
"reorder_input", // primitive id of the convolution's input
{ conv1_weights }, // weights primitive id is taken from the object
{ "conv1_bias" } // bias primitive id
),
// Pooling id: "pool1"
pooling("pool1",
"conv1", // Input: "conv1"
pooling_mode::max, // Pooling mode: MAX
spatial(2,2), // stride: 2
spatial(2,2) // kernel_size: 2
)
);
// Conv2 weights data is not available now, so just declare its layout
layout conv2_weights_layout(data_type, format::bfyx,{ conv2_out_channels, conv1_out_channels, conv_krnl_size, conv_krnl_size });
// Define the rest of topology.
topology.add(
// Input layout for conv2 weights. Data will passed by network::set_input_data()
input_layout("conv2_weights", conv2_weights_layout),
// Input layout for conv2 bias.
input_layout("conv2_bias", { data_type, format::bfyx, spatial(conv2_out_channels) }),
// Second convolution id: "conv2"
convolution("conv2",
"pool1", // Input: "pool1"
{ "conv2_weights" }, // Weights: input_layout "conv2_weights"
{ "conv2_bias" } // Bias: input_layout "conv2_bias"
),
// Second pooling id: "pool2"
pooling("pool2",
"conv2", // Input: "conv2"
pooling_mode::max, // Pooling mode: MAX
spatial(2, 2), // stride: 2
spatial(2, 2) // kernel_size: 2
),
// Fully connected (inner product) primitive id "fc1"
fully_connected("fc1",
"pool2", // Input: "pool2"
"fc1_weights", // "fc1_weights" will be added to the topology later
"fc1_bias", // will be defined later
true // Use built-in Relu. Slope is set to 0 by default.
),
// Second FC/IP primitive id: "fc2", input: "fc1".
// Weights ("fc2_weights") and biases ("fc2_bias") will be defined later.
// Built-in Relu is disabled by default.
fully_connected("fc2", "fc1", "fc2_weights", "fc2_bias"),
// The "softmax" primitive is not an input for any other,
// so it will be automatically added to network outputs.
softmax("softmax", "fc2")
);
return topology;
}
// Copy from a vector to cldnn::memory
void copy_to_memory(memory& mem, const vector<float>& src)
{
cldnn::pointer<float> dst(mem);
std::copy(src.begin(), src.end(), dst.begin());
}
// Execute network
int recognize_image(network& network, const memory& input_memory)
{
// Set/update network input
network.set_input_data("input", input_memory);
// Start network execution
auto outputs = network.execute();
// get_memory() blocks until output generation is completed
auto output = outputs.at("softmax").get_memory();
// Get direct access to output memory
cldnn::pointer<float> out_ptr(output);
// Analyze result
auto max_element_pos = max_element(out_ptr.begin(), out_ptr.end());
return static_cast<int>(distance(out_ptr.begin(), max_element_pos));
}
// User-defined helpers which are out of this example scope
// //////////////////////////////////////////////////////////////
// Loads file to a vector of floats.
vector<float> load_data(const string&) { return{ 0 }; }
// Allocates memory and loads data from file.
// Memory layout is taken from file.
memory load_mem(const engine& eng, const string&) {
//return a dummy value
return memory::allocate(eng, layout{ data_types::f32, format::bfyx, { 1, 1, 1, 1 } });
}
// Load image, resize to [x,y] and store in a vector of floats
// in the order "bfyx".
vector<float> load_image_bfyx(const string&, int, int) { return{ 0 }; }
// //////////////////////////////////////////////////////////////
int main()
{
// Use data type: float
auto data_type = type_to_data_type<float>::value;
// Network input layout
layout in_layout(
data_type, // stored data type
format::bfyx, // data stored in order batch-channel-Y-X, where X coordinate changes first.
{1, input_channels, input_size, input_size} // batch: 1, channels: 1, Y: 28, X: 28
);
// Create memory for conv1 weights
layout conv1_weights_layout(data_type, format::bfyx,{ conv1_out_channels, input_channels, conv_krnl_size, conv_krnl_size });
vector<float> my_own_buffer = load_data("conv1_weights.bin");
// The conv1_weights_mem is attached to my_own_buffer, so my_own_buffer must not be changed or destroyed until network execution completes.
auto conv1_weights_mem = memory::attach(conv1_weights_layout, my_own_buffer.data(), my_own_buffer.size());
// Create default engine
cldnn::engine engine;
// Create memory for conv1 bias
layout conv1_bias_layout(data_type, format::bfyx, spatial(20));
// Memory allocation requires engine
auto conv1_bias_mem = memory::allocate(engine, conv1_bias_layout);
// The memory is allocated by library, so we do not need to care about buffer lifetime.
copy_to_memory(conv1_bias_mem, load_data("conv1_bias.bin"));
// Get new topology
cldnn::topology topology = create_topology(in_layout, conv1_weights_mem, conv1_bias_mem);
// Define network data not defined in create_topology()
topology.add(
cldnn::data("fc1_weights", load_mem(engine, "fc1_weights.data")),
cldnn::data("fc1_bias", load_mem(engine, "fc1_bias.data")),
cldnn::data("fc2_weights", load_mem(engine, "fc2_weights.data")),
cldnn::data("fc2_bias", load_mem(engine, "fc2_bias.data"))
);
// Build the network. Allow implicit data optimizations.
// The "softmax" primitive is not used as an input for other primitives,
// so we do not need to explicitly select it in build_options::outputs()
cldnn::network network(engine, topology, { build_option::optimize_data(true) });
// Set network data which was not known at topology creation.
network.set_input_data("conv2_weights", load_mem(engine, "conv2_weights.data"));
network.set_input_data("conv2_bias", load_mem(engine, "conv2_bias.data"));
// Allocate memory for input image.
auto input_memory = memory::allocate(engine, in_layout);
// Run network 2 times with different images.
for (auto img_name : { "one.jpg", "two.jpg" })
{
// Reuse image memory.
copy_to_memory(input_memory, load_image_bfyx(img_name, in_layout.size.spatial[0], in_layout.size.spatial[1]));
auto result = recognize_image(network, input_memory);
cout << img_name << " recognized as " << result << endl;
}
return 0;
}
My GPU is an HD 630, but the list in
https://github.com/intel/clDNN/blob/master/src/caps/public/gpu_devices.inc
shows it as an HD 620:
GEN_DEVICE(HD620, 0x3E92, HD6XX, GEN9, GT2)
Also add 0x3E9B (as also reported in comment #47).
I think it would be better to review the complete list.
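For reference, the change suggested above would follow the existing entry pattern in gpu_devices.inc; a sketch (the HD630 label and the HD6XX/GEN9/GT2 field values are assumptions to verify against the actual file):

```cpp
// src/caps/public/gpu_devices.inc (sketch, not a verified entry)
GEN_DEVICE(HD630, 0x3E92, HD6XX, GEN9, GT2)  // relabel 0x3E92 as HD630
GEN_DEVICE(HD630, 0x3E9B, HD6XX, GEN9, GT2)  // add the missing 0x3E9B id
```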
Cheers,
Nikos
Hi guys,
Thank you for the great work bringing DNNs to Intel GPUs. I applaud this move to bring OpenCL into more DNN applications.
I'm looking for a Deep Q-Network (DQN and Double DQN) example where the user could connect to the Atari Learning Environment (ALE) in clDNN, but couldn't find anything similar. If one is not available, would you consider adding such an example? ALE is a critical application in my DNN research, and I would like to run DNNs on Intel GPUs.
Greetings Developers,
Are there plans to optimize for Xeon Phi coprocessors?
Is it possible to execute and test using MIC hardware?
Thanks,
Coast
Hi, clDNN developers, is there any in-progress work or a plan for ONNXIFI support?
I want to build the example code by linking against libclDNN64.so. How do I build it?
Is this planned to be supported? Alternatively, what work would need to be done to support it? We're looking at the i5-5250u because the HD Graphics 6000 seems to have good GFLOPS at a good price.
The variable img_name is not used correctly; "one.jpg" is always passed instead. The same problem exists in https://github.com/01org/clDNN/blob/6f5c9f231ae720c106670a153ab60469d5a6ff2f/tutorial/example_cldnn.cpp#L213.
The following link on this project's home page is broken:
https://01org.github.io/clDNN/index.html
Hi,
I'm trying to build the project on ClearLinux OS here are my environment details:
CMake version: 3.13.3
GCC version: gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502
Errors:
[ 53%] Built target api_test_builds
Scanning dependencies of target clDNN_shlib
[ 53%] Building CXX object src/CMakeFiles/clDNN_shlib.dir/graph_optimizer/add_required_reorders.cpp.o
In file included from /home/daniel/cldnn/src/include/layout_optimizer.h:31,
from /home/daniel/cldnn/src/include/pass_manager.h:21,
from /home/daniel/cldnn/src/graph_optimizer/add_required_reorders.cpp:21:
/home/daniel/cldnn/src/include/generic_layer.hpp: In constructor ‘cldnn::generic_layer::generic_layer(const dto*)’:
/home/daniel/cldnn/src/include/generic_layer.hpp:64:111: error: type qualifiers ignored on cast result type [-Werror=ignored-qualifiers]
, generic_params(*static_cast<const kernel_selector::generic_kernel_params* const>(dto->generic_params))
^
cc1plus: all warnings being treated as errors
make[3]: *** [src/CMakeFiles/clDNN_shlib.dir/build.make:63: src/CMakeFiles/clDNN_shlib.dir/graph_optimizer/add_required_reorders.cpp.o] Error 1
make[2]: *** [CMakeFiles/Makefile2:92: src/CMakeFiles/clDNN_shlib.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:215: tests/CMakeFiles/tests.dir/rule] Error 2
make: *** [Makefile:190: tests] Error 2
Any help would be deeply appreciated.
Is it possible to compile the distributed OpenCL kernels into a single bitstream file for an FPGA? Are there any plans to extend this repo to Intel FPGAs?
The formula for finding the output blockWidth in ConvolutionKernel_bfyx_os_iyx_osv16.cpp is shown below.
if (cp.stride.x == 1 && cp.stride.y == 1)
{
    if (cp.filterSize.x == 1 && cp.filterSize.y == 1)
    {
        option.blockWidth = 16;
        option.blockHeight = 1;
        option.prefetch = 4;
    }
    //if less than 16 values is required to compute one single row of output
    //then each WI shall compute one single row to maximize reuse within SIMD subgroup (this gives very nice performance results)
    else if (params.output.X().v + (cp.filterSize.x - 1)*cp.dilation.x < sub_group_size)
    {
        option.blockWidth = params.output.X().v;
        option.blockHeight = 1;
        option.prefetch = 4;
    }
    else if (cp.filterSize.x < 5 && cp.filterSize.y < 5)
    {
        option.blockWidth = sub_group_size - cp.filterSize.x + 1;
        option.blockHeight = 2;
        option.prefetch = 4;
    }
    else
    {
        option.blockWidth = 4;
        option.blockHeight = 3;
        option.prefetch = 4;
    }
}
else if (cp.stride.x == 2 && cp.stride.y == 2)
{
    option.blockWidth = 5;
    option.blockHeight = 4;
    option.prefetch = 4;
}
else
{
    option.blockWidth = 4;
    option.blockHeight = 3;
    option.prefetch = 5;
    //run_info.effiency = FORCE_PRIORITY_7; // GEMM is better
}
I wonder why the output blockWidth is 4 when the stride is greater than 2. How can I calculate the output width?
Hello,
my environment of hardware:
[ OK ] Processor name: Intel(R) Xeon(R) CPU E3-1585 v5 @ 3.50GHz
[ INFO ] Intel Processor
[ INFO ] Processor brand: Xeon
[ INFO ] Processor arch: Skylake
OS readiness checks:
[ INFO ] GPU PCI id : 193A
[ INFO ] GPU description: SKL SRV GT4e
[ OK ] GPU visible to OS
[ INFO ] no nomodeset in GRUB cmdline (good)
[ INFO ] Linux distro : Ubuntu 16.04
[ INFO ] Linux kernel : 4.13.0-32-generic
[ INFO ] glibc version : 2.23
[ INFO ] Linux distro suitable for Generic install
[ INFO ] gcc version : 20160609 (>=4.8.2 suggested)
Media Server Studio Install:
[ OK ] user in video group
[ ERROR ] libva.so.1 not found. Check LD_LIBRARY_PATH contains '/usr/lib64;/usr/local/lib'
[ ERROR ] libva not loading Intel iHD
[ ERROR ] vainfo not reporting codec entry points
[ INFO ] i915 driver in use by Intel video adapter
[ ERROR ] no libva include files. Are Intel components installed?
Component Smoke Tests:
[ ERROR ] no Media SDK include files. Are Intel components installed?
[ OK ] OpenCL check:platform:Intel(R) OpenCL GPU OK CPU OK
platform:Experimental OpenCL 2.1 CPU Only Platform GPU OK CPU OK
When I execute the tests and tutorial, I get the following error:
terminate called after throwing an instance of 'cldnn::error'
what(): failed to create engine: Device lookup failed - unsupported device id: 0x193A. Note: HD5xx+
devices are supported
Aborted (core dumped)
tests/CMakeFiles/tests.dir/build.make:904: recipe for target 'out/Linux64/Release/tests64' failed
make[2]: *** [out/Linux64/Release/tests64] Error 134
make[2]: *** Deleting file 'out/Linux64/Release/tests64'
CMakeFiles/Makefile2:197: recipe for target 'tests/CMakeFiles/tests.dir/all' failed
make[1]: *** [tests/CMakeFiles/tests.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2
Thank you very much
If I have googlenet.prototxt, how can I convert to clDNN?
Thanks a lot.
I am running OpenVINO on an Intel i5-9600K with UHD Graphics 630,
but I get "[ ERROR ] failed to create engine: Device lookup failed - unsupported device id: 0x3E98. Note: HD5xx+ devices are supported".
Is this error related to the driver version?
I would like to use OpenVINO on a Core i7-8750H.
I added the device ID (0x3E9B) to gpu_devices.inc.
clDNN's TESTS passed.
However, even after replacing OpenVINO's cldnn64.dll, the program cannot be executed.
(Reference URL)
https://ark.intel.com/ja/products/134906/Intel-Core-i7-8750H-Processor-9M-Cache-up-to-4-10-GHz-
Hello,
I ran Chapter 5 of the tutorial and got each kernel's timing, but the timings do not seem correct:
fc:
submission: 0 nanoseconds
starting: 0 nanoseconds
executing: 0 nanoseconds
fc_bias:
submission: 0 nanoseconds
starting: 0 nanoseconds
executing: 0 nanoseconds
fc_weights:
submission: 0 nanoseconds
starting: 0 nanoseconds
executing: 0 nanoseconds
input:
submission: 0 nanoseconds
starting: 0 nanoseconds
executing: 0 nanoseconds
relu:
submission: 0 nanoseconds
starting: 0 nanoseconds
executing: 0 nanoseconds
softmax:
submission: 0 nanoseconds
starting: 0 nanoseconds
executing: 0 nanoseconds
My environment is Ubuntu 14.04 with Beignet driver.
The device information is as the following.
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.2 beignet 1.3 (git-8bd8c3a)
Platform Name: Intel Gen OCL Driver
Platform Vendor: Intel
Platform Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 32902
Max compute units: 72
Max work items dimensions: 3
Max work items[0]: 512
Max work items[1]: 512
Max work items[2]: 512
Max work group size: 512
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 4
Preferred vector width double: 0
Native vector width char: 8
Native vector width short: 8
Native vector width int: 4
Native vector width long: 2
Native vector width float: 4
Native vector width double: 2
Max clock frequency: 1000Mhz
Address bits: 32
Max memory allocation: 3221225472
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 8192
Max image 3D height: 8192
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: No
Round to +ve and infinity: No
IEEE754-2008 fused multiply-add: No
Cache type: Read/Write
Cache line size: 64
Cache size: 8192
Global memory size: 4294967296
Constant buffer size: 134217728
Max number of constant args: 8
Local memory type: Local
Local memory size: 65536
Error correction support: 0
Unified memory for Host and Device: 1
Profiling timer resolution: 80
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: Yes
Queue properties:
Out-of-Order: No
Profiling : Yes
Platform ID: 0x7fc8d642ebe0
Name: Intel(R) HD Graphics Skylake Server GT4
Vendor: Intel
Device OpenCL C version: OpenCL C 1.2 beignet 1.3 (git-8bd8c3a)
Driver version: 1.3
Profile: FULL_PROFILE
Version: OpenCL 1.2 beignet 1.3 (git-8bd8c3a)
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_fp16
Could you identify the issue and give me some suggestions?
Best Regards
I am using the following variables to enable some tests:
-DCLDNN__RUN_TESTS:BOOL=ON -DCLDNN__INCLUDE_TESTS:BOOL=ON
About 125 tests pass, but there is a warning that more than 1000 tests have been disabled. What is the reason?
The build failure log is below. Solved by adding #include <cmath> in src/gpu/kernel.h.
[patch]
diff --git a/src/gpu/kernel.h b/src/gpu/kernel.h
index 5a89e4e..b6ce0a5 100644
--- a/src/gpu/kernel.h
+++ b/src/gpu/kernel.h
@@ -25,6 +25,7 @@
#include
#include
+#include <cmath>
namespace neural { namespace gpu {
[log]
[ 1%] Building CXX object src/CMakeFiles/clDNN_shlib.dir/network.cpp.o
In file included from /home/cv/cldnn/src/network.cpp:29:0:
/home/cv/cldnn/src/gpu/kernel.h: In function ‘std::string neural::gpu::to_code_string(T) [with T = float; std::string = std::basic_string]’:
/home/cv/cldnn/src/gpu/kernel.h:69:9: error: ‘isinf’ is not a member of ‘std’
if (std::isinf(val))
^
/home/cv/cldnn/src/gpu/kernel.h:70:61: error: ‘signbit’ is not a member of ‘std’
std::snprintf(buffer, sizeof(buffer), "%sINFINITY", std::signbit(val) ? "-" : "");
^
/home/cv/cldnn/src/gpu/kernel.h: In function ‘std::string neural::gpu::to_code_string(T) [with T = double; std::string = std::basic_string]’:
/home/cv/cldnn/src/gpu/kernel.h:80:9: error: ‘isinf’ is not a member of ‘std’
if (std::isinf(val))
^
/home/cv/cldnn/src/gpu/kernel.h:81:61: error: ‘signbit’ is not a member of ‘std’
std::snprintf(buffer, sizeof(buffer), "%sINFINITY", std::signbit(val) ? "-" : "");
^
make[2]: *** [src/CMakeFiles/clDNN_shlib.dir/network.cpp.o] Error 1
make[1]: *** [src/CMakeFiles/clDNN_shlib.dir/all] Error 2
make: *** [all] Error 2
It seems like the "System Requirements" section on the main GitHub page is a bit misleading. It says that clDNN supports Intel® HD Graphics and Intel® Iris® Graphics and is optimized for Skylake and Apollo Lake, when in fact it does not support anything older than HD 5xx.
The following exception is thrown when attempting to create an engine on an i7-4712HQ:
Device lookup failed - unsupported device id: 0x416. Note: HD5xx+ devices are supported
Are there any plans to support HD4 and older ?
After running this command to build:
cmake -E make_directory build && cd build && cmake -DCMAKE_BUILD_TYPE=Release .. && make
the following errors occur. Does anyone have any idea how to solve this? The full log is at the bottom.
/home/up2/cldnn/kernel_selector/common/tensor_type.cpp: In member function ‘KernelSelector::Tensor::DataTensor KernelSelector::Tensor::DataTensor::FlattenFeatureAndSpatials() const’:
/home/up2/cldnn/kernel_selector/common/tensor_type.cpp:128:30: error: this statement may fall through [-Werror=implicit-fallthrough=]
targetLayout = Tensor::fb;
~~~~~~~~~~~~~^~~~~~~~
/home/up2/cldnn/kernel_selector/common/tensor_type.cpp:129:13: note: here
case Tensor::bfyx:
^~~~
/home/up2/cldnn/kernel_selector/common/tensor_type.cpp:137:30: error: this statement may fall through [-Werror=implicit-fallthrough=]
targetLayout = Tensor::fb;
~~~~~~~~~~~~~^~~~~~~~
/home/up2/cldnn/kernel_selector/common/tensor_type.cpp:138:13: note: here
case Tensor::byxf:
^~~~
cc1plus: all warnings being treated as errors
make[2]: *** [kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/common/tensor_type.cpp.o] Error 1
make[1]: *** [kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/all] Error 2
make: *** [all] Error 2
----------------------------------------------------Log-----------------------------------------------------------
cmake -E make_directory build && cd build && cmake -DCMAKE_BUILD_TYPE=Release .. && make
-- The C compiler identification is GNU 7.2.1
-- The CXX compiler identification is GNU 7.2.1
-- Check for working C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc
-- Check for working C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/rh/devtoolset-7/root/usr/bin/c++
-- Check for working CXX compiler: /opt/rh/devtoolset-7/root/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
[clDNN] CLDNN__ARCHITECTURE_TARGET: Target architecture is not specified. Trying to deduce it from context.
-- Found PythonInterp: /usr/bin/python2.7 (found suitable version "2.7.5", minimum required is "2.7")
-- Boost version: 1.64.0
-- Found the following Boost libraries:
-- system
-- date_time
-- program_options
-- filesystem
-- [clDNN] ======================== clDNN Project =======================
-- [clDNN] Version: 1.3.8.0
-- [clDNN]
-- [clDNN] Build type: Release (for single-configuration generators)
-- [clDNN] Av. build types: Debug;Release (for multi-configuration generators)
-- [clDNN]
-- [clDNN] Output bin directory:
-- [clDNN] - "/home/up2/cldnn/build/out/Linux64/Release"
-- [clDNN] Output lib directory:
-- [clDNN] - "/home/up2/cldnn/build/out/Linux64/Release"
-- [clDNN] Architecture:
-- [clDNN] - target: Linux64 (detected: Linux64)
-- [clDNN]
-- [clDNN]
-- [clDNN] Advanced:
-- [clDNN] - ICD version used to build: 6.3
-- [clDNN] - boost ver. used to build: 1.64.0
-- [clDNN]
-- [clDNN] - Include/Build cldnn core: ON
-- [clDNN] - Include/Build kernel selector: ON
-- [clDNN] - Include/Build tests: ON
-- [clDNN] - Include/Build tutorial: ON
-- [clDNN]
-- [clDNN] - Run tests: OFF
-- [clDNN]
-- [clDNN] - Use static C++ Runtime: OFF
-- [clDNN] - Allow unsafe size opts: ON
-- [clDNN] - CMake debug trace: OFF
-- [clDNN]
-- [clDNN]
-- [clDNN] ICD:
-- [clDNN] - Root: /home/up2/cldnn/common/intel_ocl_icd/6.3
-- [clDNN] + Headers: /home/up2/cldnn/common/intel_ocl_icd/6.3/linux/include
-- [clDNN] + Static libs: /home/up2/cldnn/common/intel_ocl_icd/6.3/linux/Release/lib/x64
-- [clDNN] + Shared libs: /home/up2/cldnn/common/intel_ocl_icd/6.3/linux/Release/bin/x64
-- [clDNN] + Libs to link: /home/up2/cldnn/common/intel_ocl_icd/6.3/linux/Release/bin/x64
-- [clDNN]
-- [clDNN] boost libraries:
-- [clDNN] - Root: /home/up2/cldnn/common/boost/1.64.0
-- [clDNN] + Headers: /home/up2/cldnn/common/boost/1.64.0/include/boost-1_64
-- [clDNN] + Libs to link: /home/up2/cldnn/common/boost/1.64.0/linux/x64/lib
-- [clDNN] =============================================================================
-- Performing Test CLDNN__COMPILER_SUPPORTS_CXX14
-- Performing Test CLDNN__COMPILER_SUPPORTS_CXX14 - Success
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- [clDNN] Selected capabilities: public
-- Configuring done
-- Generating done
-- Build files have been written to: /home/up2/cldnn/build
[ 0%] Generating ks_primitive_db.inc ...
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/activation_opt.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/activation_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/activation_tutorial.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/concatenation_gpu_depth_bfyx_no_pitch.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/concatenation_gpu_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_1x1.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_1x1_hgemm_buf_16x1.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_3x3_dw_opt.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_direct_10_12_16.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_direct_8_8_16.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_gemm_like_fp16.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_gemm_like_fp32.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_os_iyx_osv16.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_bfyx_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_winograd_2x3_s1.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_winograd_2x3_s1_fused.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_yxfb_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_yxfb_yxio_b16_fp16.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_yxfb_yxio_b16_fp32.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_yxfb_yxio_b1_block_fp32.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_yxfb_yxio_b1_block_multiple_x_fp32.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_gpu_yxfb_yxio_b8_fp32.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/convolution_tutorial.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/deconvolution_gpu_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/eltwise_simple_vload8.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_bf_io_gemm.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_bf_io_input_spatial.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_bf_io_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_bfyx_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_bs_f_bsv16_af8_vload.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_bs_f_bsv16_b1.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_bs_f_bsv8_af8_vload.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_fb_io_b8_f8.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_fb_io_b8_f8_vload.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_fb_io_block_fp16.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_fb_io_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_fb_oi_b8_fp32_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_fb_oi_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_image_tutorial.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/fully_connected_gpu_yxfb_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/generic_eltwise_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/lrn_gpu_across_channel_multiple_features.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/lrn_gpu_across_channel_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/lrn_gpu_across_channel_yxfb_b8_opt.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/lrn_gpu_within_channel.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/lrn_gpu_within_channel_opt.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/lrn_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/normalize_gpu_across_spatial_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/normalize_gpu_within_spatial_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/permute_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/pooling_gpu_average_opt.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/pooling_gpu_bfyx_block_opt.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/pooling_gpu_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/region_yolo_gpu_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/reorder_data.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/reorder_data_fast_b1.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/reorder_from_winograd_2x3_s1.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/reorder_to_winograd_2x3_s1.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/reorder_weights.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/reorder_weights_image_2d_c4_fyx_b.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/reorder_weights_winograd_2x3_s1.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/reorg_yolo_gpu_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/reshape_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/roi_pooling_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/softmax_gpu_bf.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/softmax_gpu_fb.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/softmax_gpu_items_class_optimized.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/softmax_gpu_ref.cl
processing /home/up2/cldnn/kernel_selector/core/cl_kernels/upsampling_ref.cl
[ 1%] Updating file if the file changed (ks_primitive_db.inc) ...
Scanning dependencies of target cldnn_kernel_selector
[ 2%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/auto_tuner.cpp.o
[ 2%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/auto_tuner_offline.cpp.o
[ 2%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/kernel_base.cpp.o
[ 3%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/kernel_selector.cpp.o
[ 3%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/kernel_selector_common.cpp.o
[ 4%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/kernel_selector_params.cpp.o
[ 4%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/common/tensor_type.cpp.o
/home/up2/cldnn/kernel_selector/common/tensor_type.cpp: In member function ‘KernelSelector::Tensor::DataTensor KernelSelector::Tensor::DataTensor::FlattenFeatureAndSpatials() const’:
/home/up2/cldnn/kernel_selector/common/tensor_type.cpp:128:30: error: this statement may fall through [-Werror=implicit-fallthrough=]
targetLayout = Tensor::fb;
~~~~~~~~~~~~~^~~~~~~~
/home/up2/cldnn/kernel_selector/common/tensor_type.cpp:129:13: note: here
case Tensor::bfyx:
^~~~
/home/up2/cldnn/kernel_selector/common/tensor_type.cpp:137:30: error: this statement may fall through [-Werror=implicit-fallthrough=]
targetLayout = Tensor::fb;
~~~~~~~~~~~~~^~~~~~~~
/home/up2/cldnn/kernel_selector/common/tensor_type.cpp:138:13: note: here
case Tensor::byxf:
^~~~
cc1plus: all warnings being treated as errors
make[2]: *** [kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/common/tensor_type.cpp.o] Error 1
make[1]: *** [kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/all] Error 2
make: *** [all] Error 2
Do you have perf data for classical models on clDNN?
I ran the inception_v3 model with the Intel Inference Engine (clDNN plugin); it takes 600+ ms per inference, which is no better than TensorFlow inference on CPU. Here is my data:
InferenceEngine:
API version ............ 1.0
Build .................. 5852
[ INFO ] Parsing input parameters
[ INFO ] No extensions provided
[ INFO ] Loading plugin
API version ............ 0.1
Build .................. prod-02709
Description ....... clDNNPlugin
[ INFO ] Loading network files
[ INFO ] Preparing input blobs
[ INFO ] Batch size is 1
[ INFO ] Preparing output blobs
[ INFO ] Loading model to the plugin
[ INFO ] Start inference (50 iterations)
Average running time of one iteration: 624.855 ms
Perfomance counts:
InceptionV3/InceptionV3/Conv2d_1a_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 12069 cpu: 598 execType: GPU
InceptionV3/InceptionV3/Conv2d_1a_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Conv2d_2a_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 15137 cpu: 526 execType: GPU
InceptionV3/InceptionV3/Conv2d_2a_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Conv2d_2b_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 29440 cpu: 455 execType: GPU
InceptionV3/InceptionV3/Conv2d_2b_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Conv2d_3b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 7561 cpu: 300 execType: GPU
InceptionV3/InceptionV3/Conv2d_3b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Conv2d_4a_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 33730 cpu: 230 execType: GPU
InceptionV3/InceptionV3/Conv2d_4a_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/MaxPool_3a_3x3/MaxPool:EXECUTED layerType: Pooling realTime: 2311 cpu: 391 execType: GPU
InceptionV3/InceptionV3/MaxPool_5a_3x3/MaxPool:EXECUTED layerType: Pooling realTime: 1671 cpu: 179 execType: GPU
InceptionV3/InceptionV3/Mixed_5b/Branch_0/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2375 cpu: 219 execType: GPU
InceptionV3/InceptionV3/Mixed_5b/Branch_0/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5b/Branch_1/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 1834 cpu: 392 execType: GPU
InceptionV3/InceptionV3/Mixed_5b/Branch_1/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5b/Branch_1/Conv2d_0b_5x5/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5907 cpu: 306 execType: GPU
InceptionV3/InceptionV3/Mixed_5b/Branch_1/Conv2d_0b_5x5/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5b/Branch_2/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2413 cpu: 655 execType: GPU
InceptionV3/InceptionV3/Mixed_5b/Branch_2/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5b/Branch_2/Conv2d_0b_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3698 cpu: 568 execType: GPU
InceptionV3/InceptionV3/Mixed_5b/Branch_2/Conv2d_0b_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5b/Branch_2/Conv2d_0c_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4711 cpu: 480 execType: GPU
InceptionV3/InceptionV3/Mixed_5b/Branch_2/Conv2d_0c_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5b/Branch_3/AvgPool_0a_3x3/AvgPool:EXECUTED layerType: Pooling realTime: 2182 cpu: 819 execType: GPU
InceptionV3/InceptionV3/Mixed_5b/Branch_3/Conv2d_0b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 1211 cpu: 779 execType: GPU
InceptionV3/InceptionV3/Mixed_5b/Branch_3/Conv2d_0b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5b/concat:EXECUTED layerType: Concat realTime: 40 cpu: 570 execType: GPU
InceptionV3/InceptionV3/Mixed_5c/Branch_0/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2914 cpu: 441 execType: GPU
InceptionV3/InceptionV3/Mixed_5c/Branch_0/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5c/Branch_1/Conv2d_0b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2186 cpu: 623 execType: GPU
InceptionV3/InceptionV3/Mixed_5c/Branch_1/Conv2d_0b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5c/Branch_1/Conv_1_0c_5x5/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5912 cpu: 531 execType: GPU
InceptionV3/InceptionV3/Mixed_5c/Branch_1/Conv_1_0c_5x5/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5c/Branch_2/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2892 cpu: 294 execType: GPU
InceptionV3/InceptionV3/Mixed_5c/Branch_2/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5c/Branch_2/Conv2d_0b_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3685 cpu: 208 execType: GPU
InceptionV3/InceptionV3/Mixed_5c/Branch_2/Conv2d_0b_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5c/Branch_2/Conv2d_0c_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4769 cpu: 697 execType: GPU
InceptionV3/InceptionV3/Mixed_5c/Branch_2/Conv2d_0c_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5c/Branch_3/AvgPool_0a_3x3/AvgPool:EXECUTED layerType: Pooling realTime: 2813 cpu: 488 execType: GPU
InceptionV3/InceptionV3/Mixed_5c/Branch_3/Conv2d_0b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2868 cpu: 382 execType: GPU
InceptionV3/InceptionV3/Mixed_5c/Branch_3/Conv2d_0b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5c/concat:EXECUTED layerType: Concat realTime: 71 cpu: 167 execType: GPU
InceptionV3/InceptionV3/Mixed_5d/Branch_0/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3122 cpu: 181 execType: GPU
InceptionV3/InceptionV3/Mixed_5d/Branch_0/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5d/Branch_1/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2345 cpu: 358 execType: GPU
InceptionV3/InceptionV3/Mixed_5d/Branch_1/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5d/Branch_1/Conv2d_0b_5x5/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5956 cpu: 275 execType: GPU
InceptionV3/InceptionV3/Mixed_5d/Branch_1/Conv2d_0b_5x5/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5d/Branch_2/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3156 cpu: 657 execType: GPU
InceptionV3/InceptionV3/Mixed_5d/Branch_2/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5d/Branch_2/Conv2d_0b_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3694 cpu: 534 execType: GPU
InceptionV3/InceptionV3/Mixed_5d/Branch_2/Conv2d_0b_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5d/Branch_2/Conv2d_0c_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4721 cpu: 453 execType: GPU
InceptionV3/InceptionV3/Mixed_5d/Branch_2/Conv2d_0c_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5d/Branch_3/AvgPool_0a_3x3/AvgPool:EXECUTED layerType: Pooling realTime: 3132 cpu: 812 execType: GPU
InceptionV3/InceptionV3/Mixed_5d/Branch_3/Conv2d_0b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3102 cpu: 743 execType: GPU
InceptionV3/InceptionV3/Mixed_5d/Branch_3/Conv2d_0b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_5d/concat:EXECUTED layerType: Concat realTime: 71 cpu: 340 execType: GPU
InceptionV3/InceptionV3/Mixed_6a/Branch_0/Conv2d_1a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 24400 cpu: 419 execType: GPU
InceptionV3/InceptionV3/Mixed_6a/Branch_0/Conv2d_1a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6a/Branch_1/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3135 cpu: 220 execType: GPU
InceptionV3/InceptionV3/Mixed_6a/Branch_1/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6a/Branch_1/Conv2d_0b_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3665 cpu: 163 execType: GPU
InceptionV3/InceptionV3/Mixed_6a/Branch_1/Conv2d_0b_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6a/Branch_1/Conv2d_1a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2318 cpu: 106 execType: GPU
InceptionV3/InceptionV3/Mixed_6a/Branch_1/Conv2d_1a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6a/Branch_2/MaxPool_1a_3x3/MaxPool:EXECUTED layerType: Pooling realTime: 745 cpu: 289 execType: GPU
InceptionV3/InceptionV3/Mixed_6a/concat:EXECUTED layerType: Concat realTime: 116 cpu: 325 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_0/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6320 cpu: 174 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_0/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6b/Branch_1/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4141 cpu: 338 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_1/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6b/Branch_1/Conv2d_0b_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2883 cpu: 281 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_1/Conv2d_0b_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6b/Branch_1/Conv2d_0c_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 7669 cpu: 232 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_1/Conv2d_0c_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6b/Branch_2/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4142 cpu: 164 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_2/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6b/Branch_2/Conv2d_0b_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5154 cpu: 105 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_2/Conv2d_0b_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6b/Branch_2/Conv2d_0c_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3091 cpu: 526 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_2/Conv2d_0c_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6b/Branch_2/Conv2d_0d_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5169 cpu: 482 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_2/Conv2d_0d_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6b/Branch_2/Conv2d_0e_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4263 cpu: 429 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_2/Conv2d_0e_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6b/Branch_3/AvgPool_0a_3x3/AvgPool:EXECUTED layerType: Pooling realTime: 2095 cpu: 279 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_3/Conv2d_0b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6312 cpu: 223 execType: GPU
InceptionV3/InceptionV3/Mixed_6b/Branch_3/Conv2d_0b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6b/concat:EXECUTED layerType: Concat realTime: 54 cpu: 457 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_0/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6532 cpu: 226 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_0/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6c/Branch_1/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5200 cpu: 384 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_1/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6c/Branch_1/Conv2d_0b_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4348 cpu: 329 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_1/Conv2d_0b_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6c/Branch_1/Conv2d_0c_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 9439 cpu: 277 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_1/Conv2d_0c_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6c/Branch_2/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5565 cpu: 223 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_2/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6c/Branch_2/Conv2d_0b_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 8005 cpu: 170 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_2/Conv2d_0b_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6c/Branch_2/Conv2d_0c_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4347 cpu: 115 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_2/Conv2d_0c_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6c/Branch_2/Conv2d_0d_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 8015 cpu: 477 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_2/Conv2d_0d_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6c/Branch_2/Conv2d_0e_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5126 cpu: 441 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_2/Conv2d_0e_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6c/Branch_3/AvgPool_0a_3x3/AvgPool:EXECUTED layerType: Pooling realTime: 2143 cpu: 395 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_3/Conv2d_0b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6431 cpu: 281 execType: GPU
InceptionV3/InceptionV3/Mixed_6c/Branch_3/Conv2d_0b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6c/concat:EXECUTED layerType: Concat realTime: 74 cpu: 428 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_0/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6293 cpu: 213 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_0/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6d/Branch_1/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5170 cpu: 377 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_1/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6d/Branch_1/Conv2d_0b_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4362 cpu: 326 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_1/Conv2d_0b_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6d/Branch_1/Conv2d_0c_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 9445 cpu: 266 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_1/Conv2d_0c_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6d/Branch_2/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5196 cpu: 249 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_2/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6d/Branch_2/Conv2d_0b_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 7879 cpu: 198 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_2/Conv2d_0b_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6d/Branch_2/Conv2d_0c_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4372 cpu: 197 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_2/Conv2d_0c_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6d/Branch_2/Conv2d_0d_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 7889 cpu: 151 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_2/Conv2d_0d_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6d/Branch_2/Conv2d_0e_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5217 cpu: 414 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_2/Conv2d_0e_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6d/Branch_3/AvgPool_0a_3x3/AvgPool:EXECUTED layerType: Pooling realTime: 2163 cpu: 365 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_3/Conv2d_0b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6348 cpu: 303 execType: GPU
InceptionV3/InceptionV3/Mixed_6d/Branch_3/Conv2d_0b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6d/concat:EXECUTED layerType: Concat realTime: 72 cpu: 476 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_0/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6243 cpu: 249 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_0/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6e/Branch_1/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6411 cpu: 432 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_1/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6e/Branch_1/Conv2d_0b_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6050 cpu: 356 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_1/Conv2d_0b_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6e/Branch_1/Conv2d_0c_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 11285 cpu: 301 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_1/Conv2d_0c_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6e/Branch_2/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6293 cpu: 324 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_2/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6e/Branch_2/Conv2d_0b_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 11011 cpu: 274 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_2/Conv2d_0b_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6e/Branch_2/Conv2d_0c_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6066 cpu: 210 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_2/Conv2d_0c_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6e/Branch_2/Conv2d_0d_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 11170 cpu: 159 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_2/Conv2d_0d_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6e/Branch_2/Conv2d_0e_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6037 cpu: 106 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_2/Conv2d_0e_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6e/Branch_3/AvgPool_0a_3x3/AvgPool:EXECUTED layerType: Pooling realTime: 2175 cpu: 440 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_3/Conv2d_0b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6242 cpu: 380 execType: GPU
InceptionV3/InceptionV3/Mixed_6e/Branch_3/Conv2d_0b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_6e/concat:EXECUTED layerType: Concat realTime: 49 cpu: 98 execType: GPU
InceptionV3/InceptionV3/Mixed_7a/Branch_0/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6297 cpu: 204 execType: GPU
InceptionV3/InceptionV3/Mixed_7a/Branch_0/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7a/Branch_0/Conv2d_1a_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3030 cpu: 163 execType: GPU
InceptionV3/InceptionV3/Mixed_7a/Branch_0/Conv2d_1a_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7a/Branch_1/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6316 cpu: 474 execType: GPU
InceptionV3/InceptionV3/Mixed_7a/Branch_1/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7a/Branch_1/Conv2d_0b_1x7/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6024 cpu: 384 execType: GPU
InceptionV3/InceptionV3/Mixed_7a/Branch_1/Conv2d_0b_1x7/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7a/Branch_1/Conv2d_0c_7x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 11242 cpu: 323 execType: GPU
InceptionV3/InceptionV3/Mixed_7a/Branch_1/Conv2d_0c_7x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7a/Branch_1/Conv2d_1a_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2440 cpu: 245 execType: GPU
InceptionV3/InceptionV3/Mixed_7a/Branch_1/Conv2d_1a_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7a/Branch_2/MaxPool_1a_3x3/MaxPool:EXECUTED layerType: Pooling realTime: 580 cpu: 509 execType: GPU
InceptionV3/InceptionV3/Mixed_7a/concat:EXECUTED layerType: Concat realTime: 48 cpu: 419 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_0/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3312 cpu: 125 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_0/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7b/Branch_1/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4425 cpu: 315 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_1/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7b/Branch_1/Conv2d_0b_1x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2448 cpu: 233 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_1/Conv2d_0b_1x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7b/Branch_1/Conv2d_0b_3x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4386 cpu: 272 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_1/Conv2d_0b_3x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7b/Branch_1/concat:EXECUTED layerType: Concat realTime: 26 cpu: 171 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_2/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4623 cpu: 301 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_2/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7b/Branch_2/Conv2d_0b_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 8054 cpu: 247 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_2/Conv2d_0b_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7b/Branch_2/Conv2d_0c_1x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2454 cpu: 140 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_2/Conv2d_0c_1x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7b/Branch_2/Conv2d_0d_3x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4342 cpu: 191 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_2/Conv2d_0d_3x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7b/Branch_2/concat:EXECUTED layerType: Concat realTime: 28 cpu: 379 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_3/AvgPool_0a_3x3/AvgPool:EXECUTED layerType: Pooling realTime: 1040 cpu: 369 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_3/Conv2d_0b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2452 cpu: 351 execType: GPU
InceptionV3/InceptionV3/Mixed_7b/Branch_3/Conv2d_0b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7b/concat:EXECUTED layerType: Concat realTime: 18 cpu: 324 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_0/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 5353 cpu: 302 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_0/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7c/Branch_1/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 6703 cpu: 264 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_1/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7c/Branch_1/Conv2d_0b_1x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2454 cpu: 163 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_1/Conv2d_0b_1x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7c/Branch_1/Conv2d_0c_3x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4341 cpu: 205 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_1/Conv2d_0c_3x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7c/Branch_1/concat:EXECUTED layerType: Concat realTime: 27 cpu: 342 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_2/Conv2d_0a_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 7212 cpu: 158 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_2/Conv2d_0a_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7c/Branch_2/Conv2d_0b_3x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 7973 cpu: 112 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_2/Conv2d_0b_3x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7c/Branch_2/Conv2d_0c_1x3/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 2475 cpu: 372 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_2/Conv2d_0c_1x3/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7c/Branch_2/Conv2d_0d_3x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 4524 cpu: 399 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_2/Conv2d_0d_3x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7c/Branch_2/concat:EXECUTED layerType: Concat realTime: 26 cpu: 314 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_3/AvgPool_0a_3x3/AvgPool:EXECUTED layerType: Pooling realTime: 1547 cpu: 281 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_3/Conv2d_0b_1x1/BatchNorm/batchnorm/mul:EXECUTED layerType: Convolution realTime: 3882 cpu: 224 execType: GPU
InceptionV3/InceptionV3/Mixed_7c/Branch_3/Conv2d_0b_1x1/Relu:OPTIMIZED_OUT layerType: ReLU realTime: 0 cpu: 0 execType: None
InceptionV3/InceptionV3/Mixed_7c/concat:EXECUTED layerType: Concat realTime: 17 cpu: 187 execType: GPU
InceptionV3/Logits/AvgPool_1a_8x8/AvgPool:EXECUTED layerType: Pooling realTime: 565 cpu: 149 execType: GPU
InceptionV3/Logits/Conv2d_1c_1x1/convolution:EXECUTED layerType: Convolution realTime: 15174 cpu: 98 execType: GPU
InceptionV3/Logits/SpatialSqueeze:EXECUTED layerType: Reshape realTime: 15174 cpu: 98 execType: GPU
InceptionV3/Predictions/Reshape:EXECUTED layerType: Reshape realTime: 15174 cpu: 98 execType: GPU
InceptionV3/Predictions/Reshape_1:EXECUTED layerType: Reshape realTime: 32 cpu: 798 execType: GPU
InceptionV3/Predictions/Reshape_1_cldnn_output_postprocess:EXECUTED layerType: Reorder realTime: 6 cpu: 784 execType: GPU
InceptionV3/Predictions/Softmax:EXECUTED layerType: SoftMax realTime: 32 cpu: 798 execType: GPU
input_cldnn_input_preprocess: EXECUTED layerType: Reorder realTime: 1211 cpu: 656 execType: GPU
scale: NOT_RUN layerType: Power realTime: 0 cpu: 0 execType: None
Total time: 645521 microseconds
[ INFO ] Processing output blobs
Top 10 results:
Image ./grace_hopper_299.bmp
715 1.0000000 label #715
111 0.0000000 label #111
711 0.0000000 label #711
917 0.0000000 label #917
949 0.0000000 label #949
503 0.0000000 label #503
983 0.0000000 label #983
853 0.0000000 label #853
35 0.0000000 label #35
615 0.0000000 label #615
[ INFO ] Execution successfull
cmake .. -G "MSYS Makefiles" -DCMAKE_BUILD_TYPE=Release .. && make
DL@2030006696-SOH MINGW64 ~/cldnn/build
$ cmake .. -G "MSYS Makefiles" -DCMAKE_BUILD_TYPE=Release .. && make
-- The C compiler identification is GNU 8.2.1
-- The CXX compiler identification is GNU 8.2.1
-- Check for working C compiler: C:/msys64/mingw64/bin/gcc.exe
-- Check for working C compiler: C:/msys64/mingw64/bin/gcc.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: C:/msys64/mingw64/bin/g++.exe
-- Check for working CXX compiler: C:/msys64/mingw64/bin/g++.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
[clDNN] CLDNN__ARCHITECTURE_TARGET: Target architecture is not specified. Trying to deduce it from context.
-- Found PythonInterp: C:/msys64/usr/bin/python2.7.exe (found suitable version "2.7.15", minimum required is "2.7")
-- Boost version: 1.64.0
-- Found the following Boost libraries:
-- system
-- date_time
-- program_options
-- filesystem
-- [clDNN] ======================== clDNN Project =======================
-- [clDNN] Version: 1.4.22.0
-- [clDNN]
-- [clDNN] Build type: Release (for single-configuration generators)
-- [clDNN] Av. build types: Debug;Release (for multi-configuration generators)
-- [clDNN]
-- [clDNN] Output bin directory:
-- [clDNN] - "C:/msys64/home/DL/cldnn/build/out/Windows32/Release"
-- [clDNN] Output lib directory:
-- [clDNN] - "C:/msys64/home/DL/cldnn/build/out/Windows32/Release"
-- [clDNN] Architecture:
-- [clDNN] - target: Windows32 (detected: Windows32)
-- [clDNN]
-- [clDNN]
-- [clDNN] Advanced:
-- [clDNN] - ICD version used to build: 6.3
-- [clDNN] - boost ver. used to build: 1.64.0
-- [clDNN]
-- [clDNN] - Include/Build cldnn core: ON
-- [clDNN] - Include/Build kernel selector: ON
-- [clDNN] - Include/Build tests: ON
-- [clDNN] - Include/Build core internal tests: ON
-- [clDNN] - Include/Build tutorial: ON
-- [clDNN]
-- [clDNN] - Run tests: OFF
-- [clDNN] - Run core internal tests: OFF
-- [clDNN]
-- [clDNN] - Use static C++ Runtime: OFF
-- [clDNN] - Allow unsafe size opts: ON
-- [clDNN] - CMake debug trace: OFF
-- [clDNN]
-- [clDNN]
-- [clDNN] ICD:
-- [clDNN] - Root: C:/msys64/home/DL/cldnn/common/intel_ocl_icd/6.3
-- [clDNN] + Headers: C:/msys64/home/DL/cldnn/common/intel_ocl_icd/6.3/windows/include
-- [clDNN] + Static libs: C:/msys64/home/DL/cldnn/common/intel_ocl_icd/6.3/windows/Release/lib/x86
-- [clDNN] + Shared libs: C:/msys64/home/DL/cldnn/common/intel_ocl_icd/6.3/windows/Release/bin/x86
-- [clDNN] + Libs to link: C:/msys64/home/DL/cldnn/common/intel_ocl_icd/6.3/windows/Release/lib/x86
-- [clDNN]
-- [clDNN] boost libraries:
-- [clDNN] - Root: C:/msys64/home/DL/cldnn/common/boost/1.64.0
-- [clDNN] + Headers: C:/msys64/home/DL/cldnn/common/boost/1.64.0/include/boost-1_64
-- [clDNN] + Libs to link: C:/msys64/home/DL/cldnn/common/boost/1.64.0/windows/x86/lib
-- [clDNN] =============================================================================
-- Performing Test CLDNN__COMPILER_SUPPORTS_CXX14
-- Performing Test CLDNN__COMPILER_SUPPORTS_CXX14 - Success
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- [clDNN] Selected capabilities: public
-- Found OpenMP_C: -fopenmp
-- Found OpenMP_CXX: -fopenmp
-- Found OpenMP: TRUE
-- [clDNN] Selected capabilities: public
-- Configuring done
-- Generating done
-- Build files have been written to: C:/msys64/home/DL/cldnn/build
[ 0%] Generating ks_primitive_db.inc ...
processing C:/msys64/home/DL/cldnn/kernel_selector/core/cl_kernels/activation_opt.cl
processing C:/msys64/home/DL/cldnn/kernel_selector/core/cl_kernels/activation_ref.cl
processing C:/msys64/home/DL/cldnn/kernel_selector/core/cl_kernels/activation_tutorial.cl
processing C:/msys64/home/DL/cldnn/kernel_selector/core/cl_kernels/arg_max_min_axis.cl
processing C:/msys64/home/DL/cldnn/kernel_selector/core/cl_kernels/arg_max_min_gpu_ref.cl
..
..
..
..
..
[ 35%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/softmax/softmax_kernel_fb.cpp.obj
[ 36%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/softmax/softmax_kernel_items_class_optimized.cpp.obj
[ 36%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/softmax/softmax_kernel_ref.cpp.obj
[ 36%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/softmax/softmax_kernel_selector.cpp.obj
[ 36%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/softmax_loss_grad/softmax_loss_grad_kernel_base.cpp.obj
[ 36%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/softmax_loss_grad/softmax_loss_grad_kernel_ref.cpp.obj
[ 36%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/softmax_loss_grad/softmax_loss_grad_kernel_selector.cpp.obj
[ 37%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/tile/tile_kernel_ref.cpp.obj
[ 37%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/tile/tile_kernel_selector.cpp.obj
[ 37%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/upsampling/upsampling_kernel_base.cpp.obj
[ 37%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/upsampling/upsampling_kernel_ref.cpp.obj
[ 37%] Building CXX object kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/core/actual_kernels/upsampling/upsampling_kernel_selector.cpp.obj
[ 37%] Linking CXX static library ../out/Windows32/Release/libcldnn_kernel_selector32.a
Error copying file (if different) from "C:/msys64/home/DL/cldnn/kernel_selector/core/cache/cache.json" to "C:/msys64/home/DL/cldnn/build/out/Windows32/Release/Release/".
make[2]: *** [kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/build.make:3965: out/Windows32/Release/libcldnn_kernel_selector32.a] Error 1
make[2]: *** Deleting file 'out/Windows32/Release/libcldnn_kernel_selector32.a'
make[1]: *** [CMakeFiles/Makefile2:313: kernel_selector/CMakeFiles/cldnn_kernel_selector.dir/all] Error 2
make: *** [Makefile:130: all] Error 2
This check tests for clang and, when it matches, tries to link against libc++ and related libraries without verifying that this is actually the correct runtime to use.
https://github.com/intel/clDNN/blob/master/CMakeLists.txt#L1055
Hello,
My OS is Ubuntu 16.04.
cmake version is 3.7.2.
Intel graphics driver is SRB5.
Intel OpenCL SDK is 1.2-7.0.
When I run cmake -DCMAKE_BUILD_TYPE=Release .., I get the following error:
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
[clDNN] CLDNN__ARCHITECTURE_TARGET: Target architecture is not specified. Trying to deduce it from context.
-- Found PythonInterp: /usr/bin/python2.7 (found suitable version "2.7.12", minimum required is "2.7")
CMake Warning at /usr/local/share/cmake-3.7/Modules/FindBoost.cmake:761 (message):
Imported targets not available for Boost version 106400
Call Stack (most recent call first):
/usr/local/share/cmake-3.7/Modules/FindBoost.cmake:865 (_Boost_COMPONENT_DEPENDENCIES)
/usr/local/share/cmake-3.7/Modules/FindBoost.cmake:1454 (_Boost_MISSING_DEPENDENCIES)
CMakeLists.txt:596 (find_package)
CMake Warning at /usr/local/share/cmake-3.7/Modules/FindBoost.cmake:761 (message):
Imported targets not available for Boost version 106400
Call Stack (most recent call first):
/usr/local/share/cmake-3.7/Modules/FindBoost.cmake:865 (_Boost_COMPONENT_DEPENDENCIES)
/usr/local/share/cmake-3.7/Modules/FindBoost.cmake:1454 (_Boost_MISSING_DEPENDENCIES)
CMakeLists.txt:596 (find_package)
CMake Warning at /usr/local/share/cmake-3.7/Modules/FindBoost.cmake:761 (message):
Imported targets not available for Boost version 106400
Call Stack (most recent call first):
/usr/local/share/cmake-3.7/Modules/FindBoost.cmake:865 (_Boost_COMPONENT_DEPENDENCIES)
/usr/local/share/cmake-3.7/Modules/FindBoost.cmake:1454 (_Boost_MISSING_DEPENDENCIES)
CMakeLists.txt:596 (find_package)
CMake Warning at /usr/local/share/cmake-3.7/Modules/FindBoost.cmake:761 (message):
Imported targets not available for Boost version 106400
Call Stack (most recent call first):
/usr/local/share/cmake-3.7/Modules/FindBoost.cmake:865 (_Boost_COMPONENT_DEPENDENCIES)
/usr/local/share/cmake-3.7/Modules/FindBoost.cmake:1454 (_Boost_MISSING_DEPENDENCIES)
CMakeLists.txt:596 (find_package)
-- Boost version: 1.64.0
-- Found the following Boost libraries:
-- system
-- date_time
-- program_options
-- filesystem
-- [clDNN] ======================== clDNN Project =======================
-- [clDNN] Version: 1.3.8.0
-- [clDNN]
-- [clDNN] Build type: Release (for single-configuration generators)
-- [clDNN] Av. build types: Debug;Release (for multi-configuration generators)
-- [clDNN]
-- [clDNN] Output bin directory:
-- [clDNN] - "/home/user1/cldnn/build/out/Linux64/Release"
-- [clDNN] Output lib directory:
-- [clDNN] - "/home/user1/cldnn/build/out/Linux64/Release"
-- [clDNN] Architecture:
-- [clDNN] - target: Linux64 (detected: Linux64)
-- [clDNN]
-- [clDNN]
-- [clDNN] Advanced:
-- [clDNN] - ICD version used to build: 6.3
-- [clDNN] - boost ver. used to build: 1.64.0
-- [clDNN]
-- [clDNN] - Include/Build cldnn core: ON
-- [clDNN] - Include/Build kernel selector: ON
-- [clDNN] - Include/Build tests: ON
-- [clDNN] - Include/Build tutorial: ON
-- [clDNN]
-- [clDNN] - Run tests: OFF
-- [clDNN]
-- [clDNN] - Use static C++ Runtime: OFF
-- [clDNN] - Allow unsafe size opts: ON
-- [clDNN] - CMake debug trace: OFF
-- [clDNN]
-- [clDNN]
-- [clDNN] ICD:
-- [clDNN] - Root: /home/user1/cldnn/common/intel_ocl_icd/6.3
-- [clDNN] + Headers: /home/user1/cldnn/common/intel_ocl_icd/6.3/linux/include
-- [clDNN] + Static libs: /home/user1/cldnn/common/intel_ocl_icd/6.3/linux/Release/lib/x64
-- [clDNN] + Shared libs: /home/user1/cldnn/common/intel_ocl_icd/6.3/linux/Release/bin/x64
-- [clDNN] + Libs to link: /home/user1/cldnn/common/intel_ocl_icd/6.3/linux/Release/bin/x64
-- [clDNN]
-- [clDNN] boost libraries:
-- [clDNN] - Root: /home/user1/cldnn/common/boost/1.64.0
-- [clDNN] + Headers: /home/user1/cldnn/common/boost/1.64.0/include/boost-1_64
-- [clDNN] + Libs to link: /home/user1/cldnn/common/boost/1.64.0/linux/x64/lib
-- [clDNN] =============================================================================
-- Performing Test CLDNN__COMPILER_SUPPORTS_CXX14
-- Performing Test CLDNN__COMPILER_SUPPORTS_CXX14 - Success
-- Try OpenMP C flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Try OpenMP CXX flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Found OpenMP: -fopenmp
-- [clDNN] Selected capabilities: public
-- Configuring done
CMake Error at src/CMakeLists.txt:191 (add_library):
Target "clDNN_shlib" links to target "Boost::filesystem" but the target was
not found. Perhaps a find_package() call is missing for an IMPORTED
target, or an ALIAS target is missing?
CMake Error at src/CMakeLists.txt:191 (add_library):
Target "clDNN_shlib" links to target "Boost::system" but the target was not
found. Perhaps a find_package() call is missing for an IMPORTED target, or
an ALIAS target is missing?
CMake Error at tests/CMakeLists.txt:123 (add_executable):
Target "tests" links to target "Boost::filesystem" but the target was not
found. Perhaps a find_package() call is missing for an IMPORTED target, or
an ALIAS target is missing?
CMake Error at tests/CMakeLists.txt:123 (add_executable):
Target "tests" links to target "Boost::system" but the target was not
found. Perhaps a find_package() call is missing for an IMPORTED target, or
an ALIAS target is missing?
CMake Error at tutorial/CMakeLists.txt:60 (add_executable):
Target "tutorial" links to target "Boost::filesystem" but the target was
not found. Perhaps a find_package() call is missing for an IMPORTED
target, or an ALIAS target is missing?
CMake Error at tutorial/CMakeLists.txt:60 (add_executable):
Target "tutorial" links to target "Boost::system" but the target was not
found. Perhaps a find_package() call is missing for an IMPORTED target, or
an ALIAS target is missing?
-- Generating done
-- Build files have been written to: /home/user1/cldnn/build
Please help me. I cannot find any solutions. Thank you very much.
Hi,
As described in the example MNIST network, the size of the convolution's weights memory is set to { out_channels, in_channels, kernel_size, kernel_size } when groups is 1. My question is: when groups is not 1, is the size still the same?
Thanks.
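For reference, the common grouped-convolution convention in most frameworks is that each weights tensor shrinks along the input-channel axis to in_channels/groups; in clDNN's API grouping was historically expressed via the split/depthwise path with one weights memory per group, in which case the output-channel axis shrinks as well. The helper below is hypothetical (not part of the clDNN API) and sketches the per-group shape under that assumption:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the clDNN API): computes the dimensions of
// ONE per-group weights tensor under the common grouped-convolution convention
// { out_channels/groups, in_channels/groups, kernel_size, kernel_size }.
std::array<std::size_t, 4> group_weights_dims(std::size_t out_channels,
                                              std::size_t in_channels,
                                              std::size_t kernel_size,
                                              std::size_t groups) {
    // Channel counts must divide evenly among the groups.
    assert(out_channels % groups == 0 && in_channels % groups == 0);
    return { out_channels / groups, in_channels / groups,
             kernel_size, kernel_size };
}
```

With groups == 1 this reduces to the MNIST example's { out_channels, in_channels, kernel_size, kernel_size }; e.g. a 32-to-64-channel 3x3 convolution with groups == 2 would give { 32, 16, 3, 3 } per group.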
Could you please comment on the following in README.md
"
You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel® a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.
"
Is this part of the license? Could you please add a license file to the top level directory which covers everything in the repository?
I try to compile on macOS. I have installed boost by package manager Homebrew, but Cmake can't find boost. Here is error message:
Could not find the following static Boost libraries:
boost_system
boost_date_time
boost_program_options
boost_filesystem
No Boost libraries were found. You may need to set BOOST_LIBRARYDIR to the
directory containing Boost libraries or BOOST_ROOT to the location of
Boost.
Call Stack (most recent call first):
CMakeLists.txt:577 (find_package)
CMake Error at CMakeCompilerLinkerOpts.txt:328 (message):
[clDNN] Unknown compiler. Please define support for it or use different
compiler.
Call Stack (most recent call first):
CMakeLists.txt:709 (include)
https://github.com/intel/clDNN/blob/master/CMakeLists.txt#L104 causes an issue when a system compiles one binary with AVX2 and another with AVX512. Because CMAKE_BINARY_DIR is not used, whichever binary was compiled last overrides the former. When compiling with different instruction sets or optimizations, distros create build_avx2 or build_avx512 directories and point CMAKE_BINARY_DIR at one of them, so that different build types do not conflict.
The solution here is NOT to reinvent the wheel by directing binary output to places other than CMAKE_BINARY_DIR.
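The per-instruction-set layout described above can be sketched as follows (assuming a CMake 3.13+ `-S`/`-B` invocation; the ISA compiler flags are illustrative, not taken from clDNN's scripts):

```shell
# Configure two independent build trees; each gets its own CMAKE_BINARY_DIR,
# so the avx2 and avx512 binaries never overwrite each other.
cmake -S . -B build_avx2   -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-mavx2"
cmake -S . -B build_avx512 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-mavx512f"
cmake --build build_avx2
cmake --build build_avx512
```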
I'm trying to build OpenVINO 2018 R5 with Drop 12.1 (since Intel's distribution contains an earlier version of clDNN, something before Drop 11, which features a horrendous memory leak). Due to the absence of Intel Graphics on my CPU, the graphics driver refuses to install, which results in a clDNNPlugin linking error against OpenCL.lib.
I've traced the issue to clDNN's build script:
Lines 234 to 239 in f91d7d8
Although the script adds OpenCL.lib as a public link library, it does not propagate the corresponding link directory to consumers in the same way (I'm not sure what happens to link_directories from the root script, though it is not respected by OpenVINO's scripts, and I don't think it is good practice to propagate a target's dependencies via include_directories, link_directories and similar global commands).
I'm not sure this is the best place for a fix (moreover, I think it would be better to create an imported target for OpenCL), but it definitely resolves the link issue:
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 6313d50..10c5b88 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -237,6 +237,9 @@ target_link_libraries("${CLDNN_BUILD__PROJ}"
Boost::system
cldnn_kernel_selector
)
+target_link_directories("${CLDNN_BUILD__PROJ}"
+ INTERFACE ${CLDNN__IOCL_ICD_LIBDIRS}
+ )
if(WIN32)
target_link_libraries("${CLDNN_BUILD__PROJ}" setupapi)
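The imported-target alternative mentioned above could look roughly like this. This is a sketch, not taken from clDNN's scripts; `CLDNN__IOCL_ICD_LIBDIRS` is the variable used in the diff, while `CLDNN__IOCL_ICD_INCDIRS` is an assumed name for the matching header path:

```cmake
# Sketch: wrap the OpenCL ICD in an IMPORTED target so include and link
# requirements propagate to consumers automatically as usage requirements,
# instead of via global include_directories/link_directories commands.
add_library(OpenCL::ICD UNKNOWN IMPORTED)
set_target_properties(OpenCL::ICD PROPERTIES
    IMPORTED_LOCATION "${CLDNN__IOCL_ICD_LIBDIRS}/OpenCL.lib"
    INTERFACE_INCLUDE_DIRECTORIES "${CLDNN__IOCL_ICD_INCDIRS}"
)
target_link_libraries("${CLDNN_BUILD__PROJ}" PUBLIC OpenCL::ICD)
```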
I'm new to clDNN. Does clDNN need the OpenCL SDK in order to run?
As per https://software.intel.com/en-us/articles/accelerating-deep-learning-inference-with-intel-processor-graphics, padding can be achieved by having a output_padding set to the layer.
I have a net which is like conv1 -> pool1 -> conv2 -> pool -> fc1 -> fc2 -> softmax
When I put output_padding on the pool1 layer and run the net only up to that point, I can see the output being padded correctly for pool1. However, when I connect pool1 (with output_padding) to conv2, it doesn't seem to pad the data.
I also tried putting an explicit reorder with output_padding between pool1 and conv2; it still doesn't seem to pad the output of pool1.
I think the prediction speed of clDNN is generally very good, and it outperforms MKL on the same processor for many operations I have tested. But the deconvolution operation seems to be very slow.
On Core i3-6100 and i5-6500, deconvolution takes approximately 40-50 times longer with clDNN than with MKL in my tests. That is such a big difference that I don't think it is caused simply by a lack of optimization.
See attached test case for details of how I measured it.
speed.zip
kernel_selector/core/common/primitive_db.cpp is missing #include <stdexcept> and thus does not compile with VS 2019 due to undeclared std::runtime_error.
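A minimal illustration of the fix: std::runtime_error is declared in <stdexcept>, which some standard-library implementations pull in transitively through other headers while stricter ones (such as the VS 2019 library) do not. The function and message below are illustrative, not from primitive_db.cpp:

```cpp
#include <stdexcept>  // declares std::runtime_error; without this explicit
                      // include, strict library implementations reject the code
#include <string>

// Illustrative function: throws on failure, mirroring the pattern
// that fails to compile when <stdexcept> is missing.
std::string describe(bool ok) {
    if (!ok)
        throw std::runtime_error("lookup failed");
    return "ok";
}
```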
How can we compile tutorial/main.cpp? Please show me the command.
After running the command "$ make tests", 8 tests are shown as failed, along with this error:
tests/CMakeFiles/tests.dir/build.make:867: recipe for target 'build/out/Linux64/Debug/tests64' failed
What could be the reason?
Log:
[----------] Global test environment tear-down
[==========] 525 tests from 89 test cases ran. (151136 ms total)
[ PASSED ] 517 tests.
[ FAILED ] 8 tests, listed below:
[ FAILED ] convolution_grad_weights_f32_fw_gpu.basic_wsiz2x2_in2x2x1x2_bfyx_stride2_pad1_fwd_backw
[ FAILED ] convolution_grad_weights_f32_fw_gpu.basic_wsiz1x1_in1x2x5x5_bfyx_stride2_pad1
[ FAILED ] convolution_grad_weights_f32_fw_gpu.basic_wsiz2x2_in32x1x2x2_yxfb_stride1
[ FAILED ] memory_pool.basic_non_padded_relu_pipe
[ FAILED ] memory_pool.basic_non_padded_relu_and_pooling_pipe
[ FAILED ] memory_pool.multi_outputs_network
[ FAILED ] memory_pool.shared_mem_pool_same_topology_twice
[ FAILED ] memory_pool.shared_mem_pool_same_topology_twice_weights
8 FAILED TESTS
YOU HAVE 17245 DISABLED TESTS
tests/CMakeFiles/tests.dir/build.make:867: recipe for target 'build/out/Linux64/Debug/tests64' failed
make[3]: *** [build/out/Linux64/Debug/tests64] Error 1
make[3]: *** Deleting file 'build/out/Linux64/Debug/tests64'
CMakeFiles/Makefile2:202: recipe for target 'tests/CMakeFiles/tests.dir/all' failed
make[2]: *** [tests/CMakeFiles/tests.dir/all] Error 2
CMakeFiles/Makefile2:214: recipe for target 'tests/CMakeFiles/tests.dir/rule' failed
make[1]: *** [tests/CMakeFiles/tests.dir/rule] Error 2
Makefile:190: recipe for target 'tests' failed
make: *** [tests] Error 2
Hi clDNN team! I recently looked into your convolution code and found that, except in the Winograd algorithm, the conv2d primitive doesn't use any __local memory, which should be the fastest GPU cache. I ran on an Intel Gen9 GPU and the convolution is still pretty fast. I'm still studying the story behind the performance, and it would be great if you could share any insights.
Hi! The clDNN documentation says that users can use the primitive set to build and execute the most common image recognition, semantic segmentation, and object detection network topologies. But I could not find the relevant files in the project. Can you tell me where to find them? Thanks!
Getting
primitive add failed: basic_string::_S_construct null not valid
while trying to replace the original OpenVINO libclDNN64.so with a Drop 12.0 build (due to a horrendous memory leak at ~1.5 MB/s).
Linux build fails against commit 02add7c.
My Linux box is Ubuntu 16.04.5.
I build clDNN with mkdir build; cd build; cmake ..; make
The error message is:
[ 51%] Built target cldnn_kernel_selector
make[2]: Circular codegen/test_builds/api_c_test.c <- codegen/test_builds/api_c_test.c dependency dropped.
make[2]: Circular codegen/test_builds/api_cpp_test.cpp <- codegen/test_builds/api_cpp_test.cpp dependency dropped.
make[2]: Circular codegen/test_builds/api_cpp_test.cpp <- codegen/test_builds/api_cpp_test.cpp dependency dropped.
make[2]: Circular codegen/test_builds/api_c_test.c <- codegen/test_builds/api_c_test.c dependency dropped.
[ 51%] Building C object api_test_builds/CMakeFiles/api_test_builds.dir/__/codegen/test_builds/api_c_test.c.o
In file included from /home/nhu/code/clDNN/build/codegen/test_builds/api_c_test.c:16:0:
/home/nhu/code/clDNN/api/C/pooling.h:56:1: error: unknown type name ‘bool’
bool global_pooling;
^
api_test_builds/CMakeFiles/api_test_builds.dir/build.make:213: recipe for target 'api_test_builds/CMakeFiles/api_test_builds.dir/__/codegen/test_builds/api_c_test.c.o' failed
make[2]: *** [api_test_builds/CMakeFiles/api_test_builds.dir/__/codegen/test_builds/api_c_test.c.o] Error 1
CMakeFiles/Makefile2:141: recipe for target 'api_test_builds/CMakeFiles/api_test_builds.dir/all' failed
make[1]: *** [api_test_builds/CMakeFiles/api_test_builds.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2
Should it be "throw std::invalid_argument(...)" ?
cldnn_memory cldnn_attach_memory(cldnn_layout layout, void* pointer, size_t size, cldnn_status* status)
{
return exception_handler<cldnn_memory>(CLDNN_ERROR, status, nullptr, [&]()
{
cldnn::layout layout_obj(layout);
if (layout_obj.bytes_count() > size)
std::invalid_argument("buffer size does not match layout size");
return api_cast(new cldnn::simple_attached_memory(layout_obj, pointer));
});
}
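Yes — as written, the statement merely constructs a temporary std::invalid_argument and immediately destroys it, so the size check has no effect. A small self-contained demonstration of the difference (the function and values are illustrative, not clDNN code):

```cpp
#include <cstddef>
#include <stdexcept>

// Illustrative: returns true if a size-mismatch exception propagates out of
// the check, mimicking the pattern in cldnn_attach_memory.
bool check_throws(std::size_t bytes_count, std::size_t size, bool use_throw) {
    try {
        if (bytes_count > size) {
            if (use_throw)
                throw std::invalid_argument("buffer size does not match layout size");
            else
                std::invalid_argument("buffer size does not match layout size");
                // ^ constructs and discards a temporary; nothing is thrown
        }
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}
```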