FeatherCNN is a high-performance lightweight CNN inference library developed by the Tencent AI Platform Department. FeatherCNN originated in our game AI project for King of Glory (Chinese: 王者荣耀), in which we aimed to build a neural model for MOBA game AI and run it on mobile devices. FeatherCNN currently targets ARM CPUs; we will extend it to other architectures in the near future.
Compared with other libraries, FeatherCNN has the following features:
- High Performance: FeatherCNN delivers state-of-the-art inference performance on a wide range of devices, including mobile phones (iOS/Android), embedded devices (Linux), and ARM-based servers (Linux).
- Easy Deployment: FeatherCNN packs everything into a single code base, free of third-party dependencies, which simplifies deployment on mobile platforms.
- Featherweight: The compiled FeatherCNN library is small (hundreds of KBs).
Please open an issue in this repo for bug reports and enhancement suggestions. We are grateful for user feedback and will actively polish this library.
FeatherCNN: Fast Inference Computation with TensorGEMM on ARM Architectures (TPDS September 2019, In press, DOI:10.1109/TPDS.2019.2939785)
The FeatherCNN repository has a heavy development history; please clone only the master branch as follows:
git clone -b master --single-branch https://github.com/tencent/FeatherCNN.git
FeatherCNN accepts Caffe models. It merges the structure file (.prototxt) and the weight file (.caffemodel) into a single binary model (.feathermodel). The conversion tool requires protobuf, but the library itself does not.
The basic user interfaces are listed in feather/net.h. Currently we are using raw pointers to reference data. We may provide more convenient interfaces in the near future.
Before inference, FeatherCNN requires two steps to initialize the network.
feather::Net forward_net(num_threads);
forward_net.InitFromPath(FILE_PATH_TO_FEATHERMODEL);
The net can also be initialized with raw buffers and FILE pointers.
We can then perform forward computation with a raw float* buffer.
forward_net.Forward(PTR_TO_YOUR_INPUT_DATA);
The output can be extracted from the net by the name of blobs. The blob names are kept consistent with caffe prototxt.
forward_net.ExtractBlob(PTR_TO_YOUR_OUTPUT_BUFFER, BLOB_NAME);
You can also query a blob's data size by calling
size_t data_size = 0;
forward_net.GetBlobDataSize(&data_size, BLOB_NAME);
We have tested FeatherCNN on a bunch of devices, see this page for details.
Telegram: https://t.me/FeatherCNN
QQ: 728147343
feathercnn's Issues
cannot execute binary file. This error occurred on my Linux PC; what is the problem?
bash: ./feather_benchmark: cannot execute binary file: Exec format error
sgemm.cpp: externalPackA
If M > mc,
the remainder packing (remPack) will be wrong!
Winograd F(6,3) initialization fails when running MTCNN PNet
It crashes on the line private_mempool.Free(&ST).
Is the Caffe model conversion tool feather_convert_caffe or caffe_model_convert?
Building the conversion tools only produced the file feather_convert_caffe, which can also convert a Caffe model into a feathermodel. Is a model converted this way correct? Also, does building it require protobuf to be installed?
One more question: my Caffe network has 44 layers (Convolution, Eltwise, and ReLU types). When I call Forward(float *input), layers.size is 33, and there are 11 ReLU layers. Is this parsed layers.size correct?
loadparam
// printf("bottom name %s\n", bottom_name);
// layer->bottoms[j] = new Blob<float>(bottom_name);
std::map<std::string, Blob<float> *>::iterator map_iter = blob_map.find(bottom_name);
if (( map_iter == blob_map.end()) && (layer->type.compare("Input") != 0))
{
LOGE("Topology error: bottom blob %s of layer %s type %s not found in map.", bottom_name, layer_name, layer_type);
return -300;
}
When parsing the network configuration file, each bottom blob is looked up in blob_map, and the parser returns if it is not found. Shouldn't a new bottom blob be created instead? If it simply returns, can the network be parsed completely?
Supported layers/operators
Can you provide the full list of layers/operators supported by your engine for each framework (Caffe/TensorFlow)?
InitFromPath never returns
I've converted Yahoo's Open NSFW model to your format. Whenever I try to load the model with InitFromPath, it never returns: it pegs a single CPU core at 100% and just spins. Is such a slow startup expected for a model like this (23 MB, ResNet)?
Net::ExtractBlob() error
When running the iOS program, calling Net::ExtractBlob(float** output_ptr, std::string name) to extract the final output reports an error.
The calling code looks like this:
float *p = NULL;
forward_net.ExtractBlob(&p, "fc7");
Is this a pointer-initialization problem?
The ExtractBlob(float **output_ptr, std::string name) implementation allocates memory for the array inside the function; why does it
assert(output_ptr == NULL);
Performance testing using the experimental branch on Android
Please, has anyone tried running performance tests using the experimental branch on Android?
I tried this but had to manually update the framework to build for Android; just following the "Android ADB guide" did not work. I also had to change the source in the target to test_txt.cpp instead of test.bin.cpp. What is the difference between these files?
Also, the benchmark results from CPU-only runs using the experimental branch are quite slow.
Mate10 - MobileNet
A73 - 220 ms
A53 - 589 ms
This is based on a loop size of 10 and a thread size of 4.
Finally, when I configured the framework for GPU runs (i.e. setting DeviceType::GPU_CL), the times I obtained were ridiculously fast, ~5 ms for MobileNet.
Are there additional steps for running the performance test with the GPU_CL config on Android?
Does it support int8 compute? If not, do you plan to?
As the title says, I am wondering whether FeatherCNN has int8 compute support, since ncnn supports int8 compute.
test error
When I run feather_benchmark like this:
./feather_benchmark ./data/mobilenet.feathermodel ./data/input_3x224x224.txt 20 4
it fails:
bash: ./feather_benchmark: cannot execute binary file: Exec format error
Does it support MobileNet-SSD?
Is this project still maintained?
The build script is broken: the .cl directory is missing.
The code does not compile: the function call at net.cpp line 288 is ambiguous.
In conv_layer.h line 71, this->name should be this->name.c_str().
What are the differences between this framework and ncnn?
Hello, how does this framework differ from and relate to ncnn? What are its advantages? Also, I see ncnn has implemented int8 quantization, though rather simply; what are your plans, and will you come up with a new approach?
Looking forward to your reply, thanks!
benchmarking
How to build the feather_benchmark ? Could you please help?
Does it support CUDA? And TensorFlow models?
It would be great if this project could run on GPU and offer TensorFlow model conversion.
like TX2?
Add support for tf or split+transpose in caffe.
Would you please add support for TensorFlow, or for Split+Transpose in Caffe? Thank you.
Typo, and why such a large input buffer?
Below two lines are in feather/test_txt.cpp
size_t input_size = 224 * 2224 * 3 ;
float *input = new float[input_size * 20];
- Typo: should 2224 be 224?
- Why allocate 20 times the input size? Each run seems to use only float[input_size]; is there a reason for the 20x buffer?
Layer type Deconvolution not registered
Failed to call InitFromPath().
Error:
Finished loading from file
Layer type Deconvolution not registered
Layer type Deconvolution not registered
Layer type Sigmoid not registered
bottom name ...
...
Segmentation fault (core dumped)
Can't we use the Deconvolution or Sigmoid layers?
sgemm conv 4 threads, result wrong
As the title says.
Do you plan to support converting PyTorch models to FeatherCNN models?
Build error
Running
./build_scripts/build_linux.sh
reports:
In file included from /output/FeatherCNN/src/layer_factory.cpp:39:0:
/output/FeatherCNN/src/layers/filter_layer.h: In constructor ‘feather::FilterLayer::FilterLayer(const feather::LayerParameter*, const RuntimeParameter<float>*)’:
/output/FeatherCNN/src/layers/filter_layer.h:29:39: error: ‘const struct feather::LayerParameter’ has no member named ‘filter_param’
     num_output = layer_param->filter_param()->num_output();
src/CMakeFiles/feather.dir/build.make:101: recipe for target 'src/CMakeFiles/feather.dir/layer_factory.cpp.o' failed
make[2]: *** [src/CMakeFiles/feather.dir/layer_factory.cpp.o] Error 1
CMakeFiles/Makefile2:87: recipe for target 'src/CMakeFiles/feather.dir/all' failed
ubuntu 16.04
supported layer mismatch between layer_factory.cpp and feather_convert_caffe.cc
These are what are supported in layer_factory.cpp
void register_layer_creators()
{
REGISTER_LAYER_CREATOR(Input, GetInputLayer);
REGISTER_LAYER_CREATOR(Convolution, GetConvolutionLayer);
REGISTER_LAYER_CREATOR(DepthwiseConvolution, GetDepthwiseConvolutionLayer);
REGISTER_LAYER_CREATOR(BatchNorm, GetBatchNormLayer);
REGISTER_LAYER_CREATOR(LRN, GetLRNLayer);
REGISTER_LAYER_CREATOR(Concat, GetConcatLayer);
REGISTER_LAYER_CREATOR(Dropout, GetDropoutLayer);
REGISTER_LAYER_CREATOR(ReLU, GetReluLayer);
REGISTER_LAYER_CREATOR(PReLU, GetPReluLayer);
REGISTER_LAYER_CREATOR(Scale, GetScaleLayer);
REGISTER_LAYER_CREATOR(Slice, GetSliceLayer);
REGISTER_LAYER_CREATOR(Pooling, GetPoolingLayer);
REGISTER_LAYER_CREATOR(Eltwise, GetEltwiseLayer);
REGISTER_LAYER_CREATOR(InnerProduct, GetInnerProductLayer);
REGISTER_LAYER_CREATOR(Softmax, GetSoftmaxLayer);
REGISTER_LAYER_CREATOR(Filter, GetFilterLayer);
REGISTER_LAYER_CREATOR(Reshape, GetReshapeLayer);
}
These are what are supported in feather_convert_caffe.cc:
layer_type.compare("Input")
if (layer_type.compare("Convolution") == 0 || (layer_type.compare("DepthwiseConvolution") == 0))
else if (layer_type.compare("LRN") == 0)
else if (layer_type.compare("Pooling") == 0)
else if (layer_type.compare("Interp") == 0)
else if (layer_type.compare("InnerProduct") == 0)
else if (layer_type.compare("Softmax") == 0)
else if (layer_type.compare("Scale") == 0)
else if (layer_type.compare("Eltwise") == 0)
else if (layer_type.compare("Flatten") == 0)
else if (layer_type.compare("Filter") == 0)
A few layers are not supported by this parser. Can you confirm that only the layers Input, Convolution, LRN, Pooling, Interp, InnerProduct, Softmax, Scale, Eltwise, Flatten, and Filter are supported?
Does this also support the RPi 3B with Raspbian OS (armv7l)?
I can see there is a toolchain for aarch64 ARM devices, so we can build on an RPi 3B with a 64-bit OS.
What about Raspbian OS, which is armv7l? How can I build on such a device?
Thanks.
MobileNet 4 threads result wrong, but 1/2 threads ok
Running bvlc_googlenet with one thread, segmentation fault occurred
Platform: Raspberry Pi 3 / Linux ubuntu 4.14.37 aarch64 GNU/Linux
Model: bvlc_googlenet in
Input data: ./data/input_3x224x224.txt
Description:
Running the model with 2 or more threads works fine.
./feather_benchmark ./bvlc_googlenet/bvlc_googlenet.feathermodel ./data/input_3x224x224.txt 20 2
++++++Start Loader++++++
Finished loading from file
-- Loading 143 layers
input num 1 input dim num 4
input_name data (n c h w) (10 3 224 224)
stride 2, 2
_bottom data
setup layer conv1/7x7_s2
_bottom conv1/7x7_s2
setup layer conv1/relu_7x7
_bottom conv1/relu_7x7
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool1/3x3_s2
_bottom pool1/3x3_s2
localsize 5 alpha 0.000100 beta 0.750000 k 1.000000
setup layer pool1/norm1
stride 1, 1
_bottom pool1/norm1
setup layer conv2/3x3_reduce
_bottom conv2/3x3_reduce
setup layer conv2/relu_3x3_reduce
stride 1, 1
_bottom conv2/relu_3x3_reduce
setup layer conv2/3x3
_bottom conv2/3x3
setup layer conv2/relu_3x3
_bottom conv2/relu_3x3
localsize 5 alpha 0.000100 beta 0.750000 k 1.000000
setup layer conv2/norm2
_bottom conv2/norm2
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool2/3x3_s2
stride 1, 1
_bottom pool2/3x3_s2
setup layer inception_3a/1x1
_bottom inception_3a/1x1
setup layer inception_3a/relu_1x1
stride 1, 1
_bottom pool2/3x3_s2
setup layer inception_3a/3x3_reduce
_bottom inception_3a/3x3_reduce
setup layer inception_3a/relu_3x3_reduce
stride 1, 1
_bottom inception_3a/relu_3x3_reduce
setup layer inception_3a/3x3
_bottom inception_3a/3x3
setup layer inception_3a/relu_3x3
stride 1, 1
_bottom pool2/3x3_s2
setup layer inception_3a/5x5_reduce
_bottom inception_3a/5x5_reduce
setup layer inception_3a/relu_5x5_reduce
stride 1, 1
_bottom inception_3a/relu_5x5_reduce
setup layer inception_3a/5x5
_bottom inception_3a/5x5
setup layer inception_3a/relu_5x5
_bottom pool2/3x3_s2
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_3a/pool
stride 1, 1
_bottom inception_3a/pool
setup layer inception_3a/pool_proj
_bottom inception_3a/pool_proj
setup layer inception_3a/relu_pool_proj
_bottom inception_3a/relu_1x1
_bottom inception_3a/relu_3x3
_bottom inception_3a/relu_5x5
_bottom inception_3a/relu_pool_proj
setup layer inception_3a/output
stride 1, 1
_bottom inception_3a/output
setup layer inception_3b/1x1
_bottom inception_3b/1x1
setup layer inception_3b/relu_1x1
stride 1, 1
_bottom inception_3a/output
setup layer inception_3b/3x3_reduce
_bottom inception_3b/3x3_reduce
setup layer inception_3b/relu_3x3_reduce
stride 1, 1
_bottom inception_3b/relu_3x3_reduce
setup layer inception_3b/3x3
_bottom inception_3b/3x3
setup layer inception_3b/relu_3x3
stride 1, 1
_bottom inception_3a/output
setup layer inception_3b/5x5_reduce
_bottom inception_3b/5x5_reduce
setup layer inception_3b/relu_5x5_reduce
stride 1, 1
_bottom inception_3b/relu_5x5_reduce
setup layer inception_3b/5x5
_bottom inception_3b/5x5
setup layer inception_3b/relu_5x5
_bottom inception_3a/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_3b/pool
stride 1, 1
_bottom inception_3b/pool
setup layer inception_3b/pool_proj
_bottom inception_3b/pool_proj
setup layer inception_3b/relu_pool_proj
_bottom inception_3b/relu_1x1
_bottom inception_3b/relu_3x3
_bottom inception_3b/relu_5x5
_bottom inception_3b/relu_pool_proj
setup layer inception_3b/output
_bottom inception_3b/output
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool3/3x3_s2
stride 1, 1
_bottom pool3/3x3_s2
setup layer inception_4a/1x1
_bottom inception_4a/1x1
setup layer inception_4a/relu_1x1
stride 1, 1
_bottom pool3/3x3_s2
setup layer inception_4a/3x3_reduce
_bottom inception_4a/3x3_reduce
setup layer inception_4a/relu_3x3_reduce
stride 1, 1
_bottom inception_4a/relu_3x3_reduce
setup layer inception_4a/3x3
_bottom inception_4a/3x3
setup layer inception_4a/relu_3x3
stride 1, 1
_bottom pool3/3x3_s2
setup layer inception_4a/5x5_reduce
_bottom inception_4a/5x5_reduce
setup layer inception_4a/relu_5x5_reduce
stride 1, 1
_bottom inception_4a/relu_5x5_reduce
setup layer inception_4a/5x5
_bottom inception_4a/5x5
setup layer inception_4a/relu_5x5
_bottom pool3/3x3_s2
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4a/pool
stride 1, 1
_bottom inception_4a/pool
setup layer inception_4a/pool_proj
_bottom inception_4a/pool_proj
setup layer inception_4a/relu_pool_proj
_bottom inception_4a/relu_1x1
_bottom inception_4a/relu_3x3
_bottom inception_4a/relu_5x5
_bottom inception_4a/relu_pool_proj
setup layer inception_4a/output
stride 1, 1
_bottom inception_4a/output
setup layer inception_4b/1x1
_bottom inception_4b/1x1
setup layer inception_4b/relu_1x1
stride 1, 1
_bottom inception_4a/output
setup layer inception_4b/3x3_reduce
_bottom inception_4b/3x3_reduce
setup layer inception_4b/relu_3x3_reduce
stride 1, 1
_bottom inception_4b/relu_3x3_reduce
setup layer inception_4b/3x3
_bottom inception_4b/3x3
setup layer inception_4b/relu_3x3
stride 1, 1
_bottom inception_4a/output
setup layer inception_4b/5x5_reduce
_bottom inception_4b/5x5_reduce
setup layer inception_4b/relu_5x5_reduce
stride 1, 1
_bottom inception_4b/relu_5x5_reduce
setup layer inception_4b/5x5
_bottom inception_4b/5x5
setup layer inception_4b/relu_5x5
_bottom inception_4a/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4b/pool
stride 1, 1
_bottom inception_4b/pool
setup layer inception_4b/pool_proj
_bottom inception_4b/pool_proj
setup layer inception_4b/relu_pool_proj
_bottom inception_4b/relu_1x1
_bottom inception_4b/relu_3x3
_bottom inception_4b/relu_5x5
_bottom inception_4b/relu_pool_proj
setup layer inception_4b/output
stride 1, 1
_bottom inception_4b/output
setup layer inception_4c/1x1
_bottom inception_4c/1x1
setup layer inception_4c/relu_1x1
stride 1, 1
_bottom inception_4b/output
setup layer inception_4c/3x3_reduce
_bottom inception_4c/3x3_reduce
setup layer inception_4c/relu_3x3_reduce
stride 1, 1
_bottom inception_4c/relu_3x3_reduce
setup layer inception_4c/3x3
_bottom inception_4c/3x3
setup layer inception_4c/relu_3x3
stride 1, 1
_bottom inception_4b/output
setup layer inception_4c/5x5_reduce
_bottom inception_4c/5x5_reduce
setup layer inception_4c/relu_5x5_reduce
stride 1, 1
_bottom inception_4c/relu_5x5_reduce
setup layer inception_4c/5x5
_bottom inception_4c/5x5
setup layer inception_4c/relu_5x5
_bottom inception_4b/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4c/pool
stride 1, 1
_bottom inception_4c/pool
setup layer inception_4c/pool_proj
_bottom inception_4c/pool_proj
setup layer inception_4c/relu_pool_proj
_bottom inception_4c/relu_1x1
_bottom inception_4c/relu_3x3
_bottom inception_4c/relu_5x5
_bottom inception_4c/relu_pool_proj
setup layer inception_4c/output
stride 1, 1
_bottom inception_4c/output
setup layer inception_4d/1x1
_bottom inception_4d/1x1
setup layer inception_4d/relu_1x1
stride 1, 1
_bottom inception_4c/output
setup layer inception_4d/3x3_reduce
_bottom inception_4d/3x3_reduce
setup layer inception_4d/relu_3x3_reduce
stride 1, 1
_bottom inception_4d/relu_3x3_reduce
setup layer inception_4d/3x3
_bottom inception_4d/3x3
setup layer inception_4d/relu_3x3
stride 1, 1
_bottom inception_4c/output
setup layer inception_4d/5x5_reduce
_bottom inception_4d/5x5_reduce
setup layer inception_4d/relu_5x5_reduce
stride 1, 1
_bottom inception_4d/relu_5x5_reduce
setup layer inception_4d/5x5
_bottom inception_4d/5x5
setup layer inception_4d/relu_5x5
_bottom inception_4c/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4d/pool
stride 1, 1
_bottom inception_4d/pool
setup layer inception_4d/pool_proj
_bottom inception_4d/pool_proj
setup layer inception_4d/relu_pool_proj
_bottom inception_4d/relu_1x1
_bottom inception_4d/relu_3x3
_bottom inception_4d/relu_5x5
_bottom inception_4d/relu_pool_proj
setup layer inception_4d/output
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/1x1
_bottom inception_4e/1x1
setup layer inception_4e/relu_1x1
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/3x3_reduce
_bottom inception_4e/3x3_reduce
setup layer inception_4e/relu_3x3_reduce
stride 1, 1
_bottom inception_4e/relu_3x3_reduce
setup layer inception_4e/3x3
_bottom inception_4e/3x3
setup layer inception_4e/relu_3x3
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/5x5_reduce
_bottom inception_4e/5x5_reduce
setup layer inception_4e/relu_5x5_reduce
stride 1, 1
_bottom inception_4e/relu_5x5_reduce
setup layer inception_4e/5x5
_bottom inception_4e/5x5
setup layer inception_4e/relu_5x5
_bottom inception_4d/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4e/pool
stride 1, 1
_bottom inception_4e/pool
setup layer inception_4e/pool_proj
_bottom inception_4e/pool_proj
setup layer inception_4e/relu_pool_proj
_bottom inception_4e/relu_1x1
_bottom inception_4e/relu_3x3
_bottom inception_4e/relu_5x5
_bottom inception_4e/relu_pool_proj
setup layer inception_4e/output
_bottom inception_4e/output
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool4/3x3_s2
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/1x1
_bottom inception_5a/1x1
setup layer inception_5a/relu_1x1
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/3x3_reduce
_bottom inception_5a/3x3_reduce
setup layer inception_5a/relu_3x3_reduce
stride 1, 1
_bottom inception_5a/relu_3x3_reduce
setup layer inception_5a/3x3
_bottom inception_5a/3x3
setup layer inception_5a/relu_3x3
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/5x5_reduce
_bottom inception_5a/5x5_reduce
setup layer inception_5a/relu_5x5_reduce
stride 1, 1
_bottom inception_5a/relu_5x5_reduce
setup layer inception_5a/5x5
_bottom inception_5a/5x5
setup layer inception_5a/relu_5x5
_bottom pool4/3x3_s2
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_5a/pool
stride 1, 1
_bottom inception_5a/pool
setup layer inception_5a/pool_proj
_bottom inception_5a/pool_proj
setup layer inception_5a/relu_pool_proj
_bottom inception_5a/relu_1x1
_bottom inception_5a/relu_3x3
_bottom inception_5a/relu_5x5
_bottom inception_5a/relu_pool_proj
setup layer inception_5a/output
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/1x1
_bottom inception_5b/1x1
setup layer inception_5b/relu_1x1
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/3x3_reduce
_bottom inception_5b/3x3_reduce
setup layer inception_5b/relu_3x3_reduce
stride 1, 1
_bottom inception_5b/relu_3x3_reduce
setup layer inception_5b/3x3
_bottom inception_5b/3x3
setup layer inception_5b/relu_3x3
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/5x5_reduce
_bottom inception_5b/5x5_reduce
setup layer inception_5b/relu_5x5_reduce
stride 1, 1
_bottom inception_5b/relu_5x5_reduce
setup layer inception_5b/5x5
_bottom inception_5b/5x5
setup layer inception_5b/relu_5x5
_bottom inception_5a/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_5b/pool
stride 1, 1
_bottom inception_5b/pool
setup layer inception_5b/pool_proj
_bottom inception_5b/pool_proj
setup layer inception_5b/relu_pool_proj
_bottom inception_5b/relu_1x1
_bottom inception_5b/relu_3x3
_bottom inception_5b/relu_5x5
_bottom inception_5b/relu_pool_proj
setup layer inception_5b/output
_bottom inception_5b/output
kernel (7 7) pad (0 0) stride (1 1) global_pooling 0
setup layer pool5/7x7_s1
_bottom pool5/7x7_s1
setup layer pool5/drop_7x7_s1
_bottom pool5/drop_7x7_s1
----BlobInfo----
Shape in nchw (1000 1024 1 1)
----------------
setup layer loss3/classifier
_bottom loss3/classifier
setup layer prob
Output shape 256 28 28
Output shape 480 28 28
Output shape 512 14 14
Output shape 512 14 14
Output shape 512 14 14
Output shape 528 14 14
Output shape 832 14 14
Output shape 832 7 7
Output shape 1024 7 7
input 1024 1 1
----BlobInfo----
Shape in nchw (1 1000 1 1)
----------------
old bottom conv2/relu_3x3 to new bottom conv2/3x3
*old bottom conv2/relu_3x3 to new bottom conv2/3x3
+old bottom conv2/relu_3x3 to new bottom conv2/3x3
Erasing layer 8 conv2/relu_3x3
Layer 8 after erasing: conv2/norm2 type LRN
old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
*old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
+old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
Erasing layer 15 inception_3a/relu_3x3
Layer 15 after erasing: inception_3a/5x5_reduce type Convolution
old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
*old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
+old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
Erasing layer 28 inception_3b/relu_3x3
Layer 28 after erasing: inception_3b/5x5_reduce type Convolution
old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
*old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
+old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
Erasing layer 42 inception_4a/relu_3x3
Layer 42 after erasing: inception_4a/5x5_reduce type Convolution
old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
*old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
+old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
Erasing layer 55 inception_4b/relu_3x3
Layer 55 after erasing: inception_4b/5x5_reduce type Convolution
old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
*old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
+old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
Erasing layer 68 inception_4c/relu_3x3
Layer 68 after erasing: inception_4c/5x5_reduce type Convolution
old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
*old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
+old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
Erasing layer 81 inception_4d/relu_3x3
Layer 81 after erasing: inception_4d/5x5_reduce type Convolution
old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
*old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
+old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
Erasing layer 94 inception_4e/relu_3x3
Layer 94 after erasing: inception_4e/5x5_reduce type Convolution
old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
*old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
+old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
Erasing layer 108 inception_5a/relu_3x3
Layer 108 after erasing: inception_5a/5x5_reduce type Convolution
old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
*old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
+old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
Erasing layer 121 inception_5b/relu_3x3
Layer 121 after erasing: inception_5b/5x5_reduce type Convolution
input size 150528 parts size 150528
Forward
----------Prediction costs 1217.012244ms
Forward
----------Prediction costs 1147.220138ms
Forward
----------Prediction costs 1117.961493ms
Forward
----------Prediction costs 682.398315ms
Forward
----------Prediction costs 682.362640ms
Forward
----------Prediction costs 682.154001ms
Forward
----------Prediction costs 683.279697ms
Forward
----------Prediction costs 683.667342ms
Forward
----------Prediction costs 682.445607ms
Forward
----------Prediction costs 682.233636ms
Forward
----------Prediction costs 682.563834ms
Forward
----------Prediction costs 682.186085ms
Forward
----------Prediction costs 682.851743ms
Forward
----------Prediction costs 682.801224ms
Forward
----------Prediction costs 682.941741ms
Forward
----------Prediction costs 684.093530ms
Forward
----------Prediction costs 682.520608ms
Forward
----------Prediction costs 682.639928ms
Forward
----------Prediction costs 682.725135ms
Forward
----------Prediction costs 682.587327ms
--------Average runtime 730.086001msi------
Warning: common memroy not freed before pool desctruction. Proceed with free.
Default common pool stat: size 8463360, ptr 2092d760
double free or corruption (!prev)
But with one thread:
[setup log identical to the two-thread run above; truncated]
setup layer inception_4d/relu_5x5_reduce
stride 1, 1
_bottom inception_4d/relu_5x5_reduce
setup layer inception_4d/5x5
_bottom inception_4d/5x5
setup layer inception_4d/relu_5x5
_bottom inception_4c/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4d/pool
stride 1, 1
_bottom inception_4d/pool
setup layer inception_4d/pool_proj
_bottom inception_4d/pool_proj
setup layer inception_4d/relu_pool_proj
_bottom inception_4d/relu_1x1
_bottom inception_4d/relu_3x3
_bottom inception_4d/relu_5x5
_bottom inception_4d/relu_pool_proj
setup layer inception_4d/output
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/1x1
_bottom inception_4e/1x1
setup layer inception_4e/relu_1x1
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/3x3_reduce
_bottom inception_4e/3x3_reduce
setup layer inception_4e/relu_3x3_reduce
stride 1, 1
_bottom inception_4e/relu_3x3_reduce
setup layer inception_4e/3x3
_bottom inception_4e/3x3
setup layer inception_4e/relu_3x3
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/5x5_reduce
_bottom inception_4e/5x5_reduce
setup layer inception_4e/relu_5x5_reduce
stride 1, 1
_bottom inception_4e/relu_5x5_reduce
setup layer inception_4e/5x5
_bottom inception_4e/5x5
setup layer inception_4e/relu_5x5
_bottom inception_4d/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4e/pool
stride 1, 1
_bottom inception_4e/pool
setup layer inception_4e/pool_proj
_bottom inception_4e/pool_proj
setup layer inception_4e/relu_pool_proj
_bottom inception_4e/relu_1x1
_bottom inception_4e/relu_3x3
_bottom inception_4e/relu_5x5
_bottom inception_4e/relu_pool_proj
setup layer inception_4e/output
_bottom inception_4e/output
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool4/3x3_s2
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/1x1
_bottom inception_5a/1x1
setup layer inception_5a/relu_1x1
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/3x3_reduce
_bottom inception_5a/3x3_reduce
setup layer inception_5a/relu_3x3_reduce
stride 1, 1
_bottom inception_5a/relu_3x3_reduce
setup layer inception_5a/3x3
_bottom inception_5a/3x3
setup layer inception_5a/relu_3x3
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/5x5_reduce
_bottom inception_5a/5x5_reduce
setup layer inception_5a/relu_5x5_reduce
stride 1, 1
_bottom inception_5a/relu_5x5_reduce
setup layer inception_5a/5x5
_bottom inception_5a/5x5
setup layer inception_5a/relu_5x5
_bottom pool4/3x3_s2
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_5a/pool
stride 1, 1
_bottom inception_5a/pool
setup layer inception_5a/pool_proj
_bottom inception_5a/pool_proj
setup layer inception_5a/relu_pool_proj
_bottom inception_5a/relu_1x1
_bottom inception_5a/relu_3x3
_bottom inception_5a/relu_5x5
_bottom inception_5a/relu_pool_proj
setup layer inception_5a/output
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/1x1
_bottom inception_5b/1x1
setup layer inception_5b/relu_1x1
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/3x3_reduce
_bottom inception_5b/3x3_reduce
setup layer inception_5b/relu_3x3_reduce
stride 1, 1
_bottom inception_5b/relu_3x3_reduce
setup layer inception_5b/3x3
_bottom inception_5b/3x3
setup layer inception_5b/relu_3x3
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/5x5_reduce
_bottom inception_5b/5x5_reduce
setup layer inception_5b/relu_5x5_reduce
stride 1, 1
_bottom inception_5b/relu_5x5_reduce
setup layer inception_5b/5x5
_bottom inception_5b/5x5
setup layer inception_5b/relu_5x5
_bottom inception_5a/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_5b/pool
stride 1, 1
_bottom inception_5b/pool
setup layer inception_5b/pool_proj
_bottom inception_5b/pool_proj
setup layer inception_5b/relu_pool_proj
_bottom inception_5b/relu_1x1
_bottom inception_5b/relu_3x3
_bottom inception_5b/relu_5x5
_bottom inception_5b/relu_pool_proj
setup layer inception_5b/output
_bottom inception_5b/output
kernel (7 7) pad (0 0) stride (1 1) global_pooling 0
setup layer pool5/7x7_s1
_bottom pool5/7x7_s1
setup layer pool5/drop_7x7_s1
_bottom pool5/drop_7x7_s1
----BlobInfo----
Shape in nchw (1000 1024 1 1)
----------------
setup layer loss3/classifier
_bottom loss3/classifier
setup layer prob
Output shape 256 28 28
Output shape 480 28 28
Output shape 512 14 14
Output shape 512 14 14
Output shape 512 14 14
Output shape 528 14 14
Output shape 832 14 14
Output shape 832 7 7
Output shape 1024 7 7
input 1024 1 1
----BlobInfo----
Shape in nchw (1 1000 1 1)
----------------
old bottom conv2/relu_3x3 to new bottom conv2/3x3
*old bottom conv2/relu_3x3 to new bottom conv2/3x3
+old bottom conv2/relu_3x3 to new bottom conv2/3x3
Erasing layer 8 conv2/relu_3x3
Layer 8 after erasing: conv2/norm2 type LRN
old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
*old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
+old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
Erasing layer 15 inception_3a/relu_3x3
Layer 15 after erasing: inception_3a/5x5_reduce type Convolution
old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
*old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
+old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
Erasing layer 28 inception_3b/relu_3x3
Layer 28 after erasing: inception_3b/5x5_reduce type Convolution
old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
*old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
+old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
Erasing layer 42 inception_4a/relu_3x3
Layer 42 after erasing: inception_4a/5x5_reduce type Convolution
old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
*old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
+old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
Erasing layer 55 inception_4b/relu_3x3
Layer 55 after erasing: inception_4b/5x5_reduce type Convolution
old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
*old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
+old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
Erasing layer 68 inception_4c/relu_3x3
Layer 68 after erasing: inception_4c/5x5_reduce type Convolution
old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
*old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
+old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
Erasing layer 81 inception_4d/relu_3x3
Layer 81 after erasing: inception_4d/5x5_reduce type Convolution
old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
*old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
+old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
Erasing layer 94 inception_4e/relu_3x3
Layer 94 after erasing: inception_4e/5x5_reduce type Convolution
old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
*old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
+old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
Erasing layer 108 inception_5a/relu_3x3
Layer 108 after erasing: inception_5a/5x5_reduce type Convolution
old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
*old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
+old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
Erasing layer 121 inception_5b/relu_3x3
Layer 121 after erasing: inception_5b/5x5_reduce type Convolution
input size 150528 parts size 150528
Forward
----------Prediction costs 1138.138452ms
Forward
Segmentation fault
Unused local variable
The local variable 'proto' is never used in the function; consider removing it.
Line 42 in 1d469df
ConvolutionDepthwise is not handled during model conversion
ConvolutionDepthwise is not handled during model conversion; group needs to be changed to num_output.
Model convert error - libprotobuf
I am trying to convert some caffe models (prototxt / caffe model upgraded) into feathermodel using tools/feather_convert_caffe, and hit this error.
[libprotobuf FATAL /usr/local/include/google/protobuf/repeated_field.h:1514] CHECK failed: (index) < (current_size_):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: (index) < (current_size_):
Aborted (core dumped)
Note: I get this error when trying to convert any Caffe model. Also, I am using the experimental branch (commit 5023303).
Please, do you have any ideas for a solution?
Also, is there an input file for 3x227x227?
This is required for some models such as AlexNet. Currently there is only one input file, input_3x224x224.txt. How is this file created?
Evaluation
Hello, why can't I find ./build_scripts/build_linux_test.sh?
Cannot convert bvlc_googlenet.caffemodel
Hello, thank you for your contributions to FeatherCNN. I have a question: when I run ./feather_convert_caffe bvlc_googlenet.prototxt bvlc_googlenet.caffemodel to convert bvlc_googlenet.caffemodel, something goes wrong:
Input Num 0
Input Layer
Input dim 10
Input dim 3
Input dim 224
Input dim 224
Layer num 0
Legacy layer num 169
[libprotobuf FATAL /usr/local/include/google/protobuf/repeated_field.h:1522] CHECK failed: (index) < (current_size_):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: (index) < (current_size_):
Aborted (core dumped)
Is something wrong with my bvlc_googlenet.prototxt? I downloaded the prototxt from https://github.com/BVLC/caffe/blob/master/models/bvlc_googlenet/deploy.prototxt and the caffemodel from
http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel
Ubuntu 16.04 64-bit host
Looking forward to your reply!
Thanks in advance!
Model convert error
layer {
name: "input"
type: "Input"
top: "data"
input_param {
shape {
dim: 1
dim: 3
dim: 22
dim: 92
}
}
}
If the input layer prototxt is as above, it reports the bug "Blob not setup yet, may be caused by wrong layer order. Aborted".
build error
./build_scripts/build_android.sh fails to build on macOS (NDK r17b) with the following errors:
make: *** No targets specified and no makefile found. Stop.
make: *** No rule to make target `install'. Stop.
What is the problem, and how can it be fixed?
build_linux.sh
Compiling and Install
cd FeatherCNN
./build_scripts/build_linux.sh
The build_scripts directory no longer contains build_linux.sh.
Running benchmark with SqueezeNet model, segmentation fault occurred
Platform: Hikey960 / Linux Debian 4.4.74 aarch64 GNU/Linux
Model: ./data/squeezenet.feathermodel from (http://hpcc.siat.ac.cn/jintao/feathercnn/)
Input data: ./data/input_3x224x224.txt
Description:
On running the benchmark test using squeezenet.feathermodel, I get a run-time error (segmentation fault).
The issue occurs with any release (master branch) ahead of e8f2d95, i.e. starting from the next release, "add implementation for reshape layer" (d12e42b).
Please see the logs below:
root@debian:~/feather# ./feather_benchmark ./data/squeezenet.feathermodel ./data/input_3x224x224.txt 20 4
++++++Start Loader++++++
Finished loading from file
bottom name fire2/relu_squeeze1x1 ptr 0x557515ffc0
Output shape 128 55 55
bottom name fire3/relu_squeeze1x1 ptr 0x5575160a70
Output shape 128 55 55
bottom name fire4/relu_squeeze1x1 ptr 0x5575161520
Output shape 256 55 55
bottom name fire5/relu_squeeze1x1 ptr 0x5575178da0
Output shape 256 27 27
bottom name fire6/relu_squeeze1x1 ptr 0x55751904e0
Output shape 384 27 27
bottom name fire7/relu_squeeze1x1 ptr 0x5575190f90
Output shape 384 27 27
bottom name fire8/relu_squeeze1x1 ptr 0x5575191a40
Output shape 512 27 27
bottom name fire9/relu_squeeze1x1 ptr 0x557519cf40
Output shape 512 13 13
input size 150528 parts size 150528
Segmentation fault
Model Convert Error
Hi, thank you for your hard work on FeatherCNN. I need help running the benchmark on different networks.
I've seen the benchmark results for different networks (MobileNet, SqueezeNet, GoogLeNet and VGG16):
https://github.com/Tencent/FeatherCNN/wiki/Benchmarks
Can you share feathermodel file you used, please?
I've tried to convert some Caffe models (prototxt / caffemodel upgraded) into feathermodel using tools/feather_convert_caffe, and hit an error.
[libprotobuf FATAL /HDD/usr/local/include/google/protobuf/repeated_field.h:1431] CHECK failed: (index) < (current_size_):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: (index) < (current_size_):
Aborted
Note: MobileNet and SqueezeNet hit the same error.
I've found a way to work around it:
in feather_convert_caffe.cc:217, change
layer_num = net_param.layer_size(); to layer_num = net_param_prototxt.layer_size();
-> which makes the conversion work, and the MobileNet benchmark runs without issues.
However, Squeezenet hits another issue on Benchmark Run.
++++++Start Loader++++++
Finished loading from file
feather_benchmark: /root/feather/src/flatbuffers/flatbuffers.h:242: flatbuffers::Vector::return_type flatbuffers::Vector::Get(flatbuffers::uoffset_t) const [with T = flatbuffers::Offsetfeather::BlobProto; flatbuffers::Vector::return_type = const feather::BlobProto*; flatbuffers::uoffset_t = unsigned int]: Assertion `i < size()' failed.
Aborted
I guess this comes from a layer_num mismatch, since I changed layer_num to the prototxt's. But if I hadn't, the conversion to feathermodel would have failed in the first place.
Can you shed some light on this, please?
Another question:
How do I create an input data file? The repo has input_3x224x224, but shouldn't AlexNet use input_3x227x227? I'm not sure where the input file comes from, so I cannot create one for 227x227. Am I missing something?
Thanks!
MobileNet: running Forward twice gives a wrong result
As the title says.
OpenCL or Vulkan port?
Hi, is there an OpenCL or Vulkan port of the library yet?
Thanks in advance.
Comparison with ncnn?
Hi, I am new to ncnn and FeatherCNN, so could you please give me some introduction to the differences between these two frameworks? Thank you!
Dead loop during net init fuse stage
Info:
- Conv A (bottom blob x, top blob a)
- BatchNorm B (bottom blob a, top blob a)
- Scale C (bottom blob a, top blob a)
The case above will run into a dead loop.
sgemm.cpp: block_sgemm_external_pack_threading_8x8 and block_sgemm_external_pack_threading
unsigned int tN = N / num_threads / factor;
tN = (tN + 7) & 0xFFFFFFF8;
For example, if N = 26 and num_threads = 3, then tN = 8.
The thread tasks cover 8 + 8 + 8 = 24 columns,
but that leaves 26 - 24 = 2 columns unprocessed.
Is deconvolution unsupported? Why does ./feather_convert_caffe convert the Caffe model without reporting an error?
./feather_convert_caffe converted the Caffe model without any error and generated a feathermodel file.
But it crashes when the model is loaded.
Does it support SSD?
Does this framework support the SSD detection network?
Documentation or examples for ARM usage.
Hi,
first of all, kudos to fantastic developers like you guys... I am very excited to use this library. I plan to use it on an embedded system, but I can't get past the basic compilation with ./build_scripts/build_linux.sh.
It would be very nice if you could point me to a use case or other resource for using the library on an embedded OS. :)