
depthwiseconvolution's Introduction

Depthwise Convolutional Layer

Introduction

This is a personal Caffe implementation of the MobileNet depthwise convolution layer. For details, please read the original paper.

How to build

  1. Merge the caffe folder in this repo into your own Caffe:
    $ cp -r $REPO/caffe/* $YOURCAFFE/

  2. Then build:
    $ cd $YOURCAFFE && make
    

Usage

All you need to do is change the type of each depthwise convolution layer to "DepthwiseConvolution". Please refer to example/Withdw_MN_train_128_1_train.prototxt, which is a modified version of the MobileNet training prototxt.
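For example, a depthwise block in the prototxt then looks like the following sketch (layer names and shapes here are hypothetical; see the example prototxt for the real net):

```protobuf
layer {
  name: "conv2_1/dw"            # hypothetical layer name
  type: "DepthwiseConvolution"  # was "Convolution"
  bottom: "conv1"
  top: "conv2_1/dw"
  convolution_param {
    num_output: 32
    kernel_size: 3
    pad: 1
    stride: 1
    group: 32  # depthwise: group equals the number of input channels
  }
}
```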

GPU performance on the example net

GPU Performance     Origin [1]   Mine
forward_batch1      41 ms        8 ms
backward_batch1     51 ms        11 ms
forward_batch16     532 ms       36 ms
backward_batch16    695 ms       96 ms

Transfer normal net to mobilenet

I wrote a script, transfer2Mobilenet.py, to convert a normal net to the MobileNet format. You may try it too. Usage:

python ./transfer2Mobilenet.py sourceprototxt targetprototxt [--midbn nobn --weight_filler msra --activation ReLU] [--origin_type]

("--origin_type" means the depthwise convolution layers' type will be "Convolution" instead of "DepthwiseConvolution".)

The transferTypeToDepthwiseConvolution.py script changes the depthwise convolution layers' type from "Convolution" to "DepthwiseConvolution".

Footnotes

  1. When cuDNN is turned on, the memory consumption of MobileNet increases to an unbelievable level. You may try it yourself.

depthwiseconvolution's People

Contributors

yonghenglh6

depthwiseconvolution's Issues

How about proto file changes?

Dear Liu,
Thank you for your work.

I cannot find any changes to the caffe.proto file. Can you provide those too?
There are compile issues when using the code: multiple variables are not defined. I am not sure if I missed some definitions in the .h file...

Also, I found comments in the code saying this is only for stride 1. Is that right?

About group in deploy.prototxt

I found the "group" parameter in your deploy.prototxt. Should I use it during training or keep it as-is? What accuracy did you get? I trained the model and only reached 54% top-1 accuracy.

Number of parameters not reduced

I used your code to train my net, and I found the dw layers' parameter count is the same as a normal convolution's. Do you change the weight blob size when creating the DepthwiseConvolution layer?
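For reference, the parameter counts at issue can be checked with a quick calculation (a sketch; conv_params is an illustrative helper, not part of the repo):

```python
# Caffe stores conv weights as a blob of shape
# (num_output, channels / group, kernel_h, kernel_w).
def conv_params(c_in, c_out, k, group=1):
    return c_out * (c_in // group) * k * k

# A standard 3x3 convolution over 32 channels vs. its depthwise
# counterpart (group == input channels):
print(conv_params(32, 32, 3))            # 32 * 32 * 3 * 3 = 9216
print(conv_params(32, 32, 3, group=32))  # 32 * 1 * 3 * 3 = 288
```

If the depthwise layer still reports the larger count, its group parameter is probably not being applied to the weight blob shape.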

group convolution

Hello, I skimmed the code. What you implemented appears to be: the convolution has as many groups as the input feature map has channels. Is my understanding correct?
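That reading can be illustrated with a minimal NumPy sketch (for illustration only; this is not the repo's CUDA kernel), where each input channel is convolved with its own single filter:

```python
import numpy as np

def depthwise_conv2d(x, w, stride=1):
    """x: (C, H, W) input, w: (C, k, k) with one filter per channel."""
    C, H, W = x.shape
    k = w.shape[1]
    out_h = (H - k) // stride + 1
    out_w = (W - k) // stride + 1
    y = np.zeros((C, out_h, out_w))
    for c in range(C):  # group == C: channel c sees only filter c
        for i in range(out_h):
            for j in range(out_w):
                patch = x[c, i*stride:i*stride+k, j*stride:j*stride+k]
                y[c, i, j] = np.sum(patch * w[c])
    return y

y = depthwise_conv2d(np.ones((2, 4, 4)), np.ones((2, 3, 3)))
print(y.shape)  # (2, 2, 2)
```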

Unknown layer type: DepthwiseConvolution

Hello, I added your .hpp/.cpp/.cu files to my Caffe folders accordingly and compiled successfully (I didn't modify any other files). But when I try to train my MobileNet model, it always prompts "Unknown layer type: DepthwiseConvolution". Could you please tell me where the problem is? By the way, my Caffe runs on Windows 7.

About group number error

Thanks for your work!
But I found that if (output channels / groups) != 1 but some other integer, the net doesn't work.
E.g., if input channels = 32, groups = 32, and output channels = 64, the loss will not decrease. In ImageNet-1000 training, the loss stays at 6.9.
Do you know how to solve this problem?
Thanks!
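For reference, the failing case described above corresponds to a layer like this hypothetical snippet, where num_output / group = 2:

```protobuf
layer {
  name: "conv_dw"               # hypothetical name
  type: "DepthwiseConvolution"
  bottom: "data"                # assume 32 input channels
  top: "conv_dw"
  convolution_param {
    num_output: 64  # output channels / group = 2: the case that fails to train
    group: 32
    kernel_size: 3
  }
}
```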

Which GPU did you run the test with?

Hi:
Thanks for the effort in writing customized DepthwiseConvolution layers.
I have successfully merged them with my Caffe.
Using my Titan X (Maxwell), I got 20 ms for a 224x224 image inference with batch = 1.
But, I noticed that you have achieved 8ms.
Would you please tell me what GPU you tested with?
Thank you very much

Alex

Differences from conv

Hello, when using engine: CAFFE, is the DepthwiseConvolution implementation exactly the same as the standard conv layer? Is the contribution that CUDA can be used when group is set? Also, when adding your layer to my Caffe, don't I need to add it to caffe.proto?

Movidius NCSDK does not support this

NCSDK v2.05
u16@u16-System-Product-Name:~/work/realtime-object-detection$ mvNCCompile models/MobileNetSSD_deploy.prototxt -w models/MobileNetSSD_deploy.caffemodel -s 12 -is 300 300 -o graphs/mobilenetgraph
/usr/local/bin/ncsdk/Controllers/Parsers/TensorFlowParser/Convolution.py:45: SyntaxWarning: assertion is always true, perhaps remove parentheses?
assert(False, "Layer type not supported by Convolution: " + obj.type)
mvNCCompile v02.00, Copyright @ Intel Corporation 2017

[Error 4] Toolkit Error: Stage Type Not Supported: DepthwiseConvolution

About some details

Thanks for your work!
But I have some questions; I hope you can help me.

  1. I merged the caffe folder in the repo into my own Caffe and changed the type of the dw layers to DepthwiseConvolution in deploy.prototxt, and the test speed is faster. But when I trained the caffemodel I used Convolution rather than DepthwiseConvolution. Does that mean DepthwiseConvolution is the same as Convolution? Also, I didn't add DepthwiseConvolution to caffe.proto; why does it still work?
  2. When I train MobileNet I use DepthwiseConvolution, but training is still slow. Is that because caffe.proto doesn't have a DepthwiseConvolution layer?

Compiling Caffe on Windows

Hello, for Caffe on Windows, do I just copy the files under this repo's caffe directory into the corresponding folders of my Caffe and then recompile?

Training is slow

Using your script I converted VGG16 into the MobileNet form, and I recompiled Caffe. But training is very slow, about 20 iterations every 4 minutes. The GPU is a TITAN X (Pascal). Environment: Ubuntu 14.04, CUDA 8.0, cuDNN 5.1.
Here are my training settings:
test_iter: 5000
test_interval: 5000
base_lr: 0.01
display: 20
max_iter: 300000
lr_policy: "poly"
power: 1
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "models/vgg16_"
random_seed: 0
net: "mobilenet.prototxt"
test_initialization: false
iter_size: 16
solver_mode: GPU

just a copy of conv

It's just a copy of the Caffe convolution layer with the name changed to "depthwise"... Amusing that it got so many stars~

Does the restriction 'only for stride 1' still hold?

Thanks for your code and sharing.
At the very beginning of the .cu file there is a comment saying:

/*
 * The depthwise layer for mobilenet.   only for stride 1
 */

However, when I quickly looked through the code, the case stride != 1 seems to already be considered and handled.
In case I'm missing something, I'm writing to ask whether I can now use stride > 1 with this layer.
Thanks!

Did someone train models with this DepthwiseConvolution layer?

@yonghenglh6 Thanks for your good work! I trained a Caffe model using your DepthwiseConvolution layer following your suggestion. While training works, I met a problem: when I deploy the trained model, there is an error: Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: DepthwiseConvolution.
So, did you meet this error? Or have you deployed a Caffe model with DepthwiseConvolution?
Looking forward to your reply. Thanks.

the solver.prototxt

Can you share your solver.prototxt ?
I only get 43% val accuracy after 200000 iterations.
Mine is
net: "train_val.prototxt"
test_initialization: false
test_iter: 100
test_interval: 5000
display: 100
average_loss: 40
base_lr: 0.01
lr_policy: "poly"
power: 1.0
max_iter: 1000000
momentum: 0.9
weight_decay: 0.0001
snapshot: 5000
snapshot_prefix: "models/mobilenet"

Which version of Caffe did you use? When I use it with my Caffe, it goes wrong.

When running make, it reports:
src/caffe/layers/depthwise_conv_layer.cpp:31:46: error: ‘class caffe::DepthwiseConvolutionLayer<Dtype>’ has no member named ‘bottom_dim_’
this->forward_cpu_gemm(bottom_data + n * this->bottom_dim_, weight,
^
src/caffe/layers/depthwise_conv_layer.cpp:32:24: error: ‘class caffe::DepthwiseConvolutionLayer<Dtype>’ has no member named ‘top_dim_’
top_data + n * this->top_dim_);
^
src/caffe/layers/depthwise_conv_layer.cpp:35:45: error: ‘class caffe::DepthwiseConvolutionLayer<Dtype>’ has no member named ‘top_dim_’
this->forward_cpu_bias(top_data + n * this->top_dim_, bias);
^
src/caffe/layers/depthwise_conv_layer.cpp: In instantiation of ‘void caffe::DepthwiseConvolutionLayer<Dtype>::Backward_cpu(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<bool>&, const std::vector<caffe::Blob<Dtype>*>&) [with Dtype = float]’:

error == cudaSuccess (77 vs. 0 ) an illegal memory access was encountered

Did someone meet the problem error == cudaSuccess (77 vs. 0)? When I changed the original convolution to the depthwise convolution format and ran it, I got: math_functions.cu:79] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered. However, when I used caffe time to test the same prototxt (the depthwise convolution prototxt), it worked and timed the network. @yonghenglh6 Thanks

Unknown layer type: DeptwiseConvolution

Hi, I've placed those three files under the corresponding folders. Then I went to the Caffe root dir and executed make clean && make -j4 all. But when I ran my network, it said:
I1114 07:16:55.252534 10776 layer_factory.hpp:77] Creating layer conv2_1/dw F1114 07:16:55.252562 10776 layer_factory.hpp:81] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: DeptwiseConvolution (known types: AbsVal, Accuracy, ArgMax, BNLL, BatchNorm, BatchReindex, Bias, Concat, ContrastiveLoss, Convolution, Crop, Data, Deconvolution, Dropout, DummyData, ELU, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, ImageData, InfogainLoss, InnerProduct, Input, LRN, LSTM, LSTMUnit, Log, MVN, MemoryData, MultinomialLogisticLoss, PReLU, Parameter, Pooling, Power, Python, RNN, ReLU, Reduction, Reshape, SPP, Scale, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, Softmax, SoftmaxWithLoss, Split, TanH, Threshold, Tile, WindowData) *** Check failure stack trace: *** Aborted
Are there any possible reasons? Or are there extra files to modify? Thanks a lot.

Only for dilation 1?

In your code comment it says "only for stride 1". But in another issue you said it works for all strides. I checked the code; it is only for dilation 1, not stride 1, right?
