There seems to be an incoherence which prevents me from loading the alexnet weights. I

Can't convert reference_caffe. about caffe-tensorflow HOT 7 CLOSED

ethereon commented on May 12, 2024

Can't convert reference_caffe.

from caffe-tensorflow.

Comments (7)

ethereon commented on May 12, 2024

This is due to the grouped convolutions that are used in the original AlexNet (and Caffe's implementation of it). Notice the group: 2 parameter in the model's prototxt definition.

Grouping appears to be a historical artifact. I don't believe I've seen it outside of the original AlexNet implementation. Krizhevsky's own subsequent version in the "One Weird Trick" paper discards the grouping. The current implementation of caffe-tensorflow ignores it as well.

For reference, these are the output blob shapes for AlexNet produced by Caffe:

 #  Name                          W      H      C
-------------------------------------------------
 1  data                        227    227      3
 2  conv1                        55     55     96
 3  norm1                        55     55     96
 4  pool1                        27     27     96
 5  conv2                        27     27    256
 6  norm2                        27     27    256
 7  pool2                        13     13    256
 8  conv3                        13     13    384
 9  conv4                        13     13    384
10  conv5                        13     13    256
11  pool5                         6      6    256
12  fc6                           1      1   4096
13  fc7                           1      1   4096
14  fc8                           1      1   1000
15  prob                          1      1   1000

and the corresponding parameter shapes:

 #  Name                                Out        In       W    H       Total
------------------------------------------------------------------------------
 1  conv1                                96         3      11   11       34848
 2  conv2                               256        48       5    5      307200
 3  conv3                               384       256       3    3      884736
 4  conv4                               384       192       3    3      663552
 5  conv5                               256       192       3    3      442368
 6  fc6                                4096      9216       -    -    37748736
 7  fc7                                4096      4096       -    -    16777216
 8  fc8                                1000      4096       -    -     4096000
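The "In" column above can be reproduced from the blob shapes once grouping is taken into account: each filter only sees the input channels of its own group. The following is a minimal sketch of that arithmetic, not code from caffe-tensorflow. The group values (2 for conv2, conv4 and conv5, 1 otherwise) are an assumption based on Caffe's reference AlexNet definition, and the helper names are hypothetical.

```python
# (name, out_channels, input_blob_channels, kernel_w, kernel_h, group)
# Group values are assumed from Caffe's reference AlexNet definition;
# they are not listed in the tables above.
CONV_LAYERS = [
    ("conv1", 96, 3, 11, 11, 1),
    ("conv2", 256, 96, 5, 5, 2),
    ("conv3", 384, 256, 3, 3, 1),
    ("conv4", 384, 384, 3, 3, 2),
    ("conv5", 256, 384, 3, 3, 2),
]


def conv_param_shape(out_c, in_c, k_w, k_h, group):
    """Return (out, in_per_group, w, h) for a grouped convolution kernel."""
    if in_c % group or out_c % group:
        raise ValueError("channel counts must be divisible by group")
    # Each filter is connected only to the channels of its own group.
    return (out_c, in_c // group, k_w, k_h)


def total(shape):
    """Number of weights in the kernel (biases excluded, as in the table)."""
    out_c, in_c, k_w, k_h = shape
    return out_c * in_c * k_w * k_h


for name, out_c, in_c, k_w, k_h, group in CONV_LAYERS:
    shape = conv_param_shape(out_c, in_c, k_w, k_h, group)
    print(name, shape, total(shape))
```

With these assumed group values, conv2 comes out as (256, 48, 5, 5) with 307200 weights, matching the table, even though the blob feeding it (pool1) has 96 channels.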

leconteur commented on May 12, 2024

I'm sorry, I don't understand how the weights produced by Caffe can be imported if the group parameter can't be used. When I try to import them, I get the following error:
ValueError: Dimensions Dimension(96) and Dimension(48) are not compatible

This is expected, since the weight shapes don't match: the first convolution outputs 96 channels, while the weights of the second convolution expect 48 input channels.

Does this make importing the weights impossible? If so, is there a relatively small pretrained equivalent?

ethereon commented on May 12, 2024

As I mentioned, this is due to the group parameter. The sizes are actually consistent because the convolution is performed by first splitting the input tensor along the depth into 2 groups. For conv2, whose input is the 27 x 27 x 96 pool1 output, this gives you two 27 x 27 x 48 volumes. The convolution filters are split into 2 groups as well, giving you two 5 x 5 x 48 x 128 filter banks. You then convolve each group separately and concatenate the results along the depth to obtain the 27 x 27 x 256 output.

Grouped convolutions are supported by Caffe, but (AFAIK) not by TensorFlow. You'd have to implement it yourself. The exported parameters, however, are correct.
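The split, convolve and merge steps described above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the implementation used by Caffe or caffe-tensorflow: the function names are hypothetical, the filters are assumed to be laid out as (height, width, in, out), and the convolution is a naive stride-1, zero-padded one. The shapes are those of conv2 from the tables above.

```python
import numpy as np


def conv2d_same(x, w):
    """Naive stride-1, zero-padded 2D convolution (cross-correlation).

    x: (H, W, C_in), w: (kH, kW, C_in, C_out) -> (H, W, C_out).
    Assumes odd kernel sizes so that the output keeps the input size.
    """
    k_h, k_w, c_in, c_out = w.shape
    assert x.shape[2] == c_in, "input depth must match filter depth"
    p_h, p_w = k_h // 2, k_w // 2
    xp = np.pad(x, ((p_h, p_h), (p_w, p_w), (0, 0)), mode="constant")
    out = np.zeros((x.shape[0], x.shape[1], c_out), dtype=x.dtype)
    for i in range(k_h):
        for j in range(k_w):
            # Shifted view of the padded input, shape (H, W, C_in).
            patch = xp[i:i + x.shape[0], j:j + x.shape[1], :]
            # (H, W, C_in) @ (C_in, C_out) -> (H, W, C_out)
            out += patch @ w[i, j]
    return out


def grouped_conv2d(x, w, group):
    """Split x along depth and w along output channels, convolve, concatenate."""
    x_groups = np.split(x, group, axis=2)   # e.g. two (27, 27, 48) volumes
    w_groups = np.split(w, group, axis=3)   # e.g. two (5, 5, 48, 128) filter banks
    outs = [conv2d_same(xg, wg) for xg, wg in zip(x_groups, w_groups)]
    return np.concatenate(outs, axis=2)     # merge back along the depth


rng = np.random.default_rng(0)
x = rng.standard_normal((27, 27, 96)).astype(np.float32)     # pool1 output
w = rng.standard_normal((5, 5, 48, 256)).astype(np.float32)  # conv2 filters
y = grouped_conv2d(x, w, group=2)
print(y.shape)  # (27, 27, 256)
```

Note that the filter depth (48) is half the input depth (96). Passing the same arrays to an ungrouped convolution would fail the depth check, which is the same kind of mismatch the ValueError above reports.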

leconteur commented on May 12, 2024

I think I can manage to make a simple pull request to implement this in the Network class. I will keep you posted when I succeed.

ethereon commented on May 12, 2024

I had some free time today, so I went ahead and implemented support for grouped convolutions. Everything should work correctly now.
Tested on AlexNet and CaffeNet.

davheld commented on May 12, 2024

Out of curiosity, how does the performance of AlexNet change when you add / remove grouped convolutions? Which version has higher accuracy on ImageNet classification, for example?

ethereon commented on May 12, 2024

I haven't evaluated AlexNet/CaffeNet without grouped convolutions, so I have no hard data.

I believe the original motivation for grouping in Krizhevsky's paper was to allow for splitting the training across two GPUs. Interestingly though, the split resulted in specialization across the groups (see section 6.1: one group was color-agnostic, the other was color-sensitive).

I recently came across this paper which claims to use grouped convolutions and achieve similar or higher accuracy with fewer parameters: http://arxiv.org/pdf/1605.06489.pdf
