antingshen / resnet-protofiles
Caffe Protofiles for MSRA ResNet: train prototxt
I trained the ResNet-50 net using your prototxt, and I found that with batch_size=16 it used 6707 MB of GPU memory.
Is this normal?
The GPU memory usage is larger than I expected.
Sorry to bother you. I trained ResNet-34 on ImageNet with the shortest side resized to 256, but the accuracy is only 67%; I can't reach the 73% reported in the paper. Can you share your resnet34.caffemodel? Thanks!
In your *.prototxt, the conv layer:
weight_filler {
type: "msra"
}
bias_term: false
I don't understand why bias_term is set to false; this is rarely seen in the literature.
In many cases one uses:
bias_filler {
type: "constant"
value: 0
}
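For what it's worth, bias_term: false is the usual choice when a convolution is immediately followed by BatchNorm: BN subtracts the per-channel mean, so any constant bias would be cancelled and its parameters wasted; the learnable shift lives in the Scale layer instead. A minimal sketch of the pattern (layer names and sizes are illustrative, not taken from this repo's files):

```prototxt
# Conv without bias, since the following BatchNorm subtracts the mean
# and would cancel any constant bias anyway.
layer {
  name: "conv"
  type: "Convolution"
  bottom: "data"
  top: "conv"
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    bias_term: false        # redundant before BN
    weight_filler { type: "msra" }
  }
}
layer {
  name: "conv_bn"
  type: "BatchNorm"
  bottom: "conv"
  top: "conv"
}
layer {
  name: "conv_scale"
  type: "Scale"
  bottom: "conv"
  top: "conv"
  scale_param { bias_term: true }   # the learnable shift lives here
}
```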
Hello @antingshen
I have been training ResNet-50 from scratch using your train_val file, but my training is overfitting to the point that my training accuracy is 100% (both top-1 and top-5), whereas my testing accuracy (on the 50000 validation images) is less than 10% (4% top-1 and 15% top-5):
I1008 23:39:35.303689 5233 solver.cpp:331] Iteration 92000, Testing net (#0)
I1008 23:39:54.131199 5233 solver.cpp:398] Test net output #0: acc/top-1 = 0.047
I1008 23:39:54.131352 5233 solver.cpp:398] Test net output #1: acc/top-5 = 0.158
I1008 23:39:54.131367 5233 solver.cpp:398] Test net output #2: loss = 5.72002 (* 1 = 5.72002 loss)
I1008 23:39:54.279186 5233 solver.cpp:219] Iteration 92000 (1.63045 iter/s, 24.533s/40 iters), loss = 0.0226289
I1008 23:39:54.279218 5233 solver.cpp:238] Train net output #0: acc/top-1 = 1
I1008 23:39:54.279225 5233 solver.cpp:238] Train net output #1: acc/top-5 = 1
I1008 23:39:54.279233 5233 solver.cpp:238] Train net output #2: loss = 0.0249973 (* 1 = 0.0249973 loss)
Could this be due to a lack of data augmentation? I don't see any random cropping or horizontal flipping applied to the training lmdb data.
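If it helps, Caffe's Data layer can do random cropping and horizontal flipping on the fly via transform_param. A sketch of the usual ImageNet setup (the 224 crop, the mean values, and the lmdb path below are common defaults, not values from this repo's files):

```prototxt
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true        # random horizontal flip at TRAIN time
    crop_size: 224      # random 224x224 crop at TRAIN time
    mean_value: 104     # per-channel BGR means (common ImageNet values)
    mean_value: 117
    mean_value: 123
  }
  data_param {
    source: "ilsvrc12_train_lmdb"   # illustrative path
    batch_size: 16
    backend: LMDB
  }
}
```

At TEST time the same transform_param takes a deterministic center crop instead of a random one.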
In the included solver.prototxt, I see: iter_step: 2.
I grepped caffe-master and didn't find iter_step anywhere. Is this meant to be iter_size? Or is something else going on?
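For reference, the field upstream Caffe actually supports is iter_size: it accumulates gradients over that many forward/backward passes before each weight update, so the effective batch size is batch_size × iter_size. If iter_step was intended as iter_size, the solver would look something like this (hyperparameter values below are illustrative):

```prototxt
# solver.prototxt sketch: with batch_size: 16 in the net and iter_size: 2,
# each weight update sees an effective batch of 32.
net: "ResNet_50_train_val.prototxt"
iter_size: 2
base_lr: 0.1
momentum: 0.9
weight_decay: 0.0001
```

This is the standard trick for training large nets on GPUs that can't fit the full batch in memory.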
I trained a network with ResNet_50_train_val.prototxt, but using the resulting caffemodel fails with: "Check failed: target_blobs.size() == source_layer.blobs_size() (2 vs. 1) Incompatible number of blobs for layer conv1".
Thanks very much. Your resnet-protofiles repo helped me out after a week of struggling.
According to Kaiming's caffemodel, the first convolutional layer of ResNet-50 contains a bias term. Hope you can fix this.
When I adapted your ResNet-101 code for semantic segmentation (I only changed the last several layers) and trained from scratch or fine-tuned with Kaiming He's caffemodel, I get this:
[Forward] Layer bn3a_branch2c, top blob res3a_branch2b data: inf
Could you give me some suggestions? Thanks!
Have you ever trained a model using your code? I tried to train a new model, but did not reach the expected accuracy.
Hi,
First of all, thanks for your code, but I have a problem when training with ResNet_50_train_val.prototxt. At iteration 20,000 the training loss differs from the test loss; the test loss was above 2.0. The only thing I changed was the batch_size: I reduced it to 28 because of memory limits. Everything else is the same as your ResNet_50_train_val.prototxt and solver.prototxt. Hoping for your reply.
Excuse me, is there anywhere I can find a pre-trained .caffemodel file for doing some fine-tuning work?
I want to base my experiments on ResNet-101, but I still cannot find a suitable weights file for Caffe.
Thank you so much.
Hello,
Thank you very much for the protofiles. But I am curious why you set a bias term in the first conv of ResNet-50, since it is followed by a BN layer. In a BN layer, all inputs have the mean subtracted and are divided by the variance, so a bias term in the first conv before BN seems unnecessary, exactly as in all the other conv layers.
:)
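If the goal is consistency with the other blocks, conv1 could drop its bias the same way. A sketch of what that would look like (note, per the comment above, that Kaiming's released caffemodel does store a conv1 bias, so this change would then mismatch that model when fine-tuning):

```prototxt
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 64
    kernel_size: 7
    stride: 2
    pad: 3
    bias_term: false   # the BN after conv1 cancels a constant bias anyway
    weight_filler { type: "msra" }
  }
}
```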
When training many different CNNs, I find it easier to organize the directories like this:
ResNet_18/train_val.prototxt
/deploy.prototxt
ResNet_34/train_val.prototxt
/deploy.prototxt
...etc
If I reorganize the code this way, would you accept such a PR?
I'm using your prototxt to fine-tune ResNet-50 with Kaiming He's caffemodel, but I get an error:
"Check failed: target_blobs.size() == source_layer.blobs_size() (1 vs. 2) Incompatible number of blobs for layer conv1"
Do you know how to solve it?
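This usually means the conv1 definition in the prototxt and the conv1 weights in the caffemodel disagree about bias_term (1 blob = weights only, 2 blobs = weights + bias). Two common workarounds, sketched below under that assumption:

```prototxt
# Option 1: make the prototxt match the caffemodel. Kaiming's released
# ResNet-50 stores a bias for conv1 (2 blobs), so declare one:
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 64
    kernel_size: 7
    stride: 2
    pad: 3
    bias_term: true
  }
}
# Option 2: rename the layer (e.g. "conv1_new") so Caffe skips copying
# its weights from the caffemodel and trains that layer from scratch.
```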
In your *.prototxt, all the batch norm layers have:
batch_norm_param {
use_global_stats: false
}
But there is a detailed description http://caffe.berkeleyvision.org/tutorial/layers/batchnorm.html
"By default, it is set to false when the network is in the training phase and true when the network is in the testing phase."
So in your code, use_global_stats is false for both the training and testing phases. Which is better?
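One common way to get the documented per-phase behavior explicitly is to define the BatchNorm layer twice, once per phase, with the same name so they share blobs. A sketch (layer/blob names are illustrative); alternatively, omitting batch_norm_param entirely lets Caffe pick the phase-appropriate default:

```prototxt
layer {
  name: "bn_conv1"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  include { phase: TRAIN }
  batch_norm_param { use_global_stats: false }  # use mini-batch statistics
}
layer {
  name: "bn_conv1"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  include { phase: TEST }
  batch_norm_param { use_global_stats: true }   # use accumulated global stats
}
```

Keeping use_global_stats: false at test time normalizes with test-batch statistics, which makes results depend on the test batch composition; the global-stats form is what the BVLC tutorial describes for inference.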