Comments (2)
Hi! Thank you for your interest in our work. In our experiments we found that a large learning rate is very important for training binary detectors. Our guess is that a large lr helps the binary neural network avoid local minima. In addition, if you use a large lr, you need a large batch size to stabilize training. This might be why your loss goes to NaN with batch_size=16 and lr=1e-3.
I also tried lr=1e-4 with batch_size=32 before. Training was indeed more stable, but the performance was bad. If I remember correctly, we achieved ~60% mAP on VOC using this lr schedule (lr starts at 1e-4 and decays to 1e-5 and then 1e-6 when the loss stops decreasing).
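For reference, the schedule described above (start at 1e-4, drop to 1e-5 and then 1e-6 when the loss stops decreasing) could be sketched roughly like this. It is not taken from the BiDet code: the class name `PlateauDecay`, the `patience` value, and the rule for deciding that the loss has stalled are all assumptions made for illustration. In PyTorch, `torch.optim.lr_scheduler.ReduceLROnPlateau` provides similar behavior.

```python
class PlateauDecay:
    """Hypothetical helper: multiply lr by `factor` when the loss stalls."""

    def __init__(self, lr=1e-4, factor=0.1, patience=2, min_lr=1e-6):
        self.lr = lr                # current learning rate
        self.factor = factor        # multiplicative decay (0.1 = divide by 10)
        self.patience = patience    # non-improving checks tolerated (assumed value)
        self.min_lr = min_lr        # never decay below this value
        self.best = float("inf")    # best loss seen so far
        self.bad_checks = 0         # consecutive checks without improvement

    def step(self, loss):
        """Record one loss value and return the lr to use next."""
        if loss < self.best:
            self.best = loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
            if self.bad_checks > self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_checks = 0
        return self.lr
```

You would call `step(loss)` once per evaluation (for example once per epoch) and copy the returned value into the optimizer's learning rate.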
So I recommend using multiple GPUs to train with batch_size=32, because a large initial lr is very important if you want good performance. If you can't find a way to train with a large batch size, you could try tricks like lr warmup to stabilize the early stage of training. But I'm not sure whether the mAP will be as good as reported in our paper.
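A minimal sketch of the lr warmup idea mentioned above, assuming a linear ramp. The function name `warmup_lr`, the 500-step warmup length, and the starting factor of 0.1 are illustrative assumptions, not values from the paper or the BiDet code.

```python
def warmup_lr(step, base_lr=1e-3, warmup_steps=500, start_factor=0.1):
    """Linearly ramp lr from start_factor * base_lr up to base_lr.

    All default values are assumptions chosen for illustration.
    """
    if step >= warmup_steps:
        return base_lr
    frac = step / warmup_steps  # progress through warmup, in [0, 1)
    return base_lr * (start_factor + (1.0 - start_factor) * frac)
```

In a training loop you would compute `warmup_lr(step)` before each optimizer update and write it into the optimizer's learning rate, so the first iterations run at about 1e-4 and only later reach the full 1e-3.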
from bidet.
Thank you very much for your explanation, I will try again!