ziweiwangthu / bidet
This is the official PyTorch implementation of the paper "BiDet: An Efficient Binarized Object Detector", accepted at CVPR 2020.
License: MIT License
Hi, I am working on reproducing your results, but I can't seem to find the model settings for BiDet-SC, which you report in the paper. Can you please point me to the file where I can see the skip connections you built into the BiDet-SC version of BiDet? Thank you.
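Hello, Doctor,
From your profile you appear to be Chinese, so I will describe my question directly in Chinese; I hope you can answer it.
Can the models involved in BiDet achieve results on COCO comparable to YOLOv3's, i.e., produce a similar mAP?
For example, an mAP of around 30 on COCO.
I saw in some issues of this repository that you mention a ResNet18-based model with an mAP of 14.4, and the paper reports an mAP of 15.7 on COCO. Are these the best values the models in this repository can reach, or is there room for improvement, i.e., can they go higher?
Best wishes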
Do you have any plans to release the code of the Auto-Bidet?
I'm very interested in it!
Thank you very much!
(base) fsr@3090:~/BiDet-master$ python ssd/train_bidet_ssd.py --dataset='VOC/COCO' --data_root='path/to/dataset' --basenet='path/to/pretrain_backbone'
Traceback (most recent call last):
File "ssd/train_bidet_ssd.py", line 414, in
train()
File "ssd/train_bidet_ssd.py", line 123, in train
ssd_net = build_bidet_ssd('train', cfg['min_dim'], cfg['num_classes'],
UnboundLocalError: local variable 'cfg' referenced before assignment
How can I fix this?
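A possible explanation, offered as a sketch rather than a confirmed diagnosis: the command above passes the README placeholder `VOC/COCO` literally. If the script selects `cfg` in an `if`/`elif` chain on the dataset name (an assumption about `train_bidet_ssd.py`), neither branch runs and `cfg` is never assigned. The function `select_cfg` and the dictionary contents below are hypothetical and only illustrate the failure mode and a louder alternative.

```python
# Hypothetical stand-ins for the real configuration dictionaries.
VOC_CFG = {"min_dim": 300, "num_classes": 21}
COCO_CFG = {"min_dim": 300, "num_classes": 81}


def select_cfg(dataset: str) -> dict:
    """Return the config for a dataset name, failing loudly on unknown names."""
    if dataset == "VOC":
        return VOC_CFG
    if dataset == "COCO":
        return COCO_CFG
    # Without an explicit error here, a caller that assigns cfg only inside
    # the two branches above would hit UnboundLocalError later on.
    raise ValueError(f"unknown dataset {dataset!r}; expected 'VOC' or 'COCO'")


if __name__ == "__main__":
    print(select_cfg("VOC")["num_classes"])
    try:
        select_cfg("VOC/COCO")  # the literal placeholder from the command
    except ValueError as err:
        print(err)
```

If this is indeed the cause, passing a single dataset name (and real paths for the other two arguments) should avoid the error.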
Hi, I really appreciate your valuable work.
I just wonder which part of the SSD network is binarized.
In your paper, the SSD model consists of a backbone network and a detection network.
Are both parts binarized?
If not, which layers have to be kept with full precision?
It's nice to see your work. If I want to train on my own dataset, which parts should I modify?
Hi! Have you restructured your network of SSD300_Vgg16?
Dear authors,
How long does your model need to train?
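I noticed that you mention using the thop library to compute the parameter count and the amount of computation.
My understanding is that this library does not implement counting rules for custom modules. How did you count your binarized modules, and is that part of the code open source? I would very much like to learn from it.
When training, you only used the training set and the test set, without a validation set (there is no evaluation during training). Is that right?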
I sent an email to the authors roughly 2 weeks ago about the pretrained models but haven't received a response yet (maybe the email didn't go through?), so I'm repeating the question here.
For obtaining the pretrained weights of ResNet18/VGG16, do you train the networks as a floating point network or do you binarize the networks like in XNOR-Net/Bi-Real Net and then train them on ImageNet to obtain the pretrained weights?
I'm trying to use different backbone networks and an answer to this question would help me in obtaining the pretrained weights for my networks.
Hello, I encountered a problem.
During training, because my GPU memory is small, I changed batch_size in train_bidet_ssd.py to 16 and num_worker to 4, while lr remained at the default 1e-3.
However, during training the loss becomes NaN at around 30,000 iterations, and when I evaluate the model parameters saved before the loss became NaN, the AP and mAP are both around 0.0xx.
Moreover, when I reduce the learning rate from the default to 1e-4, the loss no longer becomes NaN, but the model still does not produce correct evaluation results.
Is there some configuration step I have missed, and how can I solve this problem?
Hi guys, thank you for the awesome implementation. I have a question about bidet_vgg in the SSD implementation. Your bidet_vgg has extra layers, and you have replaced all the max-pool layers with downsampling conv layers. Why did you make these changes? Is the original VGG architecture too low in complexity to learn once it is binarized?
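Hello, Doctor!
Regarding the bbox_head of Faster R-CNN: I tried binarizing it in my own code, but the accuracy dropped to zero. I restored this part to full precision and confirmed that binarization was the cause. May I ask whether, in your code, you binarized the last two shared_fcs layers of the bbox head? Specifically, the following structure: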
(shared_fcs): ModuleList( 13.896 M, 29.052% Params, 13.894 GFLOPs, 54.616% FLOPs, (0): Linear(12.846 M, 26.857% Params, 12.845 GFLOPs, 50.494% FLOPs, in_features=12544, out_features=1024, bias=True) (1): Linear(1.05 M, 2.194% Params, 1.049 GFLOPs, 4.122% FLOPs, in_features=1024, out_features=1024, bias=True) )
Hi,
I am trying to train Faster R-CNN on the COCO dataset. I have downloaded the COCO dataset using the script you provide in the repo. When I follow the instructions and start training the model, I get the following error:
`Traceback (most recent call last):
File "faster_rcnn/trainval_net.py", line 222, in <module>
imdb, roidb, ratio_list, ratio_index = combined_roidb(args.imdb_name)
File "/media/Rozhok/BiDet/faster_rcnn/lib/roi_data_layer/roidb.py", line 119, in combined_roidb
roidbs = [get_roidb(s) for s in imdb_names.split('+')]
File "/media/Rozhok/BiDet/faster_rcnn/lib/roi_data_layer/roidb.py", line 119, in <listcomp>
roidbs = [get_roidb(s) for s in imdb_names.split('+')]
File "/media/Rozhok/BiDet/faster_rcnn/lib/roi_data_layer/roidb.py", line 112, in get_roidb
imdb = get_imdb(imdb_name)
File "/media/Rozhok/BiDet/faster_rcnn/lib/datasets/factory.py", line 38, in get_imdb
return __sets[name]()
File "/media/Rozhok/BiDet/faster_rcnn/lib/datasets/factory.py", line 31, in <lambda>
__sets[name] = (lambda split=split, year=year: coco(split, year))
File "/media/Rozhok/BiDet/faster_rcnn/lib/datasets/coco.py", line 39, in __init__
self._COCO = COCO(self._get_ann_file())
File "/home/biometrics/.virtualenvs/retinanet/lib/python3.6/site-packages/pycocotools/coco.py", line 84, in __init__
with open(annotation_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/biometrics/data/coco/annotations/instances_valminusminival2014.json'
I don't have instances_valminusminival2014.json in my annotations folder. Where can I get it?
Sorry to bother you. I used Faster R-CNN to test on COCO. The mAP is 12.7%, which is lower than the result in the paper (15.7%).
The parameters I used are the defaults in trainval_net.py. How many epochs did you train for? Thanks for your time. I am waiting for your reply.
`
def _make_layer(self, block, planes, blocks, stride=1, **kwargs):
    downsample = None
    if stride != 1 or self.inplanes != planes * block.expansion:
        conv = nn.Conv2d
        ds_out_planes = planes * block.expansion
        downsample = nn.Sequential(
            nn.AvgPool2d(2, stride=stride, ceil_mode=True),
            conv(self.inplanes, ds_out_planes, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(ds_out_planes)
        )
    layers = []
    layers.append(block(self.inplanes, planes, stride, downsample, **kwargs))
    self.inplanes = planes * block.expansion
    for _ in range(1, blocks):
        layers.append(block(self.inplanes, planes, **kwargs))
    return nn.Sequential(*layers)
`
Hello, Doctor. In the code above, the convolution on the fourth line is not binarized. May I ask why?
Through this issue, I've fixed the problem with the prior/reg loss weights as per the author's response (add 1e-6 to avoid divide by zero).
However, I noticed that my loc_loss and reg_loss became NaN.
I retried with gradient clipping by setting the --clip_grad option to true.
My loc_loss and reg_loss still became NaN at 150k iterations, and the training failed.
The exact command I ran was the following:
python ssd/train_bidet_ssd.py --dataset VOC --data_root ./data/VOCdevkit/ --basenet ./ssd/pretrain/vgg16.pth --clip_grad true
Any help would be appreciated.
Hello, when compiling with setup.py I ran into this error: "error: torch/extension.h: No such file or directory". How can I solve this problem?
How long does it take to run detection on a single image?
Dear Author,
I'm sorry to bother you.
I noticed that BN (Batch Norm) is in train mode during the training of SSD. The four parameters of BN (alpha, beta, mean, variance) are updated together with the binary neural network.
Do I understand correctly?
Thank you very much.
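Using the original BiDet SSD parameters with args.reg_weight = 0 and args.prior_weight = 0 and everything else unchanged, I tested on the VOC dataset and got a mean AP of only zero point something. Did something go wrong in my training?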
Hi, Wang. Thanks for your great work. I have some questions; could you clarify the following?
The IB loss in the paper seems to be replaced by MC sampling in the code.
1. What does MC sampling mean? Is it Monte Carlo sampling?
2. What is the relationship between MC sampling and the IB principle?
3. The feature maps are L2-normalized in the code. What is the relationship between L2 normalization and MC sampling?
Thank you
Thanks for providing your source code.
My problem is that I used the default training configuration in Faster-RCNN/trainval_net.py, which trains the model for 50 epochs and decays the learning rate every 6 epochs. However, I only achieved 17.18% mAP on the test set.
How many epochs did you train for? Thanks for your time. I am waiting for your reply.
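Hello, I have the following questions about the binarization of the detection head:
1. In RPN_head, I noticed that besides binarizing the first convolution layer, you also changed the proposal layer, as in the code below: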
# define proposal layer
self.RPN_proposal = _ProposalLayer_IB(self.feat_stride, self.anchor_scales,
self.anchor_ratios, self.sample_sigma)
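May I ask what this change does? Does it correspond to the IB principle mentioned in the paper?
2. Regarding the binarization of the detection head, my understanding of the code is that, of RPN_Head and Roi_Head, only the first layer of RPN_Head is binarized. Is that correct?
Thank you very much!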
Hello, I encountered an error when executing the test command after training completed. The error message was: PermissionError: [Errno 13] Permission denied:'/path'. Can you suggest a solution? Thank you very much!
Hi, I am trying to reproduce the experiments in your paper and have run into some problems in the training part. I used the recommended command [python ssd/train_bidet_ssd.py --dataset="VOC" --data_root="D:\01DL\data\VOCdevkit" --basenet="D:\01DL\BiDet-master\ssd\pretrain\vgg16.pth"] to run the training and got the final model saved as "VOC_final.pth". However, "VOC_final.pth", which I take to be the parameters of the trained BiDet SSD model, is around 127 MB. According to the paper, the binary model should be around 20 MB, since it is built by "bidet_ssd" in bidet_ssd.py. I am not sure whether I ran your code incorrectly or missed a parameter setting.
In the test part, I got a TypeError as follows:
Hoping for your response, thanks.
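One possible reason for the size gap, stated as an assumption rather than a confirmed answer: a checkpoint that stores each weight as a 32-bit float does not shrink just because the values are used as +1 or -1 at run time; the signs would have to be packed into bits. The sketch below illustrates the arithmetic in plain Python. `pack_signs` and `unpack_signs` are hypothetical helpers, not part of the repository.

```python
def pack_signs(weights):
    """Pack the signs of a weight list into bytes, 8 weights per byte."""
    packed = bytearray((len(weights) + 7) // 8)
    for i, w in enumerate(weights):
        if w >= 0:  # encode +1 as bit 1, -1 as bit 0
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_signs(packed, count):
    """Recover the +1/-1 values from the packed bytes."""
    return [1 if (packed[i // 8] >> (i % 8)) & 1 else -1 for i in range(count)]


if __name__ == "__main__":
    weights = [0.7, -0.2, 1.3, -0.9, 0.1, -0.4, -0.8, 0.5, 0.3]
    packed = pack_signs(weights)
    print(len(weights) * 4, "bytes as float32")
    print(len(packed), "bytes packed")
    print(unpack_signs(packed, len(weights)))
```

Packing reduces storage for the binarized weights by roughly a factor of 32; layers kept in full precision would not shrink.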
In the paper, \beta and \gamma are described as being set to 10 and 0.2. However, the given code uses 0 as the default value for both, which essentially means the code will be run without the proposed additional loss terms. I would appreciate it if the authors could clarify how \beta and \gamma were actually used during training.
I prepared my dataset in VOC format, but training fails with an error and I don't know why. If it is convenient, could you please help me?
PS D:\project\vscode\BiDet> python .\ssd\train_bidet_ssd.py --data_root='D:\project\vscode\BiDet\data\ea89447d' --basenet='D:\project\vscode\BiDet\ssd\pretrain\vgg16.pth'
Loading base network...
Loading the dataset...
Training SSD on: VOC0712
Using the specified args:
Namespace(basenet='D:\project\vscode\BiDet\ssd\pretrain\vgg16.pth', batch_size=32, clip_grad=False, cuda=True, data_root='D:\project\vscode\BiDet\data\ea89447d', dataset='VOC', gamma=0.1, lr=0.001, momentum=0.9, nms_conf_threshold=0.03, num_workers=16, opt='Adam', prior_weight=0.0, reg_weight=0.0, resume=False, sigma=0.0, start_iter=0, weight_decay=0.0, weight_path=None)
C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py:474: UserWarning: This DataLoader will
create 16 worker processes in total. Our suggested max number of worker in current system is 6 (cpuset
is not taken into account), which is smaller than what this DataLoader is going to create. Please be
aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
Traceback (most recent call last):
File ".\ssd\train_bidet_ssd.py", line 418, in
train()
File ".\ssd\train_bidet_ssd.py", line 207, in train
images, targets = next(batch_iterator)
File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py", line 517, in next
data = self._next_data()
File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py", line 1199, in _next_data
return self._process_data(data)
File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data
data.reraise()
File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch_utils.py", line 429, in reraise
raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data_utils\worker.py", line 202, in _worker_loop
data = fetcher.fetch(index)
File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data_utils\fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data_utils\fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\project\vscode\BiDet\ssd\data\voc0712.py", line 116, in getitem
im, gt, h, w = self.pull_item(index)
File "D:\project\vscode\BiDet\ssd\data\voc0712.py", line 131, in pull_item
target = self.target_transform(target, width, height)
File "D:\project\vscode\BiDet\ssd\data\voc0712.py", line 73, in call
label_idx = self.class_to_ind[name]
KeyError: 'warning'
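The `KeyError: 'warning'` suggests that an annotation contains an object named `warning` that is not in the class list used to build `class_to_ind`. Below is a minimal sketch for finding such labels before training, assuming VOC-style XML with `<object><name>` elements. `find_unknown_labels` and the shortened class tuple are illustrative and not taken from the repository.

```python
import xml.etree.ElementTree as ET

# Illustrative subset; replace with the class names your dataset really uses.
KNOWN_CLASSES = ("aeroplane", "bicycle", "bird")


def find_unknown_labels(xml_text, known=KNOWN_CLASSES):
    """Return the object names in one VOC-style annotation that are not known."""
    root = ET.fromstring(xml_text.strip())
    names = (obj.findtext("name", default="").strip().lower()
             for obj in root.iter("object"))
    return sorted({n for n in names if n not in known})


if __name__ == "__main__":
    sample = """
    <annotation>
      <object><name>bird</name></object>
      <object><name>warning</name></object>
    </annotation>
    """
    print(find_unknown_labels(sample))
```

Running a check like this over every annotation file would list the labels that need to be added to the class list or removed from the data.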
When you pre-train VGG on ImageNet, do you add BN or shortcuts (SC)?
If you don't, will the parameters fail to match when you load the pretrained model into BiDet?
Hello! Could you provide a trained SSD model with the BiDet strategy? I need to use it for comparative analysis.
Although I use the tricks proposed in existing issues, NaN always appears during training.
Thank you very much!
Hello, I'm sorry to bother you. I ran into some problems when training BiDet's Faster R-CNN and would like to consult you:
During training, in the first few epochs, the loss becomes NaN in some iterations.
My training command is: python faster_rcnn/trainval_net.py --dataset='coco' --data_root='data/coco' --basenet='pretrain/resnet18.pth' --mGPUs=True
I set RPN_PRIOR_WEIGHT, RPN_REG_WEIGHT, HEAD_PRIOR_WEIGHT, and HEAD_REG_WEIGHT to 0.2, 0.1, 0.2, and 0.1 respectively from the beginning.
Have you ever encountered this problem in the training process?
Hello, I have now successfully trained the model, but testing fails with the error: Network is not defined. My training command is
python faster_rcnn/trainval_net.py --dataset='coco' --data_root='/home/user/Desktop/wam/BiDet/data' --basenet='/home/user/Desktop/wam/BiDet/faster_rcnn/pretrain/resnet18.pth'
and my test command is:
python test_net.py --dataset='coco' --checkpoint='./logs/coco/bidet18_IB/2021-07-06 20:46:25/model_50_loss_0.771_lr_1.0000000000000002e-06_rpn_cls_0.1611_rpn_bbox_0.0987_rcnn_cls_0.3137_rcnn_bbox_0.197_rpn_prior_0.0_rpn_reg_0.0005_head_prior_0.0_head_reg_0.0.pth'
Could you give me some advice? Thank you.
Hi, could you tell me the size of your SSD or Faster R-CNN model? I found that my own trained Faster R-CNN model takes 142.14 MB of space! It is still too large.
According to your paper https://arxiv.org/pdf/2003.03961.pdf, an input size of 600×1000 is used for the binarized Faster R-CNN. However, I have read your code carefully: in the BiDetResNet model, the first conv layer is 'nn.Conv2d(3, first_inplanes, kernel_size=7, stride=2, padding=3, bias=False)'. With a 600×1000 input, its FLOPs are 1,411,200,000 = 1345.8M, as calculated by torchstat. The FLOPs of this first layer alone exceed the 781M that you report in the paper for the whole binarized Faster R-CNN. Please check this, thanks.
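The figure quoted above can be reproduced by hand. The sketch below counts multiply-accumulates for the first convolution, assuming `first_inplanes` is 64 (not stated in the issue) and a 600×1000 input. `conv_output_size` and `conv_macs` are hypothetical helper names.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_macs(height, width, in_ch, out_ch, kernel, stride, padding):
    """Multiply-accumulate count of a square-kernel convolution without bias."""
    out_h = conv_output_size(height, kernel, stride, padding)
    out_w = conv_output_size(width, kernel, stride, padding)
    return out_h * out_w * out_ch * in_ch * kernel * kernel


if __name__ == "__main__":
    # first_inplanes = 64 is an assumption, not a value given in the issue.
    macs = conv_macs(600, 1000, in_ch=3, out_ch=64, kernel=7, stride=2, padding=3)
    print(macs)
    print(round(macs / 2**20, 1))
```

Under these assumptions the output map is 300×500 and the count is 1,411,200,000, which corresponds to the 1345.8M in the issue when divided by 2^20.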
Hello, while reproducing your SSD model on the VOC dataset I ran into this error: adjust_learning_rate(optimizer, args.gamma, step_index)
TypeError: adjust_learning_rate() takes 2 positional arguments but 3 were given
I see that your definition, def adjust_learning_rate(optimizer, new_lr), takes only two arguments.
Is there something I have misunderstood?
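The traceback shows the call site passing three arguments to a function defined with two. One way to reconcile them, sketched under the assumption that step decay (`lr * gamma ** step_index`) is what the call site intends, is to compute the new rate first and pass it in. `FakeOptimizer` and `decayed_lr` are hypothetical stand-ins so that the example runs without PyTorch.

```python
class FakeOptimizer:
    """Stand-in exposing param_groups like a torch.optim optimizer."""

    def __init__(self, lr):
        self.param_groups = [{"lr": lr}]


def adjust_learning_rate(optimizer, new_lr):
    """Two-argument form, matching the definition quoted in the issue."""
    for group in optimizer.param_groups:
        group["lr"] = new_lr


def decayed_lr(base_lr, gamma, step_index):
    """Assumed step-decay rule: multiply by gamma once per completed step."""
    return base_lr * gamma ** step_index


if __name__ == "__main__":
    opt = FakeOptimizer(lr=1e-3)
    adjust_learning_rate(opt, decayed_lr(1e-3, 0.1, 1))
    print(opt.param_groups[0]["lr"])
```

Whether this decay rule matches the schedule the authors used is not confirmed by the issue.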
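Hello, I converted BDD100K to VOC format and trained on it. I lowered lr to a very small value, but the loss stays at inf during training. Could you help me analyze this?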
Hi,
I downloaded your trained model BiDet-SSD300-VOC_66.0.pth and want to load it into the network in bidet_ssd.py, but I ran into a size mismatch as follows:
RuntimeError: Error(s) in loading state_dict for BiDetSSD:
size mismatch for conf.0.weight: copying a param with shape torch.Size([84, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([324, 512, 3, 3]).
size mismatch for conf.0.bias: copying a param with shape torch.Size([84]) from checkpoint, the shape in current model is torch.Size([324]).
size mismatch for conf.1.weight: copying a param with shape torch.Size([126, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([486, 1024, 3, 3]).
size mismatch for conf.1.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([486]).
size mismatch for conf.2.weight: copying a param with shape torch.Size([126, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([486, 512, 3, 3]).
size mismatch for conf.2.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([486]).
size mismatch for conf.3.weight: copying a param with shape torch.Size([126, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([486, 256, 3, 3]).
size mismatch for conf.3.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([486]).
size mismatch for conf.4.weight: copying a param with shape torch.Size([84, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([324, 256, 3, 3]).
size mismatch for conf.4.bias: copying a param with shape torch.Size([84]) from checkpoint, the shape in current model is torch.Size([324]).
size mismatch for conf.5.weight: copying a param with shape torch.Size([84, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([324, 256, 3, 3]).
size mismatch for conf.5.bias: copying a param with shape torch.Size([84]) from checkpoint, the shape in current model is torch.Size([324]).
Can you advise how to fix it?
Thanks
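The shapes in the error are consistent with a class-count mismatch: in an SSD confidence head, the output channels of each layer equal the number of default boxes per location times the number of classes. Here is a quick check, assuming 4 or 6 boxes per location as in the standard SSD300 layout. `infer_num_classes` is a hypothetical helper.

```python
def infer_num_classes(out_channels, boxes_per_location):
    """Class count implied by a confidence layer's output channels."""
    classes, remainder = divmod(out_channels, boxes_per_location)
    if remainder:
        raise ValueError("channels are not a multiple of the box count")
    return classes


if __name__ == "__main__":
    # (checkpoint channels, current-model channels, assumed boxes per location)
    for ckpt, model, boxes in [(84, 324, 4), (126, 486, 6)]:
        print(infer_num_classes(ckpt, boxes), infer_num_classes(model, boxes))
```

Both pairs give 21 classes for the checkpoint and 81 for the current model, which would fit a VOC-trained checkpoint being loaded into a network built with a COCO class count. Building the network with 21 classes may resolve the mismatch.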
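I noticed that the original Faster R-CNN code includes an FPN layer between the backbone and the RPN head, but I could not find this structure in your code.
Did you use some strategy to replace FPN, or what modifications did you make? I would be grateful if you could answer.
Hello, I am studying your work and have a question.
When I use thop to compute the params and MFLOPs of SSD300, the number of floating-point operations is roughly the same, but the params come out as 26.28M rather than the 100.28M stated in the paper. Why?
Regarding parameter reduction: compared with VGG and ResNet-18, binarizing MobileNet reduces the parameter count only slightly. What causes this?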
Hello! Could I discuss three questions with you? Thank you.
Did you find that BinaryConv for 1×1 convolutions performs poorly in the detection task?
Do binary 1×1 and binary 3×3 convolutions behave the same for the object detection task?
Is the BiDet method suitable for YOLOv3?
Hi, I really appreciate your excellent work.
Like many other open-source binary quantization repositories, I notice that you implement BinarizeConv2d on top of torch.nn.functional.conv2d. For training this is fine. For inference, however, the absence of an XNOR-bitcount-based convolution seems to keep this excellent work from reaching its full advantage.
Have you implemented this, or do you intend to? Thanks very much.
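For readers unfamiliar with the XNOR-bitcount formulation mentioned above, here is a minimal sketch in plain Python. It shows that the dot product of two ±1 vectors of length n equals `n - 2 * popcount(a XOR b)` when +1 is encoded as bit 1 and -1 as bit 0. `encode` and `binary_dot` are hypothetical helpers; this is not code from the repository.

```python
def encode(values):
    """Pack a list of +1/-1 values into an int, +1 -> bit 1, -1 -> bit 0."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    mismatches = bin(a_bits ^ b_bits).count("1")  # positions where signs differ
    return n - 2 * mismatches


if __name__ == "__main__":
    a = [1, -1, 1, 1, -1]
    b = [1, 1, -1, 1, -1]
    print(binary_dot(encode(a), encode(b), len(a)))
    print(sum(x * y for x, y in zip(a, b)))  # reference result
```

A real inference kernel would apply the same identity to machine words with hardware popcount instructions; the Python version only demonstrates the equivalence.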
How did you get the FLOPs of DoReFa-Net, 4694M? What is the calculation rule?
Dear,
I am trying to deploy these models on a Xilinx FPGA using the FINN framework. Are you aware of any incompatibilities? Currently, I am having many issues while converting to the ONNX format.
Thanks
When you run comparison experiments, for example using the Bi-Real method to binarize the network, is the SSD_vgg16 structure used with Bi-Real the same as the SSD_vgg16 used in BiDet (SC)? The structural changes include adding BN, removing max-pool layers, and so on.