uoip / ssd-variants

PyTorch implementation of several SSD based object detection algorithms.

License: MIT License

Python 100.00%

ssd object-detection convolutional-networks deep-learning pytorch one-stage

ssd-variants's Introduction

This is a learning project that tries to implement some variants of SSD in PyTorch. SSD is a one-stage object detector, probably "currently the best detector with respect to the speed-vs-accuracy trade-off". Many follow-up papers either further improve the detection accuracy, incorporate techniques such as image segmentation for scene understanding (e.g. BlitzNet), modify SSD to detect rotatable objects (e.g. DRBox), or apply SSD to 3D object detection (e.g. Frustum PointNets):

SSD - "SSD: Single Shot MultiBox Detector" (2016) arXiv:1512.02325 , github
DSSD - "DSSD : Deconvolutional Single Shot Detector" (2017) arXiv:1701.06659
RRC - "Accurate Single Stage Detector Using Recurrent Rolling Convolution" (2017) arXiv:1704.05776 , github
RUN - "Residual Features and Unified Prediction Network for Single Stage Detection" (2017) arXiv:1707.05031
DSOD - "DSOD: Learning Deeply Supervised Object Detectors from Scratch" (2017) arXiv:1708.01241 , github
BlitzNet - "BlitzNet: A Real-Time Deep Network for Scene Understanding" (2017) arXiv:1708.02813 , github
RefineDet - "Single-Shot Refinement Neural Network for Object Detection" (2017) arXiv:1711.06897 , github
DRBox - "Learning a Rotation Invariant Detector with Rotatable Bounding Box" (2017) arXiv:1711.09405 , github
Frustum PointNets - "Frustum PointNets for 3D Object Detection from RGB-D Data" (2017) arXiv:1711.08488

Overview

Model	publish time	Backbone	input size	Boxes	FPS	VOC07	VOC12	COCO
SSD300	2016	VGG-16	300 × 300	8732	46	77.2	75.9	25.1
SSD512	2016	VGG-16	512 × 512	24564	19	79.8	78.5	28.8
SSD321	2017.01	ResNet-101	321 × 321	17080	11.2	77.1	75.4	28.0
SSD513	2017.01	ResNet-101	513 × 513	43688	6.8	80.6	79.4	31.2
DSSD321	2017.01	ResNet-101	321 × 321	17080	9.5	78.6	76.3	28.0
DSSD513	2017.01	ResNet-101	513 × 513	43688	5.5	81.5	80.0	33.2
RUN300	2017.07	VGG-16	300 × 300	11640	64 (Pascal)	79.1	77.0
DSOD300	2017.08	DS/64-192-48-1	300 × 300		17.4	77.7	76.3	29.3
BlitzNet300	2017.08	ResNet-50	300 × 300	45390	24	78.5	75.4	29.7
BlitzNet512	2017.08	ResNet-50	512 × 512	32766	19.5	80.7	79.0	34.1
RefineDet320	2017.11	VGG-16	320 × 320	6375	40.3	80.0	78.1	29.4
RefineDet512	2017.11	VGG-16	512 × 512	16320	24.1	81.8	80.0	33.0
RefineDet320	2017.11	ResNet-101	320 × 320					32.0
RefineDet512	2017.11	ResNet-101	512 × 512					36.4

RRC	2017.04	VGG-16	1272 × 375
DRBox	2017.11	VGG-16	300 × 300
Frustum PointNets rgb part	2017.11	VGG-16	1280 × 384

FPS: number of images processed per second on a Titan X GPU (batch size 1)
VOC07: PASCAL VOC 2007 detection results (mAP); training data: 07+12 (07 trainval + 12 trainval)
VOC12: PASCAL VOC 2012 detection results (mAP); training data: 07++12 (07 trainval + 07 test + 12 trainval)
COCO: MS COCO 2015 test-dev detection results (mAP@[0.5:0.95]); trained on trainval35k
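
The Boxes column for the plain VGG-16 SSD models can be reproduced by summing the default boxes over all prediction layers. The sketch below assumes the feature-map sizes and boxes-per-location from the original SSD paper's configuration; these values are not read from this repository's code, and the function name is illustrative.

```python
def count_default_boxes(feature_map_sizes, boxes_per_location):
    """Total default boxes = sum over layers of (size * size * boxes per cell)."""
    if len(feature_map_sizes) != len(boxes_per_location):
        raise ValueError("need one boxes-per-location entry per feature map")
    return sum(s * s * k for s, k in zip(feature_map_sizes, boxes_per_location))


# Assumed configurations (SSD paper, VGG-16 backbone).
SSD300 = ([38, 19, 10, 5, 3, 1], [4, 6, 6, 6, 4, 4])
SSD512 = ([64, 32, 16, 8, 4, 2, 1], [4, 6, 6, 6, 6, 4, 4])

print(count_default_boxes(*SSD300))  # 8732
print(count_default_boxes(*SSD512))  # 24564
```

The two totals match the SSD300 and SSD512 rows of the table; the other models use different layer layouts, so their counts need their own configurations.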

All backbone networks above were pre-trained on the ImageNet CLS-LOC dataset, except DSOD, which is trained from scratch.

Implemented

Note: "Implemented" above means that the model code is almost done; it does not mean that I have trained it, let alone reproduced the results of the original paper. So far I have only trained SSD300 on VOC07; the best result I got is 76.5%, lower than the 77.2% reported in the SSD paper. I'll continue this project once I find out what the problem is.

Requirements

Python 3.6+
numpy
cv2
pytorch
tensorboardX

Dataset

Download the VOC2007 and VOC2012 datasets and put them under the VOCdevkit directory:

VOCdevkit
-| VOC2007
   -| Annotations
   -| ImageSets
   -| JPEGImages
   -| SegmentationClass
   -| SegmentationObject
-| VOC2012
   -| Annotations
   -| ImageSets
   -| JPEGImages
   -| SegmentationClass
   -| SegmentationObject
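
Before training, it can help to confirm that the directory tree above is in place. This is a small sketch, assuming the five sub-directories listed are all wanted; the helper name `missing_voc_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Layout taken from the directory tree shown in this README.
YEARS = ("VOC2007", "VOC2012")
SUBDIRS = (
    "Annotations",
    "ImageSets",
    "JPEGImages",
    "SegmentationClass",
    "SegmentationObject",
)


def missing_voc_dirs(voc_root):
    """Return the expected 'year/subdir' entries that do not exist under voc_root."""
    root = Path(voc_root)
    return [
        f"{year}/{sub}"
        for year in YEARS
        for sub in SUBDIRS
        if not (root / year / sub).is_dir()
    ]


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "VOCdevkit"
    missing = missing_voc_dirs(root)
    print("layout OK" if not missing else "missing: " + ", ".join(missing))
```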

Usage

train:

python train.py --cuda --voc_root path/to/your/VOCdevkit --backbone path/to/your/vgg16_reducedfc.pth

The backbone network vgg16_reducedfc.pth comes from the repo amdegroot/ssd.pytorch (download link: https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth).

evaluate:

python train.py --cuda --test --voc_root path/to/your/VOCdevkit --checkpoint path/to/your/xxx.pth

show demo:

python train.py --cuda --demo --voc_root path/to/your/VOCdevkit --checkpoint path/to/your/xxx.pth
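
For orientation, the flags used in the commands above, together with the Namespace printed in one of the issue reports further down, suggest a command-line interface along these lines. This is a reconstruction, not the actual train.py: the defaults for lr, seed, start_iter, threads and checkpoint are the values shown in that log, while the batch_size default, the backbone default and making voc_root required are placeholders chosen for the sketch.

```python
import argparse


def build_parser():
    """Sketch of a parser consistent with the flags shown in this README."""
    p = argparse.ArgumentParser(description="SSD variants (illustrative parser)")
    p.add_argument("--cuda", action="store_true", help="run on GPU")
    p.add_argument("--test", action="store_true", help="evaluate a checkpoint")
    p.add_argument("--demo", action="store_true", help="show detections")
    p.add_argument("--voc_root", type=str, required=True)   # placeholder: required
    p.add_argument("--backbone", type=str, default="")      # placeholder default
    p.add_argument("--checkpoint", type=str, default="")    # value seen in the log
    p.add_argument("--batch_size", type=int, default=32)    # placeholder default
    p.add_argument("--lr", type=float, default=0.001)       # value seen in the log
    p.add_argument("--seed", type=int, default=233)         # value seen in the log
    p.add_argument("--start_iter", type=int, default=0)     # value seen in the log
    p.add_argument("--threads", type=int, default=4)        # value seen in the log
    return p


args = build_parser().parse_args(
    ["--cuda", "--voc_root", "path/to/your/VOCdevkit",
     "--backbone", "path/to/your/vgg16_reducedfc.pth"]
)
print(args)
```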

Results

VOC07 mAP

models	my result	paper result
SSD300	76.5%	77.2%
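
Under the PASCAL VOC 2007 protocol, mAP is the mean over classes of an 11-point interpolated average precision. The following is a minimal sketch of that metric, assuming the per-class precision and recall values have already been computed from ranked detections; the function names are illustrative and are not taken from this repository's evaluation code.

```python
def voc07_average_precision(recalls, precisions):
    """11-point interpolated AP: average, over recall thresholds 0.0, 0.1, ..., 1.0,
    of the best precision reached at a recall >= threshold."""
    if len(recalls) != len(precisions):
        raise ValueError("recalls and precisions must have the same length")
    ap = 0.0
    for i in range(11):
        threshold = i / 10.0
        reachable = [p for r, p in zip(recalls, precisions) if r >= threshold]
        ap += (max(reachable) if reachable else 0.0) / 11.0
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the plain mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Toy example: three ranked detections for one class with two ground-truth objects.
recalls = [0.5, 0.5, 1.0]
precisions = [1.0, 0.5, 2.0 / 3.0]
print(round(voc07_average_precision(recalls, precisions), 4))
```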

to be continued

ssd-variants's Issues

DSSD model?

Could you please share the pretrained weights for all models?

Could you help me get the pre-trained weights? Are they available for this analysis?

DSOD map accuracy

Did anyone manage to obtain mAP = 77.6% for the DSOD model using this repo?

SSD implementation

We trained an SSD model using this repo and obtained an mAP of only 74%. Is there any reason for that?

DSOD test

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

When I change "SSD" to "DSOD", this error occurs. Could you tell me how to solve it?
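
PyTorch's batch normalization layers raise this ValueError when, in training mode, a channel receives only a single value, for example a batch of one image whose feature map has shrunk to 1 × 1. The sketch below reproduces the message with a standalone nn.BatchNorm2d layer (not this repository's DSOD model) and shows two common workarounds. Which one applies here is an assumption, since the report does not state the batch size that was used.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(256)
single = torch.randn(1, 256, 1, 1)  # same shape as in the error message

# Training mode: batch statistics cannot be estimated from one value per channel.
bn.train()
try:
    bn(single)
    train_error = None
except ValueError as err:
    train_error = str(err)
print(train_error)

# Workaround 1: keep at least two samples in every training batch, e.g. by
# passing drop_last=True to the DataLoader so a trailing batch of size 1 is skipped.
out_pair = bn(torch.randn(2, 256, 1, 1))

# Workaround 2: for testing or demos, switch to eval mode, which normalizes
# with the stored running statistics instead of batch statistics.
bn.eval()
with torch.no_grad():
    out_single = bn(single)
print(out_pair.shape, out_single.shape)
```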

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Hello.
I've tried to run this code with

python train.py --cuda --voc_root ~/data/VOCdevkit --batch_size 16 --backbone ./vgg16_reducedfc.pth

but it produces the errors below

argparser: Namespace(backbone='./vgg16_reducedfc.pth', batch_size=16, checkpoint='', cuda=True, demo=False, lr=0.001, seed=233, start_iter=0, test=False, threads=4, voc_root='/home/han/data/VOCdevkit')
./models/SSD.py:22: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
  nn.init.constant(self.weight, scale)
./models/SSD.py:111: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
  nn.init.xavier_uniform(m.weight.data)
Backbone loaded!
/home/han/virtualenv/py36/lib/python3.6/site-packages/torch/nn/functional.py:52: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
  warnings.warn(warning.format(ret))
Traceback (most recent call last):
  File "train.py", line 268, in <module>
    train()
  File "train.py", line 149, in train
    loss.backward()
  File "/home/han/virtualenv/py36/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/han/virtualenv/py36/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True) # allow_unreachable flag

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

So I changed all the inplace=True options in SSD.py.
For example, in line 81,

from x = F.relu(m(x), inplace=True) to x = F.relu(m(x), inplace=False).

But the same error still occurs.

Is there any solution?

My environment exactly matches the required environment
(Python 3.6 with the other requirements).

Low Volatile GPU-Util caused by a transform step that takes too much time

I have noticed that this implementation moves some of the box assignment done in ssd.pytorch from the GPU to the CPU: the dataset returns preprocessed bboxes (bboxes.shape=[8732, 4]) instead of ground-truth boxes. In my case, the transform step on my custom dataset is about 10 times slower than on the original Pascal VOC, so my GPUs are starved by slow CPU data processing.
I measured the transform time with Python 3's time library. Here is the comparison.
Both transform times were measured with the following code:
t0 = time.time()
if self.transform is not None:
    img, bboxes = self.transform(img, bboxes)
t1 = time.time()
delta = t1 - t0
print('sec', delta%60)
In Pascal VOC:

In my dataset:

Of course, I have already checked the time spent in the other parts of the __getitem__ method; they are nearly identical, so I omit the details.
The only reason I can think of is the difference in image size between the two datasets. VOC images have shapes around (375, 500, 3) or (500, 334, 3) [cv2.imread, HWC], whereas the images in my dataset have shapes like (1080, 1920, 3), (1078, 1916, 3), (1500, 2000, 3), which are much bigger than VOC's.
Is there any other possible reason? The transform is rather too complex for me. Please help me speed up my dataloader! THANKS!
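
If image size is indeed the cause, one option is to shrink large images (and their boxes) once, before the augmentation pipeline runs, so that every transform step works on VOC-sized arrays. The helper below is only a sketch of the size and box arithmetic under that assumption: its name, the max_side default of 500 (chosen to match the VOC shapes quoted above) and the assumption that boxes are pixel-coordinate (xmin, ymin, xmax, ymax) tuples are all illustrative and not taken from this repository.

```python
def downscale_for_transform(height, width, boxes, max_side=500):
    """Return (new_height, new_width, scaled_boxes) with the longest side <= max_side.

    boxes: iterable of (xmin, ymin, xmax, ymax) in pixels of the original image.
    Images that are already small enough are returned unchanged.
    """
    longest = max(height, width)
    if longest <= max_side:
        return height, width, [tuple(b) for b in boxes]

    scale = max_side / longest
    new_h = max(1, round(height * scale))
    new_w = max(1, round(width * scale))
    # Scale each axis by the ratio actually applied after rounding.
    sx, sy = new_w / width, new_h / height
    scaled = [(x0 * sx, y0 * sy, x1 * sx, y1 * sy) for x0, y0, x1, y1 in boxes]
    return new_h, new_w, scaled


# The image itself would be resized to match, e.g. cv2.resize(img, (new_w, new_h)).
print(downscale_for_transform(1080, 1920, [(0, 0, 1920, 1080)]))
```

Doing this resize offline, once per image, keeps the per-sample cost out of the dataloader entirely; whether it closes the 10x gap would need to be measured with the same timing code.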

Inplace Operations Error

Hey,

I am trying to run this with pytorch 1.0.0 and get the following error

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Any suggestions?

Thanks!

Any plan to extend it for coco dataset?

RefineDet model?

Hi,

Thanks so much for your sharing. I saw you have the RefineDet performance, however, in the models folder I did not see the model file. Could you kindly tell me how to access it?

Thanks.

blitznet always produces nans

Hi,

basically what the title says...
Could you provide a simple gist to train blitznet?

Thanks in advance

Encounter the RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

A RuntimeError occurred when I tried to train SSD on the VOC dataset.
You can find more background on the PyTorch forum:
https://discuss.pytorch.org/t/encounter-the-runtimeerror-one-of-the-variables-needed-for-gradient-computation-has-been-modified-by-an-inplace-operation/836
In my case, I changed how the x variable is used in SSD.py and it worked for me. I hope it helps you too!
Line 25 in model/SSD.py
x /= (x.pow(2).sum(dim=1, keepdim=True).sqrt() + 1e-10)
to
x /= (x.clone().pow(2).sum(dim=1, keepdim=True).sqrt() + 1e-10)
You may hit the same issue when implementing other models such as SSD512; you can debug it yourself with PyTorch's anomaly detection feature. Change the code in train.py:
Line 268 in train.py
train()
to
with torch.autograd.set_detect_anomaly(True):
    train()
For more information:
pytorch/pytorch#15803
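
To see why the normalization line triggers the error, note that x.pow(2) asks autograd to remember x for the backward pass, and the in-place division then overwrites that remembered tensor. The sketch below isolates the pattern on a random tensor and contrasts it with an out-of-place variant. The out-of-place form is an alternative to the x.clone() change described above, not the repository's code, and the function names are illustrative.

```python
import torch


def l2norm_inplace(x):
    # Same pattern as the quoted SSD.py line: x is overwritten in place,
    # although autograd still needs its old values to differentiate x.pow(2).
    x /= (x.pow(2).sum(dim=1, keepdim=True).sqrt() + 1e-10)
    return x


def l2norm_out_of_place(x):
    # Builds a new tensor and leaves x untouched for the backward pass.
    norm = x.pow(2).sum(dim=1, keepdim=True).sqrt() + 1e-10
    return x / norm


feats = torch.randn(2, 4, 3, 3, requires_grad=True)

# feats * 1.0 stands in for the output of an earlier layer (a non-leaf tensor).
try:
    l2norm_inplace(feats * 1.0).sum().backward()
    inplace_error = None
except RuntimeError as err:
    inplace_error = str(err)
print("in-place:", inplace_error)

out = l2norm_out_of_place(feats * 1.0)
out.sum().backward()
print("out-of-place grad shape:", feats.grad.shape)
```

The out-of-place version allocates one extra tensor per call, which is usually a small price for a backward pass that works across PyTorch versions.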

TypeError: __new__() got an unexpected keyword argument 'serialized_options'

Running train.py as follows produces this error:
(pytorch_p27) [ec2-user@ip-172-31-37-28 SSD-variants]$ python train.py --demo --voc_root /VOCdevkit --checkpoint vgg16_reducedfc.pth

  File "train.py", line 27, in <module>
    from tensorboardX import SummaryWriter
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/tensorboardX/__init__.py", line 5, in <module>
    from .torchvis import TorchVis
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/tensorboardX/torchvis.py", line 11, in <module>
    from .writer import SummaryWriter
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/tensorboardX/writer.py", line 27, in <module>
    from .event_file_writer import EventFileWriter
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/tensorboardX/event_file_writer.py", line 28, in <module>
    from .proto import event_pb2
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/tensorboardX/proto/event_pb2.py", line 15, in <module>
    from tensorboardX.proto import summary_pb2 as tensorboardX_dot_proto_dot_summary__pb2
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/tensorboardX/proto/summary_pb2.py", line 15, in <module>
    from tensorboardX.proto import tensor_pb2 as tensorboardX_dot_proto_dot_tensor__pb2
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/tensorboardX/proto/tensor_pb2.py", line 15, in <module>
    from tensorboardX.proto import resource_handle_pb2 as tensorboardX_dot_proto_dot_resource__handle__pb2
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/tensorboardX/proto/resource_handle_pb2.py", line 22, in <module>
serialized_pb=_b('\n(tensorboardX/proto/resource_handle.proto\x12\x0ctensorboardX"r\n\x13ResourceHandleProto\x12\x0e\n\x06\x64\x65vice\x18\x01 \x01(\t\x12\x11\n\tcontainer\x18\x02 \x01(\t\x12\x0c\n\x04name\x18\x03 \x01(\t\x12\x11\n\thash_code\x18\x04 \x01(\x04\x12\x17\n\x0fmaybe_type_name\x18\x05 \x01(\tB/\n\x18org.tensorflow.frameworkB\x0eResourceHandleP\x01\xf8\x01\x01\x62\x06proto3')
TypeError: __new__() got an unexpected keyword argument 'serialized_options'

Is there anyone able to achieve close to paper-reported accuracy using this repo code?

Is there anyone able to achieve close to paper-reported accuracy using this repo code? Thanks!

How can I train DSOD?

When I train DSOD, I do not load vgg16_reducedfc.pth, and I replaced "SSD" with "DSOD", but an error occurred: "IndexError: invalid index of a 0-dim tensor. Use tensor.item() in Python or tensor.item<T>() in C++ to convert a 0-dim tensor to a number". Could you tell me what I should do to solve it?