GRP-DSOD

We have released the GRP-DSOD code at https://github.com/szq0214/DSOD. Check out the pycaffe code there if you would like to reproduce the exact results reported in the paper.

In this repository, we are planning to release a PyTorch version of DSOD and GRP-DSOD. Stay tuned!

We have also seen some very promising results on the PASCAL VOC Comp3 leaderboard, such as https://github.com/kuangliu/torchcv. Unfortunately, those models were initialized from ImageNet pre-trained parameters (kuangliu/torchcv#11). Please note that the Comp3 challenge only allows training on the VOC12 dataset, without any pre-trained models. Please check your training process carefully.

If you find this work helpful for your research, please cite:
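As a rough aid for the check suggested above, the following sketch flags a training setup that uses extra data or pre-trained weights. The function name, the dataset labels, and the weight-file argument are all hypothetical and are not part of the GRP-DSOD code or of any VOC tooling.

```python
def comp3_violations(train_datasets, pretrained_weights=None):
    """Return reasons why a (hypothetical) training setup breaks the Comp3 rule.

    train_datasets:     iterable of dataset labels, e.g. ["VOC12"].
    pretrained_weights: path of the weights used for initialization, or None
                        when training from scratch.
    """
    allowed = {"VOC12"}  # assumption: "VOC12" is the label used for VOC 2012
    problems = []

    # Any dataset other than VOC12 counts as extra training data.
    extra = sorted(set(train_datasets) - allowed)
    if extra:
        problems.append("extra training data: " + ", ".join(extra))

    # Initializing from a pre-trained model is not allowed either.
    if pretrained_weights:
        problems.append("initialized from pre-trained weights: " + pretrained_weights)

    return problems


if __name__ == "__main__":
    print(comp3_violations(["VOC12"]))
    print(comp3_violations(["VOC07", "VOC12"], "imagenet.caffemodel"))
```

An empty list means the setup is consistent with the rule as described here. It does not replace reading the official challenge guidelines.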

@article{shen2017learning,
     title={Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids},
     author={Shen, Zhiqiang and Shi, Honghui and Feris, Rogerio and Cao, Liangliang and Yan, Shuicheng and Liu, Ding and Wang, Xinchao and Xue, Xiangyang and Huang, Thomas S},
     journal={arXiv preprint arXiv:1712.00886},
     year={2017}
}

Introduction

In GRP-DSOD, we propose a recurrent feature-pyramid structure that squeezes rich spatial and semantic features into a single prediction layer, which further reduces the number of parameters to learn (DSOD needs to learn 1/2, whereas GRP-DSOD needs to learn only 1/3). Our new model is therefore better suited to learning from scratch and converges faster than DSOD. We also introduce a novel gate-controlled prediction strategy in GRP-DSOD that adaptively enhances or attenuates feature activations at different scales based on the input object size.
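To make the idea of gating feature activations concrete, here is a minimal sketch of a generic squeeze-and-excitation-style channel gate in plain Python. It is not the gate implemented in GRP-DSOD (see the paper and the pycaffe code for that). The function names, the single linear layer, and the `weights` and `biases` parameters are assumptions made for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate_feature_map(features, weights, biases):
    """Scale each channel of a feature map by a gate value in (0, 1).

    features: list of C channels, each an H x W list of lists of floats.
    weights:  C x C matrix (hypothetical learned parameters) mapping the
              pooled channel descriptor to per-channel gate logits.
    biases:   length-C list of gate biases (also hypothetical).
    """
    # Squeeze: global average pooling gives one number per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    # Excite: one linear layer plus a sigmoid yields a gate per channel.
    gates = [
        sigmoid(sum(w * p for w, p in zip(w_row, pooled)) + b)
        for w_row, b in zip(weights, biases)
    ]
    # Gate: a value near 1 keeps a channel, a value near 0 suppresses it.
    gated = [
        [[value * g for value in row] for row in channel]
        for channel, g in zip(features, gates)
    ]
    return gated, gates


if __name__ == "__main__":
    fmap = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
    w = [[1.0, 0.0], [0.0, -1.0]]
    b = [0.0, 0.0]
    gated, gates = gate_feature_map(fmap, w, b)
    print(gates)
```

Because the gates are computed from the input itself, the same prediction layer can weight its channels differently for different images, which is the general mechanism behind adaptive enhancement or attenuation.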

Figure 1: Illustration of the motivation of GRP-DSOD.
Figure 2: An overview of GRP-DSOD together with three one-stage detector methods.

Visualization

  1. Visualizations of network structures (tools from ethereon, please ignore the warning messages):

Results & Models

Our PASCAL VOC LMDB files:

| Method | LMDBs |
|---|---|
| Train on VOC07+12 and test on VOC07 | Download |
| Train on VOC07++12 and test on VOC12 (Comp4) | Download |
| Train on VOC12 and test on VOC12 (Comp3) | Download |

The tables below show the results on PASCAL VOC 2007, 2012 and 2012 Comp3 (training on VOC 2012 only).

PASCAL VOC test results:

| Method | VOC 2007 test mAP | # params | Models |
|---|---|---|---|
| GRP-DSOD300 (07+12) | 78.5 | 14.1M | Download (56.5M) |
| GRP-DSOD320 (07+12) | 78.7 | 14.2M | Download (56.8M) |
| GRP-DSOD320* (07+12) | 79.0 | 16.0M | Download (63.9M) |

| Method | VOC 2012 test mAP | # params | Models |
|---|---|---|---|
| GRP-DSOD320* (12) | 72.5 (VOC Comp3) | 16.0M | Download (63.9M) |
| GRP-DSOD320 (07++12) | 77.0 | 14.2M | Download (56.8M) |
| GRP-DSOD320* (07++12) | -- | -- | Running |
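The download sizes listed above are roughly four times the parameter counts. The short calculation below sketches that relation, assuming the weights are stored as 32-bit floats and that "M" in the download column means megabytes (10^6 bytes). Neither assumption is stated in this README, and the function name is hypothetical.

```python
BYTES_PER_FLOAT32 = 4  # assumption: weights are stored as 32-bit floats


def estimated_model_size_mb(num_params_millions, bytes_per_param=BYTES_PER_FLOAT32):
    """Estimate the weight-file size in MB from a parameter count in millions."""
    # (millions of parameters) x (bytes per parameter) = millions of bytes = MB
    return num_params_millions * bytes_per_param


if __name__ == "__main__":
    # (model, # params in millions, listed download size in MB) from the tables above
    rows = [
        ("GRP-DSOD300", 14.1, 56.5),
        ("GRP-DSOD320", 14.2, 56.8),
        ("GRP-DSOD320*", 16.0, 63.9),
    ]
    for name, params_m, listed_mb in rows:
        estimate = estimated_model_size_mb(params_m)
        print(name, "estimated:", round(estimate, 1), "MB, listed:", listed_mb, "MB")
```

Small differences between the estimate and the listed size are expected, since the parameter counts are rounded and a model file also carries some metadata.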

Contact

Zhiqiang Shen (zhiqiangshen0214 at gmail.com)

Any comments or suggestions are welcome!
