CBNet: A Composite Backbone Network Architecture for Object Detection

By Tingting Liang*, Xiaojie Chu*, Yudong Liu*, Yongtao Wang, Zhi Tang, Wei Chu, Jingdong Chen, Haibin Ling.

This repo is the official implementation of CBNetV2. It is based on mmdetection and Swin Transformer for Object Detection.

Contact us with [email protected], [email protected], [email protected].

Introduction

CBNetV2 achieves strong single-model performance on COCO object detection (60.1 box AP and 52.3 mask AP on test-dev) without extra training data.

Partial Results and Models

More results and models can be found in the model zoo.

Faster R-CNN

| Backbone | Lr Schd | box mAP (minival) | #params | FLOPs | config | log | model |
|---|---|---|---|---|---|---|---|
| DB-ResNet50 | 1x | 40.8 | 69M | 284G | config | github | github |

Mask R-CNN

| Backbone | Lr Schd | box mAP (minival) | mask mAP (minival) | #params | FLOPs | config | log | model |
|---|---|---|---|---|---|---|---|---|
| DB-Swin-T | 3x | 50.2 | 44.5 | 76M | 357G | config | github | github |

Cascade Mask R-CNN (1600x1400)

| Backbone | Lr Schd | box mAP (minival/test-dev) | mask mAP (minival/test-dev) | #params | FLOPs | config | model |
|---|---|---|---|---|---|---|---|
| DB-Swin-S | 3x | 56.3/56.9 | 48.6/49.1 | 156M | 1016G | config | github |

Improved HTC (1600x1400)

We use ImageNet-22k pretrained checkpoints of Swin-B and Swin-L. Compared to the regular HTC, our HTC uses a 4conv1fc bbox head.

| Backbone | Lr Schd | box mAP (minival/test-dev) | mask mAP (minival/test-dev) | #params | FLOPs | config | model |
|---|---|---|---|---|---|---|---|
| DB-Swin-B | 20e | 58.4/58.7 | 50.7/51.1 | 235M | 1348G | config | github |
| DB-Swin-L | 1x | 59.1/59.4 | 51.0/51.6 | 453M | 2162G | config (test only) | github |
| DB-Swin-L (TTA) | 1x | 59.6/60.1 | 51.8/52.3 | 453M | - | config (test only) | github |

TTA denotes test time augmentation.
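
The Lr Schd column in the tables above uses short tags such as 1x, 3x and 20e. Assuming the usual MMDetection naming convention (an assumption, since this README does not spell it out), "Nx" means N times 12 epochs and "Ne" means N epochs. The helper below is a hypothetical illustration of that mapping and is not part of this repo:

```python
import re

# Assumption: MMDetection convention, where a "1x" schedule is 12 epochs.
EPOCHS_PER_X = 12


def schedule_epochs(tag: str) -> int:
    """Convert a schedule tag such as '1x', '3x' or '20e' to a number of epochs."""
    match = re.fullmatch(r"(\d+)([xe])", tag)
    if match is None:
        raise ValueError(f"unrecognized schedule tag: {tag!r}")
    count, unit = int(match.group(1)), match.group(2)
    # 'x' tags are multiples of the 12-epoch base; 'e' tags state epochs directly.
    return count * EPOCHS_PER_X if unit == "x" else count


for tag in ("1x", "3x", "20e"):
    print(tag, schedule_epochs(tag))
```

The authoritative values are the epoch settings in each linked config file.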

Usage

Installation

Please refer to get_started.md for installation and dataset preparation.

Inference

# single-gpu testing (w/o segm result)
python tools/test.py <CONFIG_FILE> <DET_CHECKPOINT_FILE> --eval bbox 

# multi-gpu testing (w/ segm result)
tools/dist_test.sh <CONFIG_FILE> <DET_CHECKPOINT_FILE> <GPU_NUM> --eval bbox segm
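
The two commands above differ along two independent axes: the launcher (tools/test.py for one GPU, tools/dist_test.sh for several) and the metrics passed to --eval. As a minimal sketch, the hypothetical helper below assembles either command from those two choices. It uses only the scripts and flags shown above; the function name and the checkpoint path are illustrative assumptions.

```python
import shlex
from typing import List, Sequence


def build_test_command(config: str, checkpoint: str, gpus: int = 1,
                       metrics: Sequence[str] = ("bbox",)) -> List[str]:
    """Return the argument list for single-GPU or multi-GPU testing."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    if not metrics:
        raise ValueError("at least one metric is required for --eval")
    if gpus == 1:
        # Single-GPU testing calls the Python script directly.
        cmd = ["python", "tools/test.py", config, checkpoint]
    else:
        # Multi-GPU testing goes through the distributed launcher script.
        cmd = ["tools/dist_test.sh", config, checkpoint, str(gpus)]
    return cmd + ["--eval", *metrics]


# The checkpoint path is a placeholder; substitute a downloaded model file.
config = "configs/cbnet/faster_rcnn_cbv2d1_r50_fpn_1x_coco.py"
checkpoint = "checkpoints/model.pth"

print(shlex.join(build_test_command(config, checkpoint)))
print(shlex.join(build_test_command(config, checkpoint, gpus=8,
                                    metrics=("bbox", "segm"))))
```

The returned list can be passed to subprocess.run from the repository root. Requesting segm only makes sense for models that predict masks, such as Mask R-CNN.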

Training

To train a detector with pre-trained models, run:

# multi-gpu training
tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> 

For example, to train a Faster R-CNN model with a Dual-ResNet50 backbone on 8 GPUs, run:

# the path to the pre-trained model (ResNet50) is already set in the config
tools/dist_train.sh configs/cbnet/faster_rcnn_cbv2d1_r50_fpn_1x_coco.py 8 

As another example, to train a Mask R-CNN model with a Dual-Swin-T backbone on 8 GPUs, run:

tools/dist_train.sh configs/cbnet/mask_rcnn_cbv2_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py 8 --cfg-options model.pretrained=<PRETRAIN_MODEL> 
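
The --cfg-options flag in the command above overrides one config entry by its dotted key, so model.pretrained refers to the pretrained field inside the model section. The sketch below illustrates that idea with plain dictionaries. It is not the actual MMCV implementation, and the helper names, the trimmed config and the checkpoint path are all hypothetical.

```python
from typing import Any, Dict, Iterable


def parse_cfg_options(pairs: Iterable[str]) -> Dict[str, str]:
    """Turn ['a.b=1', 'c=d'] into {'a.b': '1', 'c': 'd'} (values stay strings)."""
    return dict(pair.split("=", 1) for pair in pairs)


def apply_override(cfg: Dict[str, Any], dotted_key: str, value: Any) -> None:
    """Set cfg['a']['b'] = value for the dotted key 'a.b'."""
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for name in parents:
        # Walk down one level, creating an empty section if it is missing.
        # Assumes every intermediate entry is itself a dict.
        node = node.setdefault(name, {})
    node[leaf] = value


# Hypothetical, heavily trimmed config; real configs hold many more keys.
cfg = {"model": {"type": "MaskRCNN", "pretrained": None}}

# Placeholder path standing in for <PRETRAIN_MODEL>.
overrides = parse_cfg_options(["model.pretrained=checkpoints/pretrain.pth"])
for key, value in overrides.items():
    apply_override(cfg, key, value)

print(cfg["model"]["pretrained"])
```

Only the named entry changes; every other field keeps the value from the config file.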

Apex (optional):

Following Swin Transformer for Object Detection, we use apex for mixed precision training by default. To install apex, run:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Documents and Tutorials

We list some documents and tutorials from MMDetection, which may be helpful to you.

Citation

If you use our code or models, please consider citing our paper CBNet: A Composite Backbone Network Architecture for Object Detection.

@ARTICLE{9932281,
  author={Liang, Tingting and Chu, Xiaojie and Liu, Yudong and Wang, Yongtao and Tang, Zhi and Chu, Wei and Chen, Jingdong and Ling, Haibin},
  journal={IEEE Transactions on Image Processing}, 
  title={CBNet: A Composite Backbone Network Architecture for Object Detection}, 
  year={2022},
  volume={31},
  pages={6893-6906},
  doi={10.1109/TIP.2022.3216771}}

License

This project is free for academic research purposes only; commercial use requires authorization. For commercial licensing, please contact [email protected].

Other Links

Original CBNet: See CBNet: A Novel Composite Backbone Network Architecture for Object Detection.
