
sampling-free's Introduction

Sampling-Free for Object Detection

Developed and maintained by @ChenJoya. Please feel free to contact me: [email protected]

Introduction

To address the foreground-background imbalance, is heuristic sampling necessary in training deep object detectors?

Keep calm and try the sampling-free mechanism in this repository.

The sampling-free mechanism enables various object detectors (e.g., one-stage, two-stage, anchor-free, multi-stage) to drop sampling heuristics (e.g., undersampling, Focal Loss, objectness) while achieving better bounding-box or instance segmentation accuracy.

Technical report: https://arxiv.org/abs/1909.04868. This repository is based on maskrcnn-benchmark and includes implementations of RetinaNet, FCOS, Faster R-CNN, and Mask R-CNN. Other detectors will also be released.

Installation

Check INSTALL.md for installation instructions.

Training

See scripts/train.sh to train with the sampling-free mechanism.

Evaluation

See scripts/eval.sh to evaluate your trained model.

COCO dataset

| Model | Config | Box AP (minival) | Mask AP (minival) |
|---|---|---|---|
| RetinaNet | retinanet_R_50_FPN_1x | 36.4 | -- |
| RetinaNet - Focal Loss + Sampling-Free | retinanet_R_50_FPN_1x | 36.8 | -- |
| FCOS | fcos_R_50_FPN_1x | 37.1 | -- |
| FCOS - Focal Loss + Sampling-Free | fcos_R_50_FPN_1x | 37.6 | -- |
| Faster R-CNN | faster_rcnn_R_50_FPN_1x | 36.8 | -- |
| Faster R-CNN - Biased Sampling + Sampling-Free | faster_rcnn_R_50_FPN_1x | 38.4 | -- |
| Mask R-CNN | mask_rcnn_R_50_FPN_1x | 37.8 | 34.2 |
| Mask R-CNN - Biased Sampling + Sampling-Free | mask_rcnn_R_50_FPN_1x | 39.0 | 34.9 |
| PAA | paa_R_50_FPN_1x | 40.4 | -- |
| PAA - Focal Loss + Sampling-Free | paa_R_50_FPN_1x | 41.0 | -- |

PASCAL VOC dataset (07+12 for training)

| Model | Config | mAP (07test) |
|---|---|---|
| RetinaNet | retinanet_voc_R_50_FPN_0.2x | 79.3 |
| RetinaNet - Focal Loss + Sampling-Free | retinanet_voc_R_50_FPN_0.2x | 80.1 |
| Faster R-CNN | faster_rcnn_voc_R_50_FPN_0.2x | 80.9 |
| Faster R-CNN - Biased Sampling + Sampling-Free | faster_rcnn_voc_R_50_FPN_0.2x | 81.5 |

Other Details

See the original benchmark maskrcnn-benchmark for more details.

Citations

Please consider citing this project in your publications if it helps your research. The following is a BibTeX reference. The BibTeX entry requires the url LaTeX package.

@article{sampling_free,
author    = {Joya Chen and
             Dong Liu and
             Tong Xu and
             Shiwei Wu and
             Yifei Cheng and
             Enhong Chen},
title     = {Is Heuristic Sampling Necessary in Training Deep Object Detectors?},
journal   = {IEEE Transactions on Image Processing},
year      = {2021},
volume    = {},
number    = {},
pages     = {1-1},
}

License

sampling-free is released under the MIT license. See LICENSE for additional details.

sampling-free's Issues

Clarification Request

Thank you for the Sampling-free code.

I have started exploring the code and am trying to run Mask R-CNN on the COCO dataset.

I have a few questions. Could you please clarify them?

  1. What does the configuration below, found in ./scripts/train.py, mean? What is the number 16000?
    fpn_pos_nms_top_n_train=$[16000/$gpun]

  2. If I run on 2 GPUs, two process IDs are created instead of one. Why does this happen?

  3. How do I run TensorBoard to visualize the learning curves?

Thank you,
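On question 1, the `$[ ... ]` form is the older Bash syntax for integer arithmetic (equivalent to `$(( ... ))`), so the line divides 16000 by the GPU count and truncates the result. A minimal Python sketch of that arithmetic follows. Reading 16000 as a total budget shared across GPUs is an assumption, not something stated in this thread, and `per_gpu_top_n` is a hypothetical helper name used only for illustration.

```python
def per_gpu_top_n(total: int, gpun: int) -> int:
    """Mirror the shell expression $[total/$gpun]: integer division, truncated."""
    if gpun <= 0:
        raise ValueError("gpun must be a positive GPU count")
    return total // gpun


if __name__ == "__main__":
    # Show how the per-GPU value changes with the number of GPUs.
    for gpun in (1, 2, 4, 8):
        print(f"gpun={gpun}: fpn_pos_nms_top_n_train={per_gpu_top_n(16000, gpun)}")
```

Under this reading, the value passed to each process shrinks as more GPUs are used, while the product of the two stays at 16000 whenever the GPU count divides it evenly.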

Where is the code for sampling-free?

❓ Questions and Help

Could you point out where the code for your sampling-free function is?
I am not familiar with maskrcnn_benchmark and just want to use the sampling-free function in other networks.

Thank you very much.

single GPU

Thank you for the Sampling-free code. Can it be trained on a single GPU?

Confusion about the custom-implemented CE

Hi,

The work is really interesting and gives me new ideas for my model compression work. However, I am somewhat confused about the cross entropy developed in the repo.

As far as I understand, the original RetinaNet / FCOS use Focal Loss for the cls branch, which outputs multi-label classification scores. The categories are not mutually exclusive (the probabilities do not sum to 1). The CE developed here is still a multi-label classification loss function.

Might I ask

  1. Is it possible to use nn.CrossEntropyLoss for the classification branch instead, i.e., multi-category rather than multi-label? I saw someone mention that objects of different categories might overlap at the same position, so a multi-label loss is more suitable than a multi-category one. But I observed that most objects are not so crowded (and even when they are, the overlapping objects generally belong to the same category).

  2. Is setting gamma to zero in Focal Loss equivalent to the developed CE module? If so, it might not be necessary to re-implement the CE module.
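For reference on question 2, here is a minimal sketch of the standard sigmoid focal loss as defined in the RetinaNet paper, written in plain Python for a single logit. It only illustrates what gamma = 0 does to that textbook formula: the modulating factor disappears, but the alpha weighting remains, so the result is an alpha-weighted binary cross entropy rather than plain BCE. It makes no claim about what the CE module in this repository does beyond that; the function names are illustrative.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def bce(logit: float, target: int) -> float:
    """Binary cross entropy for one logit and a 0/1 target."""
    p = sigmoid(logit)
    p_t = p if target == 1 else 1.0 - p
    return -math.log(p_t)


def sigmoid_focal_loss(logit: float, target: int,
                       gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Textbook focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p = sigmoid(logit)
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    for logit, target in [(1.5, 1), (1.5, 0), (-2.0, 0)]:
        fl0 = sigmoid_focal_loss(logit, target, gamma=0.0, alpha=0.25)
        print(f"logit={logit}, target={target}: FL(gamma=0)={fl0:.4f}, BCE={bce(logit, target):.4f}")
```

With gamma = 0 and alpha = 0.5, the focal loss is exactly half the BCE for every sample, so the two differ only by a constant scale. With the common default alpha = 0.25, positives and negatives are weighted differently (0.25 versus 0.75), which is not the same objective as unweighted BCE.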

The result of Pascal VOC seems bad.

❓ Questions and Help

Hi Joya,
I am currently training retinanet voc 0.2x following the method in the README. Since the Pascal VOC data is in COCO format, the default evaluation method is the COCO one. The results are:
    Accumulating evaluation results...
    DONE (t=6.31s).
    Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.457
    Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.722
    Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.482
    Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.125
    Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.321
    Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.540
    Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.398
    Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.582
    Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.600
    Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.255
    Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.510
    Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.671
    2020-04-01 17:57:02,155 maskrcnn_benchmark.inference INFO: Task: bbox
    AP, AP50, AP75, APs, APm, APl
    0.4572, 0.7220, 0.4823, 0.1246, 0.3206, 0.5403
Since the VOC result corresponds to AP50 in the COCO evaluation style, it is only 0.722 here. How can I reproduce the result you report? In other words, is it possible to use the Pascal VOC evaluation tool in this project or in maskrcnn-benchmark? That might give the 79+ result, but I'm not sure. How did you obtain the result in your README?
Thanks!
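As background for the question above, the PASCAL VOC 2007 protocol and COCO-style AP50 both match detections at IoU 0.5 but summarize the precision-recall curve differently. VOC07 averages interpolated precision at 11 recall points, while later protocols integrate over the curve more finely. The sketch below contrasts the 11-point metric with an all-point area metric on a toy curve. It is an illustration under stated assumptions, not the evaluator used by this repository, and it does not by itself account for the gap reported in this issue (other protocol differences, such as how "difficult" objects are handled, also matter). Function names and the toy data are hypothetical.

```python
def voc07_ap(recall, precision):
    """11-point interpolated AP (PASCAL VOC 2007 protocol)."""
    ap = 0.0
    for i in range(11):
        t = i / 10.0
        # Interpolated precision: best precision at any recall >= t.
        candidates = [p for r, p in zip(recall, precision) if r >= t]
        ap += (max(candidates) if candidates else 0.0) / 11.0
    return ap


def area_ap(recall, precision):
    """All-point AP: area under the monotonically decreasing precision envelope."""
    mrec = [0.0] + list(recall) + [1.0]
    mpre = [0.0] + list(precision) + [0.0]
    # Make precision non-increasing from right to left.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    return sum((mrec[i + 1] - mrec[i]) * mpre[i + 1] for i in range(len(mrec) - 1))


if __name__ == "__main__":
    # Toy precision-recall curve with recall sorted in increasing order.
    recall = [0.5, 1.0]
    precision = [1.0, 0.5]
    print("VOC07 11-point AP:", voc07_ap(recall, precision))
    print("All-point area AP:", area_ap(recall, precision))
```

Even on the same detections and the same IoU threshold, the two summaries give different numbers, so scores from different evaluators are not directly comparable.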
