chenjoya / sampling-free

IEEE TIP: Is heuristic sampling necessary in training deep object detectors? Try sampling-free object detectors!

Python 71.39% C++ 3.45% Cuda 22.91% Shell 0.17% C 2.07%

sampling-free's Introduction

Sampling-Free for Object Detection

Developed and maintained by @ChenJoya. Please feel free to contact me: [email protected]

Introduction

To address the foreground-background imbalance, is heuristic sampling necessary in training deep object detectors?

Keep calm and try the sampling-free mechanism in this repository.

The sampling-free mechanism enables various object detectors (e.g., one-stage, two-stage, anchor-free, multi-stage) to drop sampling heuristics (e.g., undersampling, Focal Loss, objectness) and still achieve better bounding-box or instance segmentation accuracy.

Technical report: https://arxiv.org/abs/1909.04868. This repository is based on maskrcnn-benchmark and includes implementations of RetinaNet, FCOS, Faster R-CNN, and Mask R-CNN. Other detectors will also be released.

Installation

Check INSTALL.md for installation instructions.

Training

See scripts/train.sh to train with the sampling-free mechanism.

Evaluation

See scripts/eval.sh to evaluate your trained model.

COCO dataset

Model	Config	Box AP (minival)	Mask AP (minival)
RetinaNet	retinanet_R_50_FPN_1x	36.4	--
RetinaNet - Focal Loss + Sampling-Free	retinanet_R_50_FPN_1x	36.8	--
FCOS	fcos_R_50_FPN_1x	37.1	--
FCOS - Focal Loss + Sampling-Free	fcos_R_50_FPN_1x	37.6	--
Faster R-CNN	faster_rcnn_R_50_FPN_1x	36.8	--
Faster R-CNN - Biased Sampling + Sampling-Free	faster_rcnn_R_50_FPN_1x	38.4	--
Mask R-CNN	mask_rcnn_R_50_FPN_1x	37.8	34.2
Mask R-CNN - Biased Sampling + Sampling-Free	mask_rcnn_R_50_FPN_1x	39.0	34.9
PAA	paa_R_50_FPN_1x	40.4	--
PAA - Focal Loss + Sampling-Free	paa_R_50_FPN_1x	41.0	--

PASCAL VOC dataset (07+12 for training)

Model	Config	mAP (07test)
RetinaNet	retinanet_voc_R_50_FPN_0.2x	79.3
RetinaNet - Focal Loss + Sampling-Free	retinanet_voc_R_50_FPN_0.2x	80.1
Faster R-CNN	faster_rcnn_voc_R_50_FPN_0.2x	80.9
Faster R-CNN - Biased Sampling + Sampling-Free	faster_rcnn_voc_R_50_FPN_0.2x	81.5

Other Details

See the original benchmark maskrcnn-benchmark for more details.

Citations

Please consider citing this project in your publications if it helps your research. The following is a BibTeX reference. The BibTeX entry requires the url LaTeX package.

@article{sampling_free,
author    = {Joya Chen and
             Dong Liu and
             Tong Xu and
             Shiwei Wu and
             Yifei Cheng and
             Enhong Chen},
title     = {Is Heuristic Sampling Necessary in Training Deep Object Detectors?},
journal   = {IEEE Transactions on Image Processing},
year      = {2021},
volume    = {},
number    = {},
pages     = {1-1},
}

License

sampling-free is released under the MIT license. See LICENSE for additional details.


sampling-free's Issues

Clarification Request

Thank you for the Sampling-free code.

I have started exploring the code and am trying to run Mask R-CNN on the COCO dataset.

I have a few questions. Could you please clarify them?

What does the configuration below, found in ./scripts/train.py, mean? What is the number 16000?
fpn_pos_nms_top_n_train=$[16000/$gpun]
If I run on 2 GPUs, two process IDs are created instead of one. Why does this happen?
How do I run TensorBoard to visualize the training progress as graphs?
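A note on the first question: `$[16000/$gpun]` is the older Bash arithmetic-expansion form, equivalent to `$((16000/gpun))`, so the line divides a fixed budget of 16000 by the GPU count using integer division. The sketch below only mirrors that arithmetic in Python. The function name `per_gpu_top_n` is hypothetical, and what the resulting value controls inside the training script is not shown in this thread.

```python
def per_gpu_top_n(gpun, total=16000):
    """Mirror the shell expression $[16000/$gpun] (integer division)."""
    if gpun <= 0:
        raise ValueError("gpun must be a positive GPU count")
    return total // gpun


# Print the per-GPU share for a few common GPU counts.
for gpun in (1, 2, 4, 8):
    print(gpun, per_gpu_top_n(gpun))
```

With 2 GPUs the expression yields 8000, and with 8 GPUs it yields 2000, so the total across all GPUs stays at roughly 16000.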

Thank you,

When will the YOLOv3-based implementation be released?

Where is the code for sampling-free?

❓ Questions and Help

Could you point out where the code for your sampling-free function is?
I am not familiar with maskrcnn_benchmark and just want to use the sampling-free function in other networks.

Thank you very much.

yolov3 not in model zoo?

❓ Questions and Help

Why remove fcos and add foveabox?

❓ Questions and Help

single GPU

Thank you for the Sampling-free code. Can you train on a single GPU?

Confusion about the custom-implemented CE

Hi,

The work is really interesting and gives me new ideas for my model compression work. However, I am somewhat confused about the cross-entropy implemented in the repo.

As far as I understand, the original RetinaNet / FCOS use Focal Loss for the cls branch, which outputs multi-label classification scores. The categories are not mutually exclusive (the probabilities do not sum to 1). The custom CE is still a multi-label classification loss.

Might I ask:

Is it possible to use nn.CrossEntropyLoss for the classification branch instead, i.e., multi-category rather than multi-label? I have seen it argued that objects of different categories may overlap at the same position, so a multi-label loss is more suitable than a multi-category one. But I observe that most objects are not so crowded (and even when they are, the overlapping objects generally belong to the same category).
Is setting gamma to zero in Focal Loss equivalent to the custom CE module? If so, it might not be necessary to re-implement the CE module.
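The second question can be checked numerically. The sketch below uses the standard sigmoid focal loss definition, FL = -alpha_t (1 - p_t)^gamma log(p_t), and compares it with plain binary cross-entropy. It is not the repository's CE module, and the function names are illustrative. Under this definition, gamma = 0 without alpha weighting reproduces binary cross-entropy exactly, whereas gamma = 0 with the common alpha = 0.25 gives an alpha-weighted cross-entropy instead.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def binary_cross_entropy(logit, target):
    """Per-class sigmoid cross-entropy for a single logit and a 0/1 target."""
    p = sigmoid(logit)
    p_t = p if target == 1 else 1.0 - p
    return -math.log(p_t)


def sigmoid_focal_loss(logit, target, gamma=2.0, alpha=None):
    """Standard focal loss; alpha=None disables the class-balancing weight."""
    p = sigmoid(logit)
    p_t = p if target == 1 else 1.0 - p
    loss = -((1.0 - p_t) ** gamma) * math.log(p_t)
    if alpha is not None:
        alpha_t = alpha if target == 1 else 1.0 - alpha
        loss *= alpha_t
    return loss


# Compare the three variants on a few (logit, target) pairs.
for logit, target in [(-2.0, 1), (0.5, 1), (-2.0, 0), (3.0, 0)]:
    ce = binary_cross_entropy(logit, target)
    fl0 = sigmoid_focal_loss(logit, target, gamma=0.0)
    fl0_alpha = sigmoid_focal_loss(logit, target, gamma=0.0, alpha=0.25)
    print(f"logit={logit:+.1f} target={target} "
          f"ce={ce:.4f} fl(gamma=0)={fl0:.4f} fl(gamma=0, alpha=0.25)={fl0_alpha:.4f}")
```

Whether the repository's CE adds anything beyond this (for example, its own loss weighting) has to be read from the source; the sketch only covers the textbook definitions.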

The result of Pascal VOC seems bad.

❓ Questions and Help

Hi Joya,
I am currently training retinanet voc 0.2x following the method in the README. Since the PASCAL VOC data is in COCO format, the default evaluation is COCO-style. The results are:
Accumulating evaluation results...
DONE (t=6.31s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.457
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.722
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.482
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.125
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.321
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.540
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.398
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.582
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.600
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.255
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.510
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.671
2020-04-01 17:57:02,155 maskrcnn_benchmark.inference INFO:
Task: bbox
AP, AP50, AP75, APs, APm, APl
0.4572, 0.7220, 0.4823, 0.1246, 0.3206, 0.5403
Since the VOC result corresponds to AP50 in COCO-style evaluation, it is only 0.722 here. How can I reproduce the result you report? In other words, is it possible to use the PASCAL VOC evaluation tool in this project or in maskrcnn-benchmark? That might give the 79+ result. I'm not sure, but how do I get the result in your README?
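As background for the question about evaluation tools, the VOC 2007 protocol scores a class with 11-point interpolated AP, while COCO-style AP50 interpolates precision at 101 recall points, which is close to the area under the precision envelope. The sketch below implements the 11-point rule and the area-based rule on a toy precision/recall curve. The function names and the toy curve are illustrative assumptions. It shows only that the metric definitions differ; it does not establish the cause of the gap reported above.

```python
def voc07_11point_ap(recalls, precisions):
    """11-point interpolated AP: mean of max precision at recall >= 0.0, 0.1, ..., 1.0."""
    ap = 0.0
    for i in range(11):
        threshold = i / 10.0
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        ap += (max(candidates) if candidates else 0.0) / 11.0
    return ap


def area_ap(recalls, precisions):
    """Area under the monotonically decreasing precision envelope."""
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Make precision non-increasing from right to left.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    return sum((r[i] - r[i - 1]) * p[i] for i in range(1, len(r)))


# Toy curve (recalls sorted ascending): precision 1.0 up to recall 0.5, then 0.5.
recalls = [0.5, 1.0]
precisions = [1.0, 0.5]
print("11-point AP:", round(voc07_11point_ap(recalls, precisions), 4))
print("area AP:    ", round(area_ap(recalls, precisions), 4))
```

On this toy curve the two rules give different scores for identical detections, so numbers produced by the two protocols are not directly comparable.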
Thanks!

For two-stage detectors, is bias init only applied in the RPN?

❓ Questions and Help

For two-stage detectors, is bias init applied only in the RPN, and not needed in the bbox head?
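For context on the term, the prior-probability bias initialization introduced with RetinaNet sets the final classification bias to b = -log((1 - pi) / pi), so that every location initially predicts foreground with probability pi. A minimal sketch of that formula follows. The value pi = 0.01 is the RetinaNet default and the function name is hypothetical; neither is confirmed here as what this repository uses in its RPN or box head.

```python
import math


def prior_prob_bias(pi):
    """Bias b such that sigmoid(b) == pi, i.e. b = -log((1 - pi) / pi)."""
    if not 0.0 < pi < 1.0:
        raise ValueError("pi must lie strictly between 0 and 1")
    return -math.log((1.0 - pi) / pi)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


bias = prior_prob_bias(0.01)
# The initial foreground probability recovered from the bias equals the prior.
print("bias:", round(bias, 4), "initial foreground probability:", round(sigmoid(bias), 4))
```

A small prior keeps the initial loss from being dominated by the many background locations, which is why the choice of where to apply it (RPN only, or also the box head) matters for a detector trained without sampling.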
