
RepPoints: Point Set Representation for Object Detection

By Ze Yang, Shaohui Liu, and Han Hu.

We provide code and configuration files to reproduce the COCO object detection results reported in the paper "RepPoints: Point Set Representation for Object Detection". Our code is based on mmdetection, a clean open-source project for benchmarking object detection methods.

Introduction

RepPoints, initially described in arXiv, is a new representation method for visual objects, on which visual understanding tasks are typically centered. Visual object representation, aiming at both geometric description and appearance feature extraction, is conventionally achieved by bounding box + RoIPool (RoIAlign). The bounding box representation is convenient to use; however, it provides only a rectangular localization of objects that lacks geometric precision and may consequently degrade feature quality. Our new representation, RepPoints, models objects by a point set instead of a bounding box; the points learn to adaptively position themselves over an object in a manner that circumscribes the object's spatial extent and enables semantically aligned feature extraction. This richer and more flexible representation maintains the convenience of bounding boxes while facilitating various visual understanding applications. This repo demonstrates the effectiveness of RepPoints for COCO object detection.

Another feature of this repo is the demonstration of an anchor-free detector, which can be as effective as state-of-the-art anchor-based detection methods. The anchor-free detector can utilize either bounding box or RepPoints as the basic object representation.

Figure: Learning RepPoints in object detection.

Usage

a. Clone the repo:

git clone --recursive https://github.com/microsoft/RepPoints

b. Download the COCO detection dataset, copy the RepPoints src into mmdetection, and install mmdetection:

sh ./init.sh

c. Run experiments with a specific configuration file:

./mmdetection/tools/dist_train.sh ${path-to-cfg-file} ${num_gpu} --validate

We give one example here:

./mmdetection/tools/dist_train.sh ./configs/reppoints_moment_r101_fpn_2x_mt.py 8 --validate

Citing RepPoints

@inproceedings{yang2019reppoints,
  title={RepPoints: Point Set Representation for Object Detection},
  author={Yang, Ze and Liu, Shaohui and Hu, Han and Wang, Liwei and Lin, Stephen},
  booktitle={The IEEE International Conference on Computer Vision (ICCV)},
  month={Oct},
  year={2019}
}

Results and models

The results on COCO 2017val are shown in the table below.

| Method | Backbone | Anchor | Convert func | Lr schd | box AP | Download |
|:---|:---|:---|:---|:---|:---|:---|
| BBox | R-50-FPN | single | - | 1x | 36.3 | model |
| BBox | R-50-FPN | none | - | 1x | 37.3 | model |
| RepPoints | R-50-FPN | none | partial MinMax | 1x | 38.1 | model |
| RepPoints | R-50-FPN | none | MinMax | 1x | 38.2 | model |
| RepPoints | R-50-FPN | none | moment | 1x | 38.2 | model |
| RepPoints | R-50-FPN | none | moment | 2x | 38.6 | model |
| RepPoints | R-50-FPN | none | moment | 2x (ms train) | 40.8 | model |
| RepPoints | R-50-FPN | none | moment | 2x (ms train & ms test) | 42.2 | - |
| RepPoints | R-101-FPN | none | moment | 2x | 40.3 | model |
| RepPoints | R-101-FPN | none | moment | 2x (ms train) | 42.3 | model |
| RepPoints | R-101-FPN | none | moment | 2x (ms train & ms test) | 44.1 | - |
| RepPoints | R-101-FPN-DCN | none | moment | 2x | 43.0 | model |
| RepPoints | R-101-FPN-DCN | none | moment | 2x (ms train) | 44.8 | model |
| RepPoints | R-101-FPN-DCN | none | moment | 2x (ms train & ms test) | 46.4 | - |
| RepPoints | X-101-FPN-DCN | none | moment | 2x | 44.5 | model |
| RepPoints | X-101-FPN-DCN | none | moment | 2x (ms train) | 45.6 | model |
| RepPoints | X-101-FPN-DCN | none | moment | 2x (ms train & ms test) | 46.8 | - |

Notes:

  • R-xx and X-xx denote ResNet and ResNeXt architectures, respectively.
  • DCN denotes replacing the 3x3 conv with a 3x3 deformable convolution in the c3-c5 stages of the backbone.
  • none in the anchor column means a 2-d center point (x, y) is used to represent the initial object hypothesis; single denotes that one 4-d anchor box (x, y, w, h) with an IoU-based label assignment criterion is adopted.
  • moment, partial MinMax, and MinMax in the convert func column are three functions for converting a point set to a pseudo box (see the sketch after this list).
  • ms denotes multi-scale training or multi-scale testing.
  • The results here are slightly different from those reported in the paper due to a framework change: the original paper uses an MXNet implementation, while this repo re-implements the method in PyTorch based on mmdetection.
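
For readers curious about these conversion functions, here is a minimal sketch of the three variants (a reconstruction under assumptions, not the repo's exact code; pts is assumed to be a float tensor of shape (N, num_points, 2) in (x, y) order):

```python
import torch

def minmax(pts):
    # Tightest axis-aligned box containing all points of each set.
    x1y1 = pts.min(dim=1).values
    x2y2 = pts.max(dim=1).values
    return torch.cat([x1y1, x2y2], dim=1)           # (N, 4): x1, y1, x2, y2

def partial_minmax(pts, k=4):
    # Same as minmax, but computed over only the first k points.
    return minmax(pts[:, :k, :])

def moment(pts, moment_transfer=torch.zeros(2)):
    # Box from first/second moments: center = mean of the points,
    # half-extent = std of the points scaled by a learnable global
    # multiplier exp(moment_transfer) (one factor per axis).
    center = pts.mean(dim=1)
    half_wh = pts.std(dim=1) * moment_transfer.exp()
    return torch.cat([center - half_wh, center + half_wh], dim=1)
```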

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

reppoints's People

Contributors

ancientmooner, b1ueber2y, microsoft-github-policy-service[bot], microsoftopensource, msftgits, yangze0930


reppoints's Issues

Train the network with my own data

I made an annotation pickle file for my data following /mmdet/datasets/loader/custom.py, and I tried to train reppoints_moment_r101_fpn_2x on my own data (KITTI 2D objects).

I trained the network for 150 epochs, but the mAP is only 0.007~0.008.
It is better than at the start, but only slightly.

Please help me train the network.
Do I need to train it for much longer?

2019-12-11 18:05:39,671 - INFO - Epoch [147][900/998] lr: 0.00001, eta: 5 days, 16:09:03, time: 0.679, data_time: 0.010, memory: 7286, loss_cls: 0.1545, loss_pts_init: 0.0007, loss_pts_refine: 0.0011, loss: 0.1563
2019-12-11 18:08:40,224 - INFO - Epoch [147][998/998] lr: 0.00001, mAP: 0.0082
2019-12-11 18:09:55,945 - INFO - Epoch [148][100/998] lr: 0.00001, eta: 5 days, 15:53:37, time: 0.757, data_time: 0.093, memory: 7286, loss_cls: 0.1540, loss_pts_init: 0.0007, loss_pts_refine: 0.0011, loss: 0.1558
2019-12-11 18:11:03,040 - INFO - Epoch [148][200/998] lr: 0.00001, eta: 5 days, 15:53:53, time: 0.671, data_time: 0.010, memory: 7286, loss_cls: 0.1537, loss_pts_init: 0.0009, loss_pts_refine: 0.0011, loss: 0.1557
2019-12-11 18:12:10,203 - INFO - Epoch [148][300/998] lr: 0.00001, eta: 5 days, 15:54:09, time: 0.672, data_time: 0.009, memory: 7286, loss_cls: 0.1468, loss_pts_init: 0.0007, loss_pts_refine: 0.0011, loss: 0.1485
2019-12-11 18:13:17,742 - INFO - Epoch [148][400/998] lr: 0.00001, eta: 5 days, 15:54:32, time: 0.675, data_time: 0.009, memory: 7286, loss_cls: 0.1490, loss_pts_init: 0.0007, loss_pts_refine: 0.0011, loss: 0.1508
2019-12-11 18:14:25,741 - INFO - Epoch [148][500/998] lr: 0.00001, eta: 5 days, 15:55:01, time: 0.680, data_time: 0.010, memory: 7286, loss_cls: 0.1499, loss_pts_init: 0.0007, loss_pts_refine: 0.0011, loss: 0.1517
2019-12-11 18:15:33,414 - INFO - Epoch [148][600/998] lr: 0.00001, eta: 5 days, 15:55:25, time: 0.677, data_time: 0.010, memory: 7286, loss_cls: 0.1475, loss_pts_init: 0.0007, loss_pts_refine: 0.0011, loss: 0.1493
2019-12-11 18:16:40,978 - INFO - Epoch [148][700/998] lr: 0.00001, eta: 5 days, 15:55:46, time: 0.676, data_time: 0.010, memory: 7286, loss_cls: 0.1515, loss_pts_init: 0.0006, loss_pts_refine: 0.0011, loss: 0.1532
2019-12-11 18:17:48,520 - INFO - Epoch [148][800/998] lr: 0.00001, eta: 5 days, 15:56:07, time: 0.675, data_time: 0.010, memory: 7286, loss_cls: 0.1466, loss_pts_init: 0.0014, loss_pts_refine: 0.0011, loss: 0.1491
2019-12-11 18:18:56,288 - INFO - Epoch [148][900/998] lr: 0.00001, eta: 5 days, 15:56:31, time: 0.678, data_time: 0.010, memory: 7286, loss_cls: 0.1343, loss_pts_init: 0.0008, loss_pts_refine: 0.0011, loss: 0.1362
2019-12-11 18:21:56,786 - INFO - Epoch [148][998/998] lr: 0.00001, mAP: 0.0077
2019-12-11 18:23:12,311 - INFO - Epoch [149][100/998] lr: 0.00001, eta: 5 days, 15:41:21, time: 0.755, data_time: 0.091, memory: 7286, loss_cls: 0.1478, loss_pts_init: 0.0012, loss_pts_refine: 0.0011, loss: 0.1501
2019-12-11 18:24:20,508 - INFO - Epoch [149][200/998] lr: 0.00001, eta: 5 days, 15:41:53, time: 0.682, data_time: 0.010, memory: 7286, loss_cls: 0.1495, loss_pts_init: 0.0007, loss_pts_refine: 0.0011, loss: 0.1513
2019-12-11 18:25:28,653 - INFO - Epoch [149][300/998] lr: 0.00001, eta: 5 days, 15:42:24, time: 0.681, data_time: 0.010, memory: 7286, loss_cls: 0.1521, loss_pts_init: 0.0007, loss_pts_refine: 0.0011, loss: 0.1539
2019-12-11 18:26:36,299 - INFO - Epoch [149][400/998] lr: 0.00001, eta: 5 days, 15:42:46, time: 0.676, data_time: 0.010, memory: 7286, loss_cls: 0.1555, loss_pts_init: 0.0006, loss_pts_refine: 0.0011, loss: 0.1572
2019-12-11 18:27:43,982 - INFO - Epoch [149][500/998] lr: 0.00001, eta: 5 days, 15:43:09, time: 0.677, data_time: 0.009, memory: 7286, loss_cls: 0.1511, loss_pts_init: 0.0007, loss_pts_refine: 0.0011, loss: 0.1529
2019-12-11 18:28:51,608 - INFO - Epoch [149][600/998] lr: 0.00001, eta: 5 days, 15:43:30, time: 0.676, data_time: 0.010, memory: 7286, loss_cls: 0.1487, loss_pts_init: 0.0030, loss_pts_refine: 0.0011, loss: 0.1528
2019-12-11 18:29:59,406 - INFO - Epoch [149][700/998] lr: 0.00001, eta: 5 days, 15:43:54, time: 0.678, data_time: 0.010, memory: 7286, loss_cls: 0.1422, loss_pts_init: 0.0009, loss_pts_refine: 0.0011, loss: 0.1442
2019-12-11 18:31:07,035 - INFO - Epoch [149][800/998] lr: 0.00001, eta: 5 days, 15:44:14, time: 0.676, data_time: 0.010, memory: 7286, loss_cls: 0.1467, loss_pts_init: 0.0006, loss_pts_refine: 0.0011, loss: 0.1484
2019-12-11 18:32:14,481 - INFO - Epoch [149][900/998] lr: 0.00001, eta: 5 days, 15:44:32, time: 0.674, data_time: 0.009, memory: 7286, loss_cls: 0.1394, loss_pts_init: 0.0007, loss_pts_refine: 0.0011, loss: 0.1412
2019-12-11 18:35:13,168 - INFO - Epoch [149][998/998] lr: 0.00001, mAP: 0.0081

There's a problem when I test

The model and loaded state dict do not match exactly

these keys have mismatched shape:
+------------------------------------+-----------------------------+-----------------------------+
| key | expected shape | loaded shape |
+------------------------------------+-----------------------------+-----------------------------+
| bbox_head.reppoints_cls_out.weight | torch.Size([80, 256, 1, 1]) | torch.Size([28, 256, 1, 1]) |
| bbox_head.reppoints_cls_out.bias | torch.Size([80]) | torch.Size([28]) |
+------------------------------------+-----------------------------+-----------------------------+

Thank you very much for helping me!
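
A mismatch like this usually means the config used at test time declares a different number of classes than the checkpoint was trained with (80 COCO classes vs. 28 here). A hedged config fragment (values illustrative; in mmdetection v1.x, num_classes conventionally includes the background class and the sigmoid-based head emits num_classes - 1 channels):

```python
# Illustrative only: align num_classes with what the checkpoint was trained on.
bbox_head = dict(
    type='RepPointsHead',
    num_classes=29,  # 28 foreground classes + 1 background -> 28 cls channels
    # ... remaining head settings unchanged ...
)
```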

Question about the offsets of rep-points

Sorry for creating an issue similar to the closed one, #21.
In the following line, it seems pts_out_init - dcn_base_offset gives the offsets of the rep-points.

dcn_offset = pts_out_init_grad_mul - dcn_base_offset

However, based on this line, it seems pts_out_init alone gives the offsets of the rep-points.

pts_coordinate_preds_init = self.offset_to_pts(center_list,

Really appreciate it if anyone could help me with this.
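
For what it's worth, the two lines are consistent once the deformable-convolution convention is taken into account: pts_out_init is the predicted offset of each point from the feature-map location, while a deformable conv internally adds its own 3x3 base sampling grid, so that grid must be subtracted from the predicted points before they are passed in as DCN offsets. A minimal sketch (assuming a 3x3 kernel and (y, x) ordering, as in typical DCN implementations):

```python
import torch

# Reconstruct the 3x3 DCN base grid: offsets (-1..1) x (-1..1) in (y, x) order.
dcn_kernel, dcn_pad = 3, 1
base = torch.arange(-dcn_pad, dcn_pad + 1).float()
base_y = base.repeat_interleave(dcn_kernel)   # [-1,-1,-1, 0,0,0, 1,1,1]
base_x = base.repeat(dcn_kernel)              # [-1, 0, 1, -1, 0, 1, -1, 0, 1]
dcn_base_offset = torch.stack([base_y, base_x], dim=1).view(1, -1, 1, 1)

# A deformable conv samples at: location + base_grid + offset.
# To sample exactly at the predicted points (location + pts_out_init),
# the offset passed in must therefore be:
#   dcn_offset = pts_out_init_grad_mul - dcn_base_offset
```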

ImportError: libtorch_cpu.so: cannot open shared object file: No such file or directory

Could you share your CUDA and PyTorch versions? The mmdetection version this repo uses is quite old; I compiled mmdet successfully under my environment, but it raised the above error:

sys.platform: linux
Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GPU 0,1: GeForce RTX 2080 Ti
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.4.0
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.3
  - Magma 2.5.1
  - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 

TorchVision: 0.5.0
OpenCV: 4.5.2
MMCV: 0.5.3
MMDetection: 1.1.0+unknown
MMDetection Compiler: GCC 7.5
MMDetection CUDA Compiler: 10.1

[code question] difference between pseudo bbox and final bbox

Hello,

I have a question about the code.

To make the final boxes, the predicted points go through two steps: 1) the offset_to_pts function and 2) the points2bbox function.

pts_coordinate_preds_init = self.offset_to_pts(center_list,

bbox_pred_init = self.points2bbox(

On the other hand, for the pseudo box generation, the predicted points are directly passed to the points2bbox function and the generated boxes are used for matching.

for i_img, center in enumerate(center_list):

bbox_preds_init = self.points2bbox(

Why do they differ? Shouldn't we use the same procedure (i.e., offset_to_pts and then points2bbox)?
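
For context, a rough sketch of what offset_to_pts does (an assumption-laden reconstruction, not the repo's exact code): it turns per-location offset predictions into absolute image coordinates by scaling with the pyramid-level stride and adding the location's center, whereas points2bbox only reduces a point set to a pseudo box. One possible reading of why the two paths coexist: conversions like minmax commute with a uniform scale and translation, so applying points2bbox in offset space and then shifting by the center (the pseudo-box path) can be equivalent to first mapping offsets to absolute points and then applying points2bbox.

```python
import torch

def offset_to_pts_single(centers, pred_offsets, stride, num_points=9):
    """Sketch: map predicted offsets to absolute point coordinates.

    centers:      (N, 2) absolute (x, y) of feature-map locations
    pred_offsets: (N, 2 * num_points) offsets in units of the level stride
    (the real code additionally handles (y, x) vs (x, y) ordering)
    """
    pts = pred_offsets.view(-1, num_points, 2) * stride  # to image scale
    return (pts + centers[:, None, :]).view(-1, 2 * num_points)
```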

ValueError: too many values to unpack (expected 2) when running the example training command

File "./mmdetection/tools/train.py", line 108, in
main()
File "./mmdetection/tools/train.py", line 104, in main
logger=logger)
File "/ping/ping/RepPoints/mmdetection/mmdet/apis/train.py", line 58, in train_detector
_dist_train(model, dataset, cfg, validate=validate)
File "/ping/ping/RepPoints/mmdetection/mmdet/apis/train.py", line 143, in _dist_train
for ds in dataset
File "/ping/ping/RepPoints/mmdetection/mmdet/apis/train.py", line 143, in
for ds in dataset
File "/ping/ping/RepPoints/mmdetection/mmdet/datasets/loader/build_loader.py", line 28, in build_dataloader
world_size, rank)
File "/ping/ping/RepPoints/mmdetection/mmdet/datasets/loader/sampler.py", line 97, in init
_rank, _num_replicas = get_host_info()
ValueError: too many values to unpack (expected 2)

I get this error when running the example training command

train

(open-mmlab) fsr@ServerE1:~/code/RepPoints-master$ ./mmdetection/tools/dist_train.sh ./configs/reppoints_moment_r50_fpn_1x.py 2 --validate


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


Traceback (most recent call last):
File "./mmdetection/tools/train.py", line 108, in
main()
File "./mmdetection/tools/train.py", line 84, in main
cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
File "/home/fsr/code/RepPoints-master/mmdetection/mmdet/models/builder.py", line 43, in build_detector
return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
File "/home/fsr/code/RepPoints-master/mmdetection/mmdet/models/builder.py", line 15, in build
return build_from_cfg(cfg, registry, default_args)
File "/home/fsr/code/RepPoints-master/mmdetection/mmdet/utils/registry.py", line 67, in build_from_cfg
obj_type, registry.name))
KeyError: 'None is not in the detector registry'
Traceback (most recent call last):
File "./mmdetection/tools/train.py", line 108, in
main()
File "./mmdetection/tools/train.py", line 84, in main
cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
File "/home/fsr/code/RepPoints-master/mmdetection/mmdet/models/builder.py", line 43, in build_detector
return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
File "/home/fsr/code/RepPoints-master/mmdetection/mmdet/models/builder.py", line 15, in build
return build_from_cfg(cfg, registry, default_args)
File "/home/fsr/code/RepPoints-master/mmdetection/mmdet/utils/registry.py", line 67, in build_from_cfg
obj_type, registry.name))
KeyError: 'None is not in the detector registry'
Killing subprocess 3816
Killing subprocess 3817
Traceback (most recent call last):
File "/home/laocheng/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/laocheng/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/laocheng/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 340, in
main()
File "/home/laocheng/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 326, in main
sigkill_handler(signal.SIGTERM, None) # not coming back
File "/home/laocheng/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/laocheng/anaconda3/envs/open-mmlab/bin/python', '-u', './mmdetection/tools/train.py', '--local_rank=1', './configs/reppoints_moment_r50_fpn_1x.py', '--launcher', 'pytorch', '--validate']' returned non-zero exit status 1.

Example of get the RepPoints output?

Hi:
I already trained reppoints_moment_r50_fpn_2x_mt on my dataset and got some results. But I am more curious how to get the RepPoints themselves as output, like the example pictures in the paper, rather than just the regular box output. Any clue where to modify mmdetection? An inference example would be nice.
Thanks.

The mAP obtained from training with reppoints_moment_r50_fpn_1x.py is 0.378

I currently use two 2080 Ti GPUs with imgs_per_gpu = 4; everything else is unchanged: lr_config = dict(policy='step', warmup='linear', warmup_iters=500, warmup_ratio=1.0/3, step=[8, 11]), total_epochs = 12. The mAP after training is 0.378.
I also used a single 2080 Ti with imgs_per_gpu = 5 and got mAP = 0.372.
There is still a sizable gap from the reported 38.2. What should I do?
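
One likely factor (an assumption; the optimizer settings are not quoted in this issue): mmdetection follows the linear scaling rule, and the released RepPoints configs appear to target a total batch of 16 (8 GPUs x 2 images per GPU). If the total batch changes, the base learning rate should scale proportionally:

```python
# Linear scaling rule sketch (base values assumed from the released configs).
base_lr, base_total_batch = 0.01, 16    # 8 GPUs x 2 imgs/GPU
gpus, imgs_per_gpu = 2, 4
lr = base_lr * (gpus * imgs_per_gpu) / base_total_batch
print(lr)  # 0.005 for a total batch of 8
```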

About the function converting a point set to a pseudo box

Hello! In our experiments we visualize the point sets produced by the moment conversion function on some visual tasks, and we find that the points tend to cluster near the bounding box boundary, while relatively few points fall on the object itself.
This is intuitively inconsistent: one would expect the point set to cover the object uniformly. Is there some defect in the way the point set is converted into a bounding box?

Why did I get 37.5 AP with "reppoints_minmax_r50_fpn_1x.py" on the COCO dataset?

Thanks for sharing your work!
I used the following command to train the model on the COCO dataset.

./mmdetection/tools/dist_train.sh ./configs/reppoints_minmax_r50_fpn_1x.py 8 --validate

I have tried training with both 4 GPUs and 8 GPUs, but all runs give about 37.5 AP. I wonder what I got wrong or missed.
Thanks very much!

A question about the ground-truth boxes used for learning RepPoints at the initial training stage

(1) The code at line 444 in reppoints_head.py:
(*_, bbox_gt_list_init, candidate_list_init, bbox_weights_list_init,
num_total_pos_init, num_total_neg_init) = cls_reg_targets_init
Here, what does the variable "bbox_gt_list_init" represent? Is it equivalent to the ground-truth boxes?

(2) The same question for the code at line 477 in reppoints_head.py.
Is the variable "bbox_gt_list_refine" equivalent to "bbox_gt_list_init"? Is bbox_gt_list_refine the same as the ground-truth boxes?

Looking forward to your reply. Thx.

Training error (Segmentation Fault)

Hi,
I am trying to train the model using the example provided in the README.md, but I am receiving a segmentation fault.

My system configuration is as follows:

gcc: 5.4.0
pytorch: 1.1.0
cuda: 10.0

I am trying to train it on 2 GPUs using:

./mmdetection/tools/dist_train.py ./configs/reppoints_moment_r101_fpn_2x_mt.py 2 --validate

Are there any limitations because of the above configuration?

Thank you very much

ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).

I'm getting "ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm)." when trying to launch this after following the steps in your README.

My setup is this docker image run with docker run --gpus all -it --ipc=host nvidia/cuda:10.1-cudnn7-devel /bin/bash. The host operating system is Ubuntu 18.04, with 128 GB of RAM and an NVIDIA RTX 2080 Ti.

Any thoughts?

Edit: This is the command: ./mmdetection/tools/dist_train.sh ./configs/reppoints_moment_r101_fpn_2x_mt.py 1 --validate

Question on the details

Hi, it is quite a nice implementation, I've read it and I got stuck in some points.

  1. In reppoints_head.py, what is the purpose of the "dcn_base_offset" term at line 278? Does it play a role in enlarging the offset range?
  2. In the same file (I'm not sure if I am correct), I think "pts_coordinate_preds_init" at line 424, after the min_max is applied, is identical to "box_list" at line 466 except for the order. They both represent the bbox of the initial points, so why is it necessary to calculate "box_list" again?

Thank you so much!

I get errors when I test

File "/home/nfs/admin0/chenguang/project/RepPoints/mmdetection/build/lib.linux-x86_64-3.7/mmdet/ops/dcn/deform_conv.py", line 55, in forward
cur_im2col_step)
RuntimeError: input image is smaller than kernel (shape_check at mmdet/ops/dcn/src/deform_conv_cuda.cpp:127)

deform_conv_cuda.cpp:127
AT_CHECK((inputHeight >= kH && inputWidth >= kW),
"input image is smaller than kernel");

Thanks for helping me!

Several questions about reppoints

Hello,

Thanks for your great work!
I would like to use this detector in my project.

I have several questions.

  1. Can the RepPoints detector be adopted as a class-agnostic detector (e.g., RPN)?
  • I actually tried this quickly. However, it shows unstable training (especially high reg and cls losses for the refined RepPoints). Any suggestions to solve this problem? Naively reducing the learning rate does not help :(
  2. Why do you adjust the gradient of the initial points during training (i.e., multiplying by 0.1 for the init points)? A sketch of this mechanism follows below.
  • When I set it to 1 (i.e., fully taking the gradient from the subsequent stage's losses), the scores degrade. Why does this happen?
  3. For 8-GPU training, how should I set the learning rate? 0.005? Or 0.0025?

Thanks,
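
For reference, a minimal sketch of the gradient-scaling mechanism asked about in question 2 (the detach-mix pattern visible in reppoints_head.py; the 0.1 factor comes from the issue): mixing a tensor with its detached copy leaves the forward value unchanged but scales the gradient flowing back into the initial points, so the refinement losses only weakly perturb the first localization stage.

```python
import torch

gradient_mul = 0.1
pts_out_init = torch.randn(1, 18, 8, 8, requires_grad=True)

# Forward value equals pts_out_init; backward gradient is scaled by 0.1.
pts_out_init_grad_mul = (1 - gradient_mul) * pts_out_init.detach() \
    + gradient_mul * pts_out_init

pts_out_init_grad_mul.sum().backward()
print(pts_out_init.grad.unique())  # tensor([0.1000])
```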

Code Realization Question about Offset Computation

In reppoints_head.py, line 278:
dcn_offset = pts_out_init_grad_mul - dcn_base_offset
Is there any reason you use the predicted offset minus the base grid? A plus operation would intuitively make sense; what does the minus mean? (See the dcn_base_offset sketch under the earlier offsets issue above.)

Dimension difference between proposal and gt_bboxes

I am trying to use RandomSampler and cross-entropy loss, but I am receiving a runtime error when sampling the points.

/research/byu2/mudit7/FYP/RepPoints/mmdetection/mmdet/core/bbox/assign_sampling.py in assign_and_sample(bboxes, gt_bboxes, gt_bboxes_ignore, gt_labels, cfg)
     30                                          gt_labels)
     31     sampling_result = bbox_sampler.sample(assign_result, bboxes, gt_bboxes,
---> 32                                           gt_labels)
     33     return assign_result, sampling_result

/research/byu2/mudit7/FYP/RepPoints/mmdetection/mmdet/core/bbox/samplers/base_sampler.py in sample(self, assign_result, bboxes, gt_bboxes, gt_labels, **kwargs)
     53         gt_flags = bboxes.new_zeros((bboxes.shape[0], ), dtype=torch.uint8)
     54         if self.add_gt_as_proposals:
---> 55             bboxes = torch.cat([gt_bboxes, bboxes], dim=0)
     56             assign_result.add_gt_(gt_labels)
     57             gt_ones = bboxes.new_ones(gt_bboxes.shape[0], dtype=torch.uint8)

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3 and 4 in dimension 1 at /opt/conda/conda-bld/pytorch_1556653114079/work/aten/src/THC/generic/THCTensorMath.cu:71

I tried to check the dimension sizes of the proposals and gt_bboxes.

proposals.size()
Out[1]: torch.Size([21875, 3])
Out[1]: torch.Size([19197, 3])

gt_bboxes.size()
Out[2]: torch.Size([19, 4])
Out[2]: torch.Size([16, 4])

Why is there a difference in dimension?
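
A plausible explanation (an assumption based on the shapes above, not a confirmed fix): in the init stage the "proposals" are 3-d points (x, y, stride) rather than 4-d boxes, so a box sampler configured with add_gt_as_proposals=True tries to concatenate 4-d gt boxes onto 3-d points and fails. A hedged sketch of the relevant train_cfg fragment (field names follow the mmdetection v1.x conventions used by this repo):

```python
# Illustrative config fragment, not a verified fix.
train_cfg = dict(
    init=dict(
        assigner=dict(type='PointAssigner', scale=4, pos_num=1),
        # Point "proposals" are (x, y, stride); 4-d gt boxes cannot be
        # concatenated onto them, so keep add_gt_as_proposals disabled.
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            add_gt_as_proposals=False)))
```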

Is the supervision of losses_pts_refine necessary?

I removed losses_pts_refine from loss_dict_all, wanting to supervise the bbox regression only with losses_pts_init, but found that during training the loss goes to NaN after several hundred iterations.
Is loss_pts_refine necessary?

Why you assign the gt boxes to point sets by knn instead of by center?

Nice job. I'm reading the code for more details, and I find it is mentioned in the paper that "the projection of this ground-truth object's center point locates within this feature map bin". However, in this line the code attaches a point set to its nearest gt box, provided the point set is among that gt box's k nearest neighbors. Doesn't this mean any prediction around the center point of a gt box, as long as it is among its k nearest neighbors, will be positive?

2. A point is assigned to some gt bbox if

Sincere thanks if you could reply to me.

original image in paper

Can you provide the original image of the skateboarding boy for download, so we can visualize the fine-grained localization?

ImportError: cannot import name 'get_dist_info'

As far as adaptability is concerned, I cloned mmdetection @ 6e48d28 to work with Point Set Representation for Object Detection. Paradoxically, I encounter the error "ImportError: cannot import name 'get_dist_info'". According to mmdetection, this issue can be solved by updating to the latest version of mmdetection, but since I have to use the exact version that is compatible with the RepPoints git, how can I tackle this issue?

ImportError

ImportError: /RepPoints/mmdetection/mmdet/ops/dcn/deform_conv_cuda.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c1011CPUTensorIdEv

Can you help me with it?

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces)

I followed all the instructions and got:

loading annotations into memory...
loading annotations into memory...
Done (t=0.43s)
creating index...
index created!
Done (t=0.43s)
creating index...
index created!
Traceback (most recent call last):
File "./mmdetection/tools/train.py", line 108, in <module>
main()
File "./mmdetection/tools/train.py", line 104, in main
logger=logger)
File "/root/sharedfolder/production/2020/det/RepPoints/mmdetection/mmdet/apis/train.py", line 58, in train_detector
_dist_train(model, dataset, cfg, validate=validate)
File "/root/sharedfolder/production/2020/det/RepPoints/mmdetection/mmdet/apis/train.py", line 186, in _dist_train
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/opt/conda/lib/python3.6/site-packages/mmcv/runner/runner.py", line 358, in run
epoch_runner(data_loaders[i], **kwargs)
File "/opt/conda/lib/python3.6/site-packages/mmcv/runner/runner.py", line 271, in train
self.call_hook('after_train_iter')
File "/opt/conda/lib/python3.6/site-packages/mmcv/runner/runner.py", line 229, in call_hook
getattr(hook, fn_name)(self)
File "/root/sharedfolder/production/2020/det/RepPoints/mmdetection/mmdet/core/utils/dist_utils.py", line 53, in after_train_iter
runner.outputs['loss'].backward()
File "/opt/conda/lib/python3.6/site-packages/torch/tensor.py", line 198, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
allow_unreachable=True)  # allow_unreachable flag
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

train error

When I run python ./mmdetection/tools/train.py ./configs/reppoints_moment_x101_dcn_fpn_2x_mt.py --gpus 0, I get:
TypeError: __init__() got an unexpected keyword argument 'multiscale_mode'

Question About moment_transfer

Thank you for sharing the code of your interesting work! I have questions about the implementation details.

moment_transfer = (self.moment_transfer * self.moment_mul) + (
self.moment_transfer.detach() * (1 - self.moment_mul))

Is the above operation meant to control the learning rate via moment_mul?
Thanks very much!
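
That is one common reading: this is the same detach-mix trick sketched under the "Several questions about reppoints" issue above. The forward value is unchanged, but the gradient reaching moment_transfer is scaled by moment_mul, which effectively gives this parameter a much smaller learning rate. A toy check (not the repo's code):

```python
import torch

moment_mul = 0.01
moment_transfer = torch.zeros(2, requires_grad=True)

mixed = (moment_transfer * moment_mul
         + moment_transfer.detach() * (1 - moment_mul))
mixed.sum().backward()
print(moment_transfer.grad)  # tensor([0.0100, 0.0100]): scaled by moment_mul
```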

Selection strategy of hyperparameter: pos_num?

Hi, I have some questions about the hyperparameter pos_num. In the paper you say "(2) the projection of this ground-truth object's center point locates within this feature map bin". But in practice a torch.topk() function is used to pick the positive target locations, controlled by pos_num, and all the configs set pos_num to 1. Here are my questions:
1) Is this configuration equivalent to what is written in the paper?
2) Why is only one point assigned?
3) Did you run ablation experiments on this choice?
Thanks for answering!
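
For context, a rough sketch of how pos_num appears to act (an assumption-based reconstruction, not the repo's exact code): for each ground-truth box, the pos_num candidate points whose centers are closest to the gt center on the matched pyramid level are labeled positive, so pos_num=1 approximates the paper's "center point falls in this bin" rule.

```python
import torch

# Toy example: pick the pos_num points nearest to a gt center.
points = torch.rand(100, 2) * 64          # candidate point centers (x, y)
gt_center = torch.tensor([32.0, 32.0])
pos_num = 1

dist = (points - gt_center).norm(dim=1)
_, pos_idx = dist.topk(pos_num, largest=False)   # indices of positives
```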

Detection speed

Thanks for releasing your source code. Your approach may be the next breakthrough in object detection and keypoint detection.
Have you made any measurements of the inference speed of your models? Can it run in real time?
Thanks

/mmdetection/tools/dist_train.py

When I use ./mmdetection/tools/dist_train.py ./configs/reppoints_moment_r101_fpn_2x_mt.py 8 --validate,
there is no dist_train.py in my path. (Note: the README's example uses dist_train.sh, not dist_train.py.)
I would like to get your help!

about the gt

How is the ground truth for the points obtained? Does it require extra annotation?

Test Problem

I copied the RepPoints src and configs into mmdetection,
and used

python tools/test.py configs/reppoints/reppoints_moment_r101_fpn_2x.py weights/reppoints_moment_r101_fpn_2x.pth --eval bbox --json_out a.json

to test the model in the mmdetection dir.
However, I get an error.

loading annotations into memory...
Done (t=0.06s)
creating index...
index created!
[>>>>>>>>>>>>>>>>>> ] 7311/20288, 9.2 task/s, elapsed: 796s, ETA: 1412s
Traceback (most recent call last):
File "tools/test.py", line 239, in <module>
main()
File "tools/test.py", line 199, in main
outputs = single_gpu_test(model, data_loader, args.show)
File "tools/test.py", line 32, in single_gpu_test
result, det_time, nms_time = model(return_loss=False, rescale=not show, **data)
File "/home/shengtao/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/shengtao/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/shengtao/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/shengtao/models/RepPoints/mmdet/core/fp16/decorators.py", line 49, in new_func
return old_func(*args, **kwargs)
File "/home/shengtao/models/RepPoints/mmdet/models/detectors/base.py", line 88, in forward
return self.forward_test(img, img_meta, **kwargs)
File "/home/shengtao/models/RepPoints/mmdet/models/detectors/base.py", line 79, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/home/shengtao/models/RepPoints/mmdet/models/detectors/single_stage.py", line 66, in simple_test
outs = self.bbox_head(x)
File "/home/shengtao/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/shengtao/models/RepPoints/mmdet/models/anchor_heads/reppoints_head.py", line 291, in forward
return multi_apply(self.forward_single, feats)
File "/home/shengtao/models/RepPoints/mmdet/core/utils/misc.py", line 24, in multi_apply
return tuple(map(list, zip(*map_results)))
File "/home/shengtao/models/RepPoints/mmdet/models/anchor_heads/reppoints_head.py", line 280, in forward_single
self.relu(self.reppoints_cls_conv(cls_feat, dcn_offset)))
File "/home/shengtao/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/shengtao/models/RepPoints/mmdet/ops/dcn/deform_conv.py", line 236, in forward
self.dilation, self.groups, self.deformable_groups)
File "/home/shengtao/models/RepPoints/mmdet/ops/dcn/deform_conv.py", line 55, in forward
cur_im2col_step)
RuntimeError: input image is smaller than kernel (shape_check at mmdet/ops/dcn/src/deform_conv_cuda.cpp:127)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f3f5d088dc5 in /home/shengtao/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: shape_check(at::Tensor, at::Tensor, at::Tensor*, at::Tensor, int, int, int, int, int, int, int, int, int, int) + 0x720 (0x7f3f0e5686a0 in /home/shengtao/models/RepPoints/mmdet/ops/dcn/deform_conv_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #2: deform_conv_forward_cuda(at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, int, int, int, int, int, int, int, int, int, int) + 0xf5 (0x7f3f0e569325 in /home/shengtao/models/RepPoints/mmdet/ops/dcn/deform_conv_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x233f0 (0x7f3f0e57d3f0 in /home/shengtao/models/RepPoints/mmdet/ops/dcn/deform_conv_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x2354e (0x7f3f0e57d54e in /home/shengtao/models/RepPoints/mmdet/ops/dcn/deform_conv_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #5: <unknown function> + 0x1ff91 (0x7f3f0e579f91 in /home/shengtao/models/RepPoints/mmdet/ops/dcn/deform_conv_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #11: THPFunction_apply(_object*, _object*) + 0x691 (0x7f3f830d8891 in /home/shengtao/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/lib/libtorch_python.so)

Who can help me?

What's more, I find the weight reppoints_moment_r101_dcn_fpn_2x.pth has no params for the dcn part.

The code

python tools/test.py configs/reppoints/reppoints_moment_r101_dcn_fpn_2x.py weights/reppoints_moment_r101_dcn_fpn_2x.pth --eval bbox --json_out a.json

gets warnings

loading annotations into memory...
Done (t=0.44s)
creating index...
index created!
The model and loaded state dict do not match exactly
missing keys in source state_dict: backbone.layer3.8.conv2_offset.bias, backbone.layer3.11.conv2_offset.weight, backbone.layer3.5.conv2_offset.weight, backbone.layer3.22.conv2_offset.weight, backbone.layer3.4.conv2_offset.weight, backbone.layer3.10.conv2_offset.weight, backbone.layer3.18.conv2_offset.weight, backbone.layer3.20.conv2_offset.weight, backbone.layer3.17.conv2_offset.weight, backbone.layer4.1.conv2_offset.weight, backbone.layer2.0.conv2_offset.weight, backbone.layer3.13.conv2_offset.weight, backbone.layer3.9.conv2_offset.weight, backbone.layer3.14.conv2_offset.bias, backbone.layer3.16.conv2_offset.bias, backbone.layer3.11.conv2_offset.bias, backbone.layer3.13.conv2_offset.bias, backbone.layer4.2.conv2_offset.weight, backbone.layer3.0.conv2_offset.weight, backbone.layer3.16.conv2_offset.weight, backbone.layer3.0.conv2_offset.bias, backbone.layer3.19.conv2_offset.bias, backbone.layer3.21.conv2_offset.bias, backbone.layer4.0.conv2_offset.weight, backbone.layer3.21.conv2_offset.weight, backbone.layer3.1.conv2_offset.bias, backbone.layer3.6.conv2_offset.weight, backbone.layer3.5.conv2_offset.bias, backbone.layer2.0.conv2_offset.bias, backbone.layer2.2.conv2_offset.bias, backbone.layer3.20.conv2_offset.bias, backbone.layer3.7.conv2_offset.weight, backbone.layer3.12.conv2_offset.bias, backbone.layer3.14.conv2_offset.weight, backbone.layer3.4.conv2_offset.bias, backbone.layer4.1.conv2_offset.bias, backbone.layer3.15.conv2_offset.bias, backbone.layer3.12.conv2_offset.weight, backbone.layer2.3.conv2_offset.weight, backbone.layer3.10.conv2_offset.bias, backbone.layer4.0.conv2_offset.bias, backbone.layer3.7.conv2_offset.bias, backbone.layer3.19.conv2_offset.weight, backbone.layer3.3.conv2_offset.bias, backbone.layer3.8.conv2_offset.weight, backbone.layer2.3.conv2_offset.bias, backbone.layer3.22.conv2_offset.bias, backbone.layer3.9.conv2_offset.bias, backbone.layer2.1.conv2_offset.bias, backbone.layer2.2.conv2_offset.weight, backbone.layer3.17.conv2_offset.bias, backbone.layer3.1.conv2_offset.weight, backbone.layer3.3.conv2_offset.weight, backbone.layer2.1.conv2_offset.weight, backbone.layer3.18.conv2_offset.bias, backbone.layer3.6.conv2_offset.bias, backbone.layer3.2.conv2_offset.bias, backbone.layer3.15.conv2_offset.weight, backbone.layer4.2.conv2_offset.bias, backbone.layer3.2.conv2_offset.weight

but the code

python tools/test.py configs/reppoints/reppoints_moment_r101_fpn_2x.py weights/reppoints_moment_r101_dcn_fpn_2x.pth --eval bbox --json_out a.json

gets nothing.

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 1.0 and 2.0 repo branches:

|                  | OpenMMLab 1.0 branch | OpenMMLab 2.0 branch |
|:-----------------|:---------------------|:---------------------|
| MMEngine         |                      | 0.x                  |
| MMCV             | 1.x                  | 2.x                  |
| MMDetection      | 0.x, 1.x, 2.x        | 3.x                  |
| MMAction2        | 0.x                  | 1.x                  |
| MMClassification | 0.x                  | 1.x                  |
| MMSegmentation   | 0.x                  | 1.x                  |
| MMDetection3D    | 0.x                  | 1.x                  |
| MMEditing        | 0.x                  | 1.x                  |
| MMPose           | 0.x                  | 1.x                  |
| MMDeploy         | 0.x                  | 1.x                  |
| MMTracking       | 0.x                  | 1.x                  |
| MMOCR            | 0.x                  | 1.x                  |
| MMRazor          | 0.x                  | 1.x                  |
| MMSelfSup        | 0.x                  | 1.x                  |
| MMRotate         | 1.x                  | 1.x                  |
| MMYOLO           |                      | 0.x                  |
MMYOLO 0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.
