peizesun / sparser-cnn

[CVPR2021, PAMI2023] End-to-End Object Detection with Learnable Proposal

License: MIT License

Python 89.63% Shell 0.52% C++ 4.41% Cuda 5.32% Dockerfile 0.09% CMake 0.03%

sparser-cnn's Introduction

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Paper (CVPR 2021)

Updates

(02/03/2021) Higher performance is reported with the stronger backbone model PVT.
(23/02/2021) Higher performance is reported with the stronger pre-trained model DetCo.
(02/12/2020) Models and logs (R101_100pro_3x and R101_300pro_3x) are available.
(26/11/2020) Models and logs (R50_100pro_3x and R50_300pro_3x) are available.
(26/11/2020) Higher performance for Sparse R-CNN is reported by setting the dropout rate to 0.0.

Models

Method	inf_time	train_time	box AP	download
R50_100pro_3x	23 FPS	19h	42.8	model | log
R50_300pro_3x	22 FPS	24h	45.0	model | log
R101_100pro_3x	19 FPS	25h	44.1	model | log
R101_300pro_3x	18 FPS	29h	46.4	model | log

If a download link is invalid, the models and logs are also available in the GitHub Release and on Baidu Drive with code wt9n.

Notes

We observe about 0.3 AP of noise.
The training time is measured on 8 GPUs with batch size 16. The inference time is measured on a single GPU. All GPUs are NVIDIA V100.
We use models pre-trained on ImageNet with torchvision, and we provide torchvision's ResNet-101.pkl model. More details can be found in the conversion script.

Method	inf_time	train_time	box AP	codebase
R50_300pro_3x	22 FPS	24h	45.0	detectron2
R50_300pro_3x.detco	22 FPS	28h	46.5	detectron2
PVTSmall_300pro_3x	13 FPS	50h	45.7	mmdetection
PVTv2-b2_300pro_3x	11 FPS	76h	50.1	mmdetection

Installation

The codebases are built on top of Detectron2 and DETR.

Requirements

Linux or macOS with Python ≥ 3.6
PyTorch ≥ 1.5 and a torchvision version that matches the PyTorch installation. Installing them together from pytorch.org ensures this.
OpenCV is optional; it is needed for the demo and visualization.

Steps

Install and build libs

git clone https://github.com/PeizeSun/SparseR-CNN.git
cd SparseR-CNN
python setup.py build develop

Link coco dataset path to SparseR-CNN/datasets/coco

mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2017 datasets/coco/train2017
ln -s /path_to_coco_dataset/val2017 datasets/coco/val2017
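
Before training, it can help to confirm that the three links above resolve. The following is a minimal sketch in plain Python; the helper name `missing_coco_entries` is hypothetical and not part of this repository, and it only assumes the `datasets/coco` layout shown above.

```python
import os

# Sub-paths the commands above are expected to create under datasets/coco.
REQUIRED = ("annotations", "train2017", "val2017")


def missing_coco_entries(root="datasets/coco"):
    """Return the required entries that do not resolve under `root`.

    os.path.exists follows symlinks, so a dangling link counts as missing.
    """
    return [name for name in REQUIRED
            if not os.path.exists(os.path.join(root, name))]


if __name__ == "__main__":
    missing = missing_coco_entries()
    if missing:
        print("Missing or broken entries:", ", ".join(missing))
    else:
        print("datasets/coco looks complete.")
```

Run it from the SparseR-CNN directory so that the relative path `datasets/coco` points at the linked dataset.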

Train SparseR-CNN

python projects/SparseRCNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml

Evaluate SparseR-CNN

python projects/SparseRCNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --eval-only MODEL.WEIGHTS path/to/model.pth

Visualize SparseR-CNN

python demo/demo.py \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --input path/to/images --output path/to/save_images --confidence-threshold 0.4 \
    --opts MODEL.WEIGHTS path/to/model.pth

Third-party resources

mmdetection implementation: sparse_rcnn. Thanks to Shilong Zhang!
cvpod implementation: sparse_rcnn. Thanks to Benjin Zhu!
paddledetection implementation: sparse_rcnn. Thanks to FL77N!

License

SparseR-CNN is released under MIT License.

Citing

If you use SparseR-CNN in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@article{peize2020sparse,
  title   =  {{SparseR-CNN}: End-to-End Object Detection with Learnable Proposals},
  author  =  {Peize Sun and Rufeng Zhang and Yi Jiang and Tao Kong and Chenfeng Xu and Wei Zhan and Masayoshi Tomizuka and Lei Li and Zehuan Yuan and Changhu Wang and Ping Luo},
  journal =  {arXiv preprint arXiv:2011.12450},
  year    =  {2020}
}

sparser-cnn's Issues

Running the Visualize SparseR-CNN command reports an error

Instructions To Reproduce the 🐛 Bug:

Run the Visualize SparseR-CNN command:

python demo/demo.py --config-file projects/SparseR-CNN/configs/sparsercnn.res50.100pro.3x.yaml --input cocodataset/test2017 --output cocodataset/save2017 --confidence-threshold 0.2 --opts MODEL.WEIGHTS output/model_final.pth

Error output:

[12/09 09:50:47 detectron2]: Arguments: Namespace(confidence_threshold=0.2, config_file='projects/SparseR-CNN/configs/sparsercnn.res50.100pro.3x.yaml', input=['cocodataset/test2017'], opts=['MODEL.WEIGHTS', 'output/model_final.pth'], output='cocodataset/save2017', video_input=None, webcam=False)
Traceback (most recent call last):
  File "demo/demo.py", line 77, in <module>
    cfg = setup_cfg(args)
  File "demo/demo.py", line 23, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/home/ubuntu/Desktop/lyq/SparseR-CNN/detectron2/config/config.py", line 54, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/root/anaconda3/envs/sparse-rcnn/lib/python3.6/site-packages/fvcore-0.1.2.post20201122-py3.6.egg/fvcore/common/config.py", line 124, in merge_from_other_cfg
  File "/root/anaconda3/envs/sparse-rcnn/lib/python3.6/site-packages/yacs-0.1.8-py3.6.egg/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/root/anaconda3/envs/sparse-rcnn/lib/python3.6/site-packages/yacs-0.1.8-py3.6.egg/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/root/anaconda3/envs/sparse-rcnn/lib/python3.6/site-packages/yacs-0.1.8-py3.6.egg/yacs/config.py", line 491, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.SparseRCNN'

My environment:
Ubuntu 16.04 + CUDA 10.1 + Anaconda 4.5.4 + Python 3.4 + torch 1.6.0 + torchvision 1.6.0 + cv2 4.4.0
gpu_nums 1
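
The traceback shows the YAML merge rejecting `MODEL.SparseRCNN` because that node does not exist in the config object that demo.py created. The sketch below imitates this strict-merge behaviour with plain dictionaries to show why registering the project's default keys before loading the YAML avoids the error. It is an illustration only: `merge_into` and `add_project_defaults` are hypothetical stand-ins, not the yacs or SparseR-CNN API, and the `NUM_PROPOSALS` key is used purely as an example. Whether the repository exposes such a registration hook, and under what name, should be checked in projects/SparseRCNN.

```python
def merge_into(defaults, loaded, path=()):
    """Recursively merge `loaded` into `defaults`, rejecting unknown keys."""
    for key, value in loaded.items():
        full_key = ".".join(path + (key,))
        if key not in defaults:
            # Same message shape as in the traceback above.
            raise KeyError("Non-existent config key: {}".format(full_key))
        if isinstance(value, dict) and isinstance(defaults[key], dict):
            merge_into(defaults[key], value, path + (key,))
        else:
            defaults[key] = value
    return defaults


def add_project_defaults(defaults):
    """Hypothetical hook: register project-specific keys with default values."""
    defaults["MODEL"]["SparseRCNN"] = {"NUM_PROPOSALS": 100}


defaults = {"MODEL": {"WEIGHTS": ""}}
loaded = {"MODEL": {"SparseRCNN": {"NUM_PROPOSALS": 300}}}

try:
    merge_into(defaults, loaded)      # fails: the node was never registered
except KeyError as err:
    print("merge failed:", err)

add_project_defaults(defaults)        # register the project keys first
merge_into(defaults, loaded)          # now the merge succeeds
```

The order matters: the defaults must contain every key the YAML file sets before the file is merged.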

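Box coordinates become NaN in the forward pass during training

While training Sparse R-CNN, an error about invalid box coordinates is raised at the corresponding check in project/Sparse-rcnn/sparsercnn/box_ops.py. What could be causing this? I printed the box coordinates and found that they are four NaN values. I have not modified the network architecture or the dataset. Training uses two 2080 Ti GPUs, a batch size of 8, and a learning rate of 0.0025.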
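
One thing worth checking for this report is the learning rate. Another issue on this page quotes a base learning rate of 2.5e-5 from the YAML file, and the Notes give a batch size of 16. The sketch below applies the linear scaling heuristic to those two numbers. This is an assumption for illustration only: the heuristic is a common rule of thumb, not something this repository prescribes, and `scale_lr` is a hypothetical helper.

```python
def scale_lr(base_lr, base_batch_size, batch_size):
    """Linear scaling heuristic: keep lr / batch_size constant."""
    return base_lr * batch_size / base_batch_size


# Values quoted on this page: base LR 2.5e-5 at batch size 16.
scaled = scale_lr(2.5e-5, 16, 8)

# The reported learning rate of 0.0025 relative to the scaled value.
ratio = 0.0025 / scaled

print("scaled lr for batch size 8:", scaled)
print("reported lr is about", round(ratio), "times larger")
```

A learning rate far above the scaled value is a plausible, though unconfirmed, source of diverging box coordinates.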

maxDet in eval_only is not the number of proposals?

When I evaluate a 300-proposal model, the eval info is:

The maxDet is 100, not the maximum number of proposals. Should I change these three thresholds or use another evaluation method?

My eval cmd is

Imbalance in the number of objects per image

What should I do if I want to train on a dataset where the number of objects per image varies widely, say from 1 to 1000? Setting the number of proposals to a huge value does not seem like a good choice.

Do the three loss components come from DETR? Are the weighting factors (2, 5, 2) also taken directly from DETR?

Confused by the interaction between ROI features and proposal features

Thanks for your interesting work and the code. I am confused by the following points about how the ROI features and proposal features are fused:

In this setting, the transformed proposal feature serves as the key, and the transformed ROI feature is regarded as the query. First, an attention weight is computed between the ROI feature and the proposal feature. Then the attention weight is combined with the other transformed proposal feature, which acts as the value. In this case none of the appearance (ROI) features are kept: the output of the interaction is a re-weighted proposal feature, and the ROI feature only serves to compute the attention map. I do not understand why predicting bounding boxes without the appearance features works. It also looks different from the original paper:

It is also the inverse of the multi-head attention in DETR. @PeizeSun

Downloading the trained weights

Hi, thanks for the nice work! Could you please maybe also share your trained models to some other platforms, e.g. Baidu cloud, for those who have no access to google drive?

Should use 100 proposals for evaluation for fair comparison with other methods.

It seems that you use self.num_proposals proposals for evaluation, as shown here, while the common practice in COCO evaluation is to use 100 proposals.

I tested the performance of the provided R50_300pro_3x.pth model using the top-100 scoring proposals:

proposal	AP	AP50	AP75	APs	APm	APl
100	44.875	63.874	48.869	27.565	47.418	59.588
300	45.028	64.130	49.034	27.758	47.549	59.664

Though the performance is very close to that with 300 proposals, I feel the numbers should be corrected for fair comparison.
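
Keeping only the top-100 scoring proposals, as described in this issue, can be sketched as below. This is a plain-Python illustration with a hypothetical helper name and a simplified `(score, box)` tuple layout; it is not the repository's evaluation code.

```python
def keep_top_k(detections, k=100):
    """Keep the k highest-scoring detections for one image.

    `detections` is a list of (score, box) tuples; this layout is an
    assumption made for the sketch.
    """
    return sorted(detections, key=lambda det: det[0], reverse=True)[:k]


# 300 dummy detections with distinct scores, standing in for 300 proposals.
dets = [(i / 300.0, (0.0, 0.0, 1.0, 1.0)) for i in range(300)]
top = keep_top_k(dets, k=100)
print(len(top), "detections kept; lowest kept score:", top[-1][0])
```

Applying such a filter per image before evaluation would put a 300-proposal model under the same 100-detection limit as other methods.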

STRIDE_IN_1X1=True for torchvision models?

Thanks for this very intriguing work!

I found that you use torchvision pre-trained models instead of the MSRA pre-trained ones while setting STRIDE_IN_1X1=True, which is inconsistent with the torchvision ResNet. I reproduced a model using the MSRA R-50 backbone with dropout=0.1, and it is 0.7% AP lower than the reported result with dropout, which probably indicates that the choice of pre-trained backbone matters. There may be some further improvement if this configuration inconsistency is fixed.

About the iterative box regression

Sparse R-CNN uses 6 heads that each apply deltas to refine the boxes, and an auxiliary loss back-propagates gradients to all 6 heads. The class weight, L1 weight, and GIoU weight are the same for every head. Since this is an iterative process, I wonder whether the weights could be set smaller for the first 5 heads. I hope for your suggestions.
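
The idea raised in this issue, down-weighting the auxiliary losses of the earlier heads, can be written down as follows. This is a sketch of the asker's proposal only, not of what the repository does; the function name and the example weights are hypothetical.

```python
def weighted_stage_loss(stage_losses, stage_weights):
    """Combine per-head losses with one weight per head."""
    if len(stage_losses) != len(stage_weights):
        raise ValueError("need exactly one weight per head")
    return sum(w * l for w, l in zip(stage_weights, stage_losses))


# Six heads with a dummy loss of 1.0 each.
losses = [1.0] * 6

uniform = weighted_stage_loss(losses, [1.0] * 6)            # equal weights
decayed = weighted_stage_loss(losses, [0.5] * 5 + [1.0])    # smaller early weights
print("uniform:", uniform, "decayed:", decayed)
```

Whether smaller early-stage weights help accuracy is an empirical question that this sketch does not answer.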

How to use the Sparse R-CNN code based on detectron2 to predict the objects in an image?

I have finished training with detectron2 and now want to predict the objects in an image.
So I tried ./demo/demo.py to run prediction on images. Here is my command:

python demo/demo.py --config-file ./projects/SparseR-CNN/configs/sparsercnn.res50.100pro.3x.yaml   --input ../../file_transform/000000384661.jpg ../../file_transform/000000481404.jpg --output ./mask_pre_res2/  --opts MODEL.WEIGHTS ./output/model_final.pth

But I got an error. Here is the output:

Namespace(confidence_threshold=0.5, config_file='./projects/SparseR-CNN/configs/sparsercnn.res50.100pro.3x.yaml', input=['../../file_transform/000000384661.jpg', '../../file_transform/000000481404.jpg'], opts=['MODEL.WEIGHTS', './output/model_final.pth'], output='./mask_pre_res2/', video_input=None, webcam=False)
[01/09 15:10:33 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='./projects/SparseR-CNN/configs/sparsercnn.res50.100pro.3x.yaml', input=['../../file_transform/000000384661.jpg', '../../file_transform/000000481404.jpg'], opts=['MODEL.WEIGHTS', './output/model_final.pth'], output='./mask_pre_res2/', video_input=None, webcam=False)
Traceback (most recent call last):
  File "demo/demo.py", line 78, in <module>
    cfg = setup_cfg(args)
  File "demo/demo.py", line 23, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/home/work/yjx/code/SparseR-CNN/detectron2/config/config.py", line 54, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/home/work/anaconda3/envs/yjx/lib/python3.6/site-packages/fvcore/common/config.py", line 124, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/home/work/anaconda3/envs/yjx/lib/python3.6/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/home/work/anaconda3/envs/yjx/lib/python3.6/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/work/anaconda3/envs/yjx/lib/python3.6/site-packages/yacs/config.py", line 491, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.SparseRCNN'

Can you give me some advice on how to use the Sparse R-CNN code based on detectron2 to predict the objects in an image? Thank you very much!

question about value of base learning rate

Hi Peize
When I read the yaml file that contains learning parameters, I find the base learning rate is just 2.5e-5, which is much smaller than the learning rate (0.02) used in detectron2. I wonder why the learning rate is so small. Have you tried to train the network with larger learning rate?

Thanks

Why the network needs proposal features/boxes?

Thanks for your great work! I have a few questions.
As described in the paper, the proposal feature is used as a sparse representation for obtaining objects from the feature map. It is used to achieve dynamic conv. and the dynamic conv. outputs the classification and regression results.

If the proposal feature is an embedding that is basically used to generate the conv. parameters, why not directly use multi-branch convs? In my opinion, 100 conv. branches are identical to 100 dynamic convs with proposal features.

Moreover, the proposal boxes are also questionable. Since the paper mentions iterative refinement and the feature carries position information (coord. conv.), why not directly use the whole image as the boxes, which is how the proposal boxes are initialized in this code? That is, all boxes would start from the whole image without learned proposal boxes and be processed directly by multi-branch convs. With several rounds of refinement, I think they could also be regressed to the correct locations. In this way, the external embeddings for proposal features and boxes would no longer be needed.

Any ideas?
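
To make the dynamic-conv reading in this question concrete: the proposal feature generates the kernel parameters, and the RoI feature is the input those parameters are applied to. The toy below shows this in plain Python with tiny dimensions. It is an illustrative sketch under that assumption, not the repository's implementation; `linear`, `dynamic_interaction`, and the shared `generator` matrix are hypothetical names.

```python
def linear(x, weight):
    """Multiply vector x (length d_in) by weight (d_in rows, d_out columns)."""
    return [sum(x[i] * weight[i][j] for i in range(len(x)))
            for j in range(len(weight[0]))]


def dynamic_interaction(roi_feature, proposal_feature, generator):
    """Apply a kernel generated from the proposal feature to the RoI feature.

    roi_feature:      S rows (spatial bins) of C channels.
    proposal_feature: C values.
    generator:        C x (C*C) matrix shared by all proposals (assumed).
    """
    c = len(proposal_feature)
    flat = linear(proposal_feature, generator)            # C*C kernel values
    kernel = [flat[r * c:(r + 1) * c] for r in range(c)]  # reshape to C x C
    # The generated kernel acts on every spatial bin, followed by ReLU.
    return [[max(0.0, v) for v in linear(row, kernel)] for row in roi_feature]


generator = [[1.0, 0.0, 0.0, 1.0],   # proposal channel 0 -> identity kernel
             [0.0, 1.0, 1.0, 0.0]]   # proposal channel 1 -> channel-swap kernel
roi = [[1.0, 2.0]]                   # one spatial bin, two channels

out_a = dynamic_interaction(roi, [1.0, 0.0], generator)
out_b = dynamic_interaction(roi, [0.0, 1.0], generator)
print(out_a, out_b)
```

In this toy, the same RoI feature yields different outputs for different proposal features while the generator is shared, which is one difference from a fixed set of independent conv branches.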

How to get the RoIs' gradient in roi_align

@PeizeSun
Hi, thanks for sharing your code. I don't understand how the gradient with respect to the RoIs is obtained in roi_align.

How long does training on COCO 2017 take with 8 GPUs?

Error when installing SparseR-cnn

Instructions To Reproduce the 🐛 Bug:

Full runnable code or full changes you made:

If making changes to the project itself, please use output of the following command:
git rev-parse HEAD; git diff

No changes were made

What exact command you run:
git clone https://github.com/PeizeSun/SparseR-CNN.git
cd SparseR-CNN
python setup.py build develop
Full logs you observed:

/usr/include/c++/7/bits/basic_string.tcc:1067:16: error: cannot call member function ‘void std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_M_set_sharable() [with _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>]’ without object
[13/13] c++ -MMD -MF /home/hkk/objectdetection/SparseR-CNN/build/temp.linux-x86_64-3.7/home/hkk/objectdetection/SparseR-CNN/detectron2/layers/csrc/vision.o.d -pthread -B /home/hkk/miniconda3/envs/open-mmlab/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/hkk/objectdetection/SparseR-CNN/detectron2/layers/csrc -I/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include -I/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/TH -I/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/hkk/miniconda3/envs/open-mmlab/include/python3.7m -c -c /home/hkk/objectdetection/SparseR-CNN/detectron2/layers/csrc/vision.cpp -o /home/hkk/objectdetection/SparseR-CNN/build/temp.linux-x86_64-3.7/home/hkk/objectdetection/SparseR-CNN/detectron2/layers/csrc/vision.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1400, in _run_ninja_build
    check=True)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 201, in <module>
    cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 580, in build_extensions
    build_ext.build_extensions(self)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
    depends=ext.depends)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 423, in unix_wrap_ninja_compile
    with_cuda=with_cuda)
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1140, in _write_ninja_file_and_compile_objects
    error_prefix='Error compiling objects for extension')
  File "/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1413, in _run_ninja_build
    raise RuntimeError(message)
RuntimeError: Error compiling objects for extension

Environment:

Provide your environment information using the following command:

----------------------  ---------------------------------------------------------------------------------------------
sys.platform            linux
Python                  3.7.9 (default, Aug 31 2020, 12:42:55) [GCC 7.3.0]
numpy                   1.19.1
detectron2              0.3 @/home/hkk/objectdetection/SparseR-CNN/detectron2
detectron2._C           not built correctly: No module named 'detectron2._C'
Compiler                c++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CUDA compiler           Cuda compilation tools, release 10.1, V10.1.105
DETECTRON2_ENV_MODULE   <not set>
PyTorch                 1.5.0 @/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch
PyTorch debug build     False
GPU available           True
GPU 0,1,2,3,4,5,6       GeForce RTX 2080 Ti (arch=7.5)
CUDA_HOME               /usr/local/cuda-10.1
Pillow                  7.2.0
torchvision             0.6.0a0+82fd1c8 @/home/hkk/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torchvision
torchvision arch flags  3.5, 5.0, 6.0, 7.0, 7.5
cv2                     4.4.0
----------------------  ---------------------------------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.3
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

[Potential Bug] RuntimeError: The size of tensor a (390) must match the size of tensor b (397) at non-singleton dimension 0

I have not made any changes to the code. I am trying to train it for a single class called "Dining Table" on my custom dataset. The annotations are in COCO format and they are being loaded perfectly well (I have registered it). However when I try to run the following command:

`python projects/SparseRCNN/train_net.py --num-gpus 1     --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml`

The code fails with the following traceback:

  File "projects/SparseRCNN/train_net.py", line 138, in <module>
    launch(
  File "SparseR-CNN/detectron2/engine/launch.py", line 62, in launch
    main_func(*args)
  File "projects/SparseRCNN/train_net.py", line 130, in main
    return trainer.train()
  File "SparseR-CNN/detectron2/engine/defaults.py", line 419, in train
    super().train(self.start_iter, self.max_iter)
  File "SparseR-CNN/detectron2/engine/train_loop.py", line 134, in train
    self.run_step()
  File "SparseR-CNN/detectron2/engine/defaults.py", line 429, in run_step
    self._trainer.run_step()
  File "SparseR-CNN/detectron2/engine/train_loop.py", line 228, in run_step
    loss_dict = self.model(data)
  File "miniconda3/envs/srcnn_working/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "SparseR-CNN/projects/SparseRCNN/sparsercnn/detector.py", line 143, in forward
    loss_dict = self.criterion(output, targets)
  File "miniconda3/envs/srcnn_working/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "SparseR-CNN/projects/SparseRCNN/sparsercnn/loss.py", line 161, in forward
    losses.update(self.get_loss(loss, outputs, targets, indices, num_boxes))
  File "SparseR-CNN/projects/SparseRCNN/sparsercnn/loss.py", line 137, in get_loss
    return loss_map[loss](outputs, targets, indices, num_boxes, **kwargs)
  File "SparseR-CNN/projects/SparseRCNN/sparsercnn/loss.py", line 110, in loss_boxes
    src_boxes_ = src_boxes / image_size
RuntimeError: The size of tensor a (390) must match the size of tensor b (397) at non-singleton dimension 0

Dimensions of both src_boxes and image_size are:

image_size torch.Size([397, 4])
src_boxes torch.Size([390, 4])

I am not sure whether this is because something is hard-coded somewhere (configs, etc.). Could you please point me in the right direction?
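
The error itself comes from broadcasting: an element-wise division needs each pair of trailing dimensions to be equal or to contain a 1. The short check below reproduces that rule in plain Python for the two shapes printed above. `broadcastable` is a hypothetical helper written for this sketch and does not diagnose why the two tensors end up with different lengths.

```python
from itertools import zip_longest


def broadcastable(shape_a, shape_b):
    """Return True if two shapes can be broadcast together.

    Dimensions are compared from the last one backwards; a pair is
    compatible when the sizes are equal or either of them is 1.
    """
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            return False
    return True


print(broadcastable((390, 4), (397, 4)))  # shapes from the error message
print(broadcastable((390, 4), (390, 4)))
print(broadcastable((390, 4), (1, 4)))
```

Since src_boxes has 390 rows and image_size has 397, the matched predictions and the per-target image sizes disagree in count, so the place to look is where those two tensors are assembled.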

About a PyTorch version

Can I expect a PyTorch version that does not rely on Detectron2?

What is the "instance interaction"?

Hi, I do not understand what "instance interaction" in Table 4 refers to.
It is explained in only a few words. Could you describe it in more detail?

The AP is higher than 1. Why?

category	AP	category	AP	category	AP
xxxxx	1.891	yyyyy	0.838	zzzzz	0.634

Can you add a simple tutorial about custom dataset?

I followed the Detectron2 instructions to register my dataset, but I got the following error:

[12/25 15:34:51] detectron2.engine.train_loop ERROR: Exception during training:
Traceback (most recent call last):
File "/home/wxy/SparseR-CNN/detectron2/engine/train_loop.py", line 134, in train
self.run_step()
File "/home/wxy/SparseR-CNN/detectron2/engine/defaults.py", line 429, in run_step
self._trainer.run_step()
File "/home/wxy/SparseR-CNN/detectron2/engine/train_loop.py", line 228, in run_step
loss_dict = self.model(data)
File "/home/wxy/anaconda3/envs/yolov5/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/wxy/anaconda3/envs/yolov5/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/wxy/anaconda3/envs/yolov5/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/wxy/SparseR-CNN/projects/SparseRCNN/sparsercnn/detector.py", line 143, in forward
loss_dict = self.criterion(output, targets)
File "/home/wxy/anaconda3/envs/yolov5/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/wxy/SparseR-CNN/projects/SparseRCNN/sparsercnn/loss.py", line 159, in forward
losses.update(self.get_loss(loss, outputs, targets, indices, num_boxes))
File "/home/wxy/SparseR-CNN/projects/SparseRCNN/sparsercnn/loss.py", line 135, in get_loss
return loss_map[loss](outputs, targets, indices, num_boxes, **kwargs)
File "/home/wxy/SparseR-CNN/projects/SparseRCNN/sparsercnn/loss.py", line 108, in loss_boxes
src_boxes = src_boxes / image_size
RuntimeError: The size of tensor a (168) must match the size of tensor b (170) at non-singleton dimension 0

I want to know exactly which files and parameters need to be modified in your project. I am very interested in your paper. Please help if you are not too busy.

Training uses different code than demo.py

I trained my model in someone else's environment (which also has a compiled copy of Sparse R-CNN). When I run demo/demo.py, it uses the model code (such as detector.py) from that environment rather than the copy on my device.

Does anyone have any clues, or has anyone met this before?

inference phase:

train phase:

An error was encountered while the demo was running

I ran the Sparse R-CNN visualization demo:

python demo.py --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml --input 2.jpg --output test.jpg --confidence-threshold 0.4 --opts MODEL.WEIGHTS r50_100pro_3x_model.pth.qbl

I hit this bug:
D:\NANa\SparseR-CNN-main>python demo.py --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml --input 2.jpg --output test.jpg --confidence-threshold 0.4 --opts MODEL.WEIGHTS r50_100pro_3x_model.pth.qbl
[01/04 18:07:38 detectron2]: Arguments: Namespace(confidence_threshold=0.4, config_file='projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml', input=['2.jpg'], opts=['MODEL.WEIGHTS', 'r50_100pro_3x_model.pth.qbl'], output='test.jpg', video_input=None, webcam=False)
[01/04 18:07:40 fvcore.common.checkpoint]: Loading checkpoint from r50_100pro_3x_model.pth.qbl
Traceback (most recent call last):
File "demo.py", line 80, in <module>
demo = VisualizationDemo(cfg)
File "D:\NANa\SparseR-CNN-main\predictor.py", line 35, in __init__
self.predictor = DefaultPredictor(cfg)
File "D:\anaconda\envs\p171\lib\site-packages\detectron2\engine\defaults.py", line 194, in __init__
checkpointer.load(cfg.MODEL.WEIGHTS)
File "D:\anaconda\envs\p171\lib\site-packages\fvcore\common\checkpoint.py", line 122, in load
checkpoint = self._load_file(path)
File "D:\anaconda\envs\p171\lib\site-packages\detectron2\checkpoint\detection_checkpoint.py", line 54, in _load_file
loaded = super()._load_file(filename) # load native pth checkpoint
File "D:\anaconda\envs\p171\lib\site-packages\fvcore\common\checkpoint.py", line 219, in _load_file
return torch.load(f, map_location=torch.device("cpu"))
File "D:\anaconda\envs\p171\lib\site-packages\torch\serialization.py", line 587, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "D:\anaconda\envs\p171\lib\site-packages\torch\serialization.py", line 242, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at ..\caffe2\serialize\inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory

my environment
win10+cuda11+python 3.7+torch 1.7.1
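
The "failed finding central directory" message generally means that the file was recognised as a zip archive but the index at its end could not be found, which is what an interrupted download looks like. The .qbl suffix in the command hints at a partial file, though that is only a guess. A minimal standard-library check is sketched below; checkpoint_status is a hypothetical helper name, not part of PyTorch or this repository.

```python
import zipfile


def checkpoint_status(path):
    """Classify a checkpoint file as 'ok', 'truncated' or 'not-zip'."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic != b"PK\x03\x04":
        # Not a zip container: a legacy pickle checkpoint or some other file.
        return "not-zip"
    # is_zipfile looks for the end-of-central-directory record,
    # which is missing when the tail of the archive was cut off.
    return "ok" if zipfile.is_zipfile(path) else "truncated"
```

If the result is "truncated", downloading the weights again and comparing the file size against the published one is the usual next step.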

How can I change the code used in demo.py?

I have two versions of the code, the original SparseRCNN and SparseHuman. When I test with demo/demo.py, how can I make it run SparseHuman instead of the default SparseRCNN?

How do I train on PASCAL VOC2007 and VOC2012? Thanks.

❓ How to do something using detectron2

Describe what you want to do, including:

what inputs you will provide, if any:
what outputs you are expecting:

❓ What does an API do and how to use it?

Please link to which API or documentation you're asking about from
https://detectron2.readthedocs.io/

For meaning of a config, please see
https://detectron2.readthedocs.io/modules/config.html#config-references

NOTE:

Only general answers are provided.
If you want to ask about "why X did not work" for something you did, please use the
Unexpected behaviors issue template.
About how to implement new models / new dataloader / new training logic, etc., check documentation first.
We do not answer machine learning / computer vision questions that are not specific to detectron2, such as how a model works, how to improve your training/make it converge, or what algorithm/methods can be used to achieve X.

Training code

A question: what causes RuntimeError: [enforce fail at /pytorch/third_party/gloo/gloo/transport/tcp/device.cc:208] ifa != nullptr. Unable to find interface for: [10.16.32.68]? It occurs when running the training code from GitHub: python projects/SparseRCNN/train_net.py --num-gpus 4
--config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml
--eval-only MODEL.WEIGHTS path/to/model.pth

Error during multi-GPU training

Hello! I started training with the command: python projects/SparseRCNN/train_net.py --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml --num-gpus 4 --gpu "0, 1, 2，3". This is training on a single machine with 4 GPUs, but it fails with the following error:
RuntimeError: [enforce fail at /opt/conda/conda-bld/pytorch_1595629416375/work/third_party/gloo/gloo/transport/tcp/device.cc:208] ifa != nullptr. Unable to find interface for: [0.0.8.34]
How can I solve this?

How are negatives sampled, and how is the box loss computed for negative samples?

assert (boxes1[:, 2:] >= boxes1[:, :2]).all()

When I train on my own data, it reports the error assert (boxes1[:, 2:] >= boxes1[:, :2]).all(), raised from cost_giou = -generalized_box_iou(out_bbox, tgt_bbox). Here boxes1 is out_bbox. The same data trains fine in detectron2. Thanks.

How can I check the learned bboxes?

I tried to draw the learned boxes in the detector's forward pass during inference, before the first cascade stage starts, but the result looked weird: all boxes are located at the boundary of the whole image.
I used the official COCO res50_300_3x weights.
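
Since boxes1 here is the predicted out_bbox, the assertion can also fire when training has diverged, but with custom data it is worth ruling out bad annotations first. The sketch below is one way to scan ground-truth boxes before training, assuming they are available as (x1, y1, x2, y2) tuples; find_invalid_boxes is a hypothetical helper, not part of the repository.

```python
import math


def find_invalid_boxes(boxes):
    """Return indices of boxes that are non-finite or have x2 < x1 or y2 < y1."""
    bad = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        finite = all(math.isfinite(v) for v in (x1, y1, x2, y2))
        if not finite or x2 < x1 or y2 < y1:
            bad.append(i)
    return bad


boxes = [
    (10.0, 20.0, 50.0, 80.0),          # valid
    (60.0, 20.0, 50.0, 80.0),          # x2 < x1
    (10.0, 20.0, float("nan"), 80.0),  # NaN coordinate
]
invalid = find_invalid_boxes(boxes)
```

If the annotations pass this check, the problem is more likely on the prediction side.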

pos_cost_class and neg_cost_class

Hello, I came here from #8.

I'm a little confused.
Setting the experiments aside, it seems a similar effect could be achieved by using pos_cost alone.
How did you come up with the idea of adding the extra term when you found that the experiment didn't work?

Bugs when training

Instructions To Reproduce the 🐛 Bug:

Full runnable code or full changes you made:

python projects/SparseR-CNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseR-CNN/configs/sparsercnn.res50.100pro.3x.yaml

Full logs you observed:

Traceback (most recent call last):
  File "projects/SparseR-CNN/train_net.py", line 22, in <module>
    from detectron2.data import MetadataCatalog, build_detection_train_loader
  File "/home/amax/ngrok/opencv/SparseR-CNN/detectron2/data/__init__.py", line 4, in <module>
    from .build import (
  File "/home/amax/ngrok/opencv/SparseR-CNN/detectron2/data/build.py", line 12, in <module>
    from detectron2.structures import BoxMode
  File "/home/amax/ngrok/opencv/SparseR-CNN/detectron2/structures/__init__.py", line 7, in <module>
    from .masks import BitMasks, PolygonMasks, rasterize_polygons_within_box, polygons_to_bitmask
  File "/home/amax/ngrok/opencv/SparseR-CNN/detectron2/structures/masks.py", line 9, in <module>
    from detectron2.layers.roi_align import ROIAlign
  File "/home/amax/ngrok/opencv/SparseR-CNN/detectron2/layers/__init__.py", line 3, in <module>
    from .deform_conv import DeformConv, ModulatedDeformConv
  File "/home/amax/ngrok/opencv/SparseR-CNN/detectron2/layers/deform_conv.py", line 11, in <module>
    from detectron2 import _C
ImportError: /home/amax/ngrok/opencv/SparseR-CNN/detectron2/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceISt7complexIdEEEPKNS_6detail12TypeMetaDataEv

Environment:

Provide your environment information using the following command:

----------------------  ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
sys.platform            linux
Python                  3.6.10 |Anaconda, Inc.| (default, May  8 2020, 02:54:21) [GCC 7.3.0]
numpy                   1.19.1
detectron2              0.3 @/home/amax/ngrok/opencv/SparseR-CNN/detectron2
detectron2._C           not built correctly: /home/amax/ngrok/opencv/SparseR-CNN/detectron2/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceISt7complexIdEEEPKNS_6detail12TypeMetaDataEv
Compiler                c++ (Ubuntu 6.5.0-2ubuntu1~16.04) 6.5.0 20181026
CUDA compiler           Cuda compilation tools, release 10.2, V10.2.89
DETECTRON2_ENV_MODULE   <not set>
PyTorch                 1.7.0 @/home/amax/anaconda3/envs/pcdet/lib/python3.6/site-packages/torch
PyTorch debug build     True
GPU available           True
GPU 0,1                 GeForce RTX 2080 Ti (arch=7.5)
CUDA_HOME               /usr/local/cuda-10.2
Pillow                  7.2.0
torchvision             0.8.1 @/home/amax/anaconda3/envs/pcdet/lib/python3.6/site-packages/torchvision
torchvision arch flags  3.5, 5.0, 6.0, 7.0, 7.5
fvcore                  0.1.2.post20201122
cv2                     Not found
----------------------  ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

Self-attention vs. MultiHeadAttention

I am confused about the difference between Self-attention in table 4 and multi-head-attention in table 8.
Because in this line, you assign multi-head-attention to self_attn.

Can't install with setup.py

$: python setup.py build develop
Traceback (most recent call last):
File "setup.py", line 12, in <module>
from torch.utils.hipify import hipify_python
ModuleNotFoundError: No module named 'torch.utils.hipify'
My PyTorch version is 1.5. So what should I do?

Is the cls loss of background proposals included in the total cls loss?

I don't see any clue in the paper about how the background cls loss is calculated, and I also checked loss.py and didn't find it. However, in DETR the author counts 0.1 * background cls loss in the total cls loss. I am curious about it.

Negative coordinates

out_coord has negative values, which makes the L1 and GIoU losses very large. Is this common? Did I miss some operation on out_coord, like the sigmoid in DETR?
I looked into the code; the negative values come from apply_deltas:
pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w # x1
pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h # y1
which results in negative values.
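
A worked example of the two quoted lines, under the assumption that apply_deltas follows the usual center/size parameterisation (the delta scaling weights used in the repository are omitted, and the numbers are made up). Nothing in the formula bounds x1 below by zero, so a center shifted toward the image edge yields a negative corner.

```python
import math


def apply_deltas_1d(x1, x2, dx, dw):
    """Decode one box along x from (dx, dw) deltas; simplified and unweighted."""
    width = x2 - x1
    ctr_x = x1 + 0.5 * width
    pred_ctr_x = dx * width + ctr_x   # shift the center by a fraction of the width
    pred_w = math.exp(dw) * width     # rescale the width
    return pred_ctr_x - 0.5 * pred_w, pred_ctr_x + 0.5 * pred_w


# Proposal spans [0, 20]; dx = -1 shifts the center one full width to the left.
new_x1, new_x2 = apply_deltas_1d(0.0, 20.0, dx=-1.0, dw=0.0)
```

Here the decoded box spans from -20 to 0, entirely outside the image. Whether such values should be clamped before the loss is a question for the maintainers.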

Training SparseRCNN in the RRPN style

Can SparseRCNN be used to train a model that predicts rotated boxes (with rotation angles)?

how to resume training

Hi, when I try to resume training:
--resume output/
AssertionError: Override list has odd length: ['output/']; it must be a list of pairs.
So, how to resume training?
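
The assertion text suggests that output/ was parsed as a config override rather than as a value for --resume. The sketch below reproduces that with argparse alone, assuming (as in Detectron2's default argument parser) that --resume is a boolean flag and that trailing arguments are collected as KEY VALUE pairs. build_parser and pair_overrides are illustrative names.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--config-file", default="")
    parser.add_argument("--resume", action="store_true")  # takes no value
    parser.add_argument("opts", nargs=argparse.REMAINDER, default=None)
    return parser


def pair_overrides(opts):
    """Turn [KEY, VALUE, KEY, VALUE, ...] into a dict; odd lengths are rejected."""
    if len(opts) % 2 != 0:
        raise AssertionError(
            "Override list has odd length: {}; it must be a list of pairs".format(opts)
        )
    return dict(zip(opts[0::2], opts[1::2]))


parser = build_parser()

# "--resume output/": the flag is set and "output/" lands in opts on its own.
bad = parser.parse_args(["--resume", "output/"])

# "--resume OUTPUT_DIR output/": the directory is passed as a KEY VALUE override.
good = parser.parse_args(["--resume", "OUTPUT_DIR", "output/"])
overrides = pair_overrides(good.opts)
```

Under these assumptions, passing --resume on its own and setting the output directory through an OUTPUT_DIR override (or in the YAML file) avoids the odd-length list.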

Please read & provide the following

How are refinement and de-duplication performed?

I noticed that the paper mentions an iterative training scheme that feeds in the previously predicted bbox and feature. Could you help identify which part of the code corresponds to this? And may I ask how de-duplication is achieved in this process, since the paper claims to be free of NMS?

AssertionError

Cannot reproduce the results for Res50

Hi
When I run the code with the yaml files sparsercnn.res50.100pro.3x.yaml and sparsercnn.res50.300pro.3x.yaml, I only get mAP 40.8 and 42.5 respectively, both about 2 points worse than the reported results. I use 8 GPUs with batch size 16. I think the learning hyperparameters and initialization are the same as yours. Could you give me some hints about what the problem might be?

Thanks

Why does the "cost_class" include two items, i.e. pos_cost_class and neg_cost_class?

SparseR-CNN/projects/SparseR-CNN/sparsercnn/loss.py

Line 251 in daf8630

cost_class = pos_cost_class[:, tgt_ids] - neg_cost_class[:, tgt_ids]

Why does cost_class include two terms, pos_cost_class and neg_cost_class? In contrast, if self.use_focal == False, cost_class = -out_prob[:, tgt_ids], which has only one term, corresponding to pos_cost_class.
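
One way to read the quoted line, offered as an interpretation rather than the authors' explanation: with a sigmoid focal loss every class is a separate binary problem, so an unmatched prediction already pays the negative-class term for class c. Matching it to a target of class c swaps that term for the positive-class term, and the matching cost is the difference between the two. The scalar sketch below assumes the common focal settings alpha = 0.25 and gamma = 2.0; the function name is illustrative.

```python
import math

ALPHA, GAMMA, EPS = 0.25, 2.0, 1e-8  # assumed focal-loss settings


def focal_class_cost(prob):
    """Matching cost for a prediction whose probability for the target class is `prob`."""
    pos_cost = ALPHA * (1.0 - prob) ** GAMMA * -math.log(prob + EPS)
    neg_cost = (1.0 - ALPHA) * prob ** GAMMA * -math.log(1.0 - prob + EPS)
    return pos_cost - neg_cost


confident = focal_class_cost(0.9)  # prediction strongly favours the target class
doubtful = focal_class_cost(0.1)   # prediction mostly rejects the target class
```

A confident prediction gets a negative cost and a doubtful one a positive cost, so the matcher prefers to assign each target to the prediction that already scores its class highly.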

result not as good as expected?

thresh set as default....

/data/SparseR-CNN/detectron2/_C.cpython-36m-x86_64-linux-gnu.so does not contain device code

Loading checkpoint from detectron2://ImageNetPretrained/torchvision/R-50.pkl is very slow.

nan in boxes when training

I have set up the environment according to the repository instructions and am training a model on my own custom dataset.
Training always hits the following error:

File "SparseR-CNN/projects/SparseRCNN/sparsercnn/util/box_ops.py", line 51, in generalized_box_iou
    assert (boxes1[:, 2:] >= boxes1[:, :2]).all(), boxes1
AssertionError: tensor([[nan, nan, nan, nan],
        [nan, nan, nan, nan],
        [nan, nan, nan, nan],
        ...,
        [nan, nan, nan, nan],
        [nan, nan, nan, nan],
        [nan, nan, nan, nan]], device='cuda:0', requires_grad=True)

As you can see, I print out the boxes when encountering the error and it has some nan values.

I have tried to decrease learning rate from 0.01 to 0.005 and even 0.0025, but the error is still there.

So could you help me to fix it? Thank you.

About custom datasets

Do you have any suggestions on how to train on a custom dataset? Does Sparse R-CNN need at least ~15k images to converge, like DETR? Will Sparse R-CNN work on a small dataset?
