
cbnetv2's People

Contributors

aemikachow, aronlin, chrisfsj2051, daavoo, erotemic, hellock, hhaandroid, impiga, innerlee, johnson-wang, jshilong, melikovk, mxbonn, myownskyw7, oceanpang, rangilyu, runningleon, ryanxli, shinya7y, thangvubk, tianyuandu, tingtingliangvs, v-qjqs, wangruohui, wswday, xvjiarui, yeliudev, yhcao6, yuzhj, zwwwayne


cbnetv2's Issues

model loading issues

RuntimeError: [enforce fail at inline_container.cc:108] . file in archive is not in a subdirectory: htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth
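A hedged diagnostic for this error (my reading, not a confirmed fix): the downloaded file may be a plain zip that wraps the real .pth, which makes torch.load fail at inline_container.cc. Listing the archive contents shows whether there is an inner .pth to extract and load instead:

    import zipfile

    ckpt = 'htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth'

    # Checkpoints saved with the newer PyTorch serialization are zip archives
    # too, so inspect the entries: a top-level *.pth entry suggests the download
    # is a wrapper zip whose inner file should be extracted and loaded instead.
    if zipfile.is_zipfile(ckpt):
        with zipfile.ZipFile(ckpt) as zf:
            print(zf.namelist())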

roi_head question

Hi, thank you very much for your great work.
I have a question: htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_adamw_20e.py inherits from htc_without_semantic_swin_fpn.py, which, as the name suggests, does not use semantic segmentation. If I comment out the semantic_roi_extractor and semantic_head in roi_head, does it affect the final accuracy?
My dataset has mask annotation info, but I didn't convert the masks to .pngs; are they necessary?
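Not an authoritative answer, but a sketch of the kind of override that drops the semantic branch; whether accuracy changes is exactly the open question here. In mmcv-style configs, setting the keys to None in a child config (file placement hypothetical) disables them:

    # hypothetical child config inheriting the HTC Swin config discussed above
    _base_ = './htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_adamw_20e_coco.py'

    model = dict(
        roi_head=dict(
            semantic_roi_extractor=None,  # drop the auxiliary semantic branch
            semantic_head=None))
    # the data pipeline must then stop producing 'gt_semantic_seg' as well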

how to train DB-Swin-L?

If I want to train DB-Swin-L, do I just need to change the schedule to 20 epochs, like for DB-Swin-B?
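For reference, a sketch of the standard mmdetection 20-epoch schedule override (assuming the usual 20e step points, which is what the DB-Swin-B configs appear to use):

    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=500,
        warmup_ratio=0.001,
        step=[16, 19])
    runner = dict(type='EpochBasedRunner', max_epochs=20)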

Errors while converting model to onnx

Hi,

I am trying to convert the pretrained model into ONNX format and failing to convert it. Please find below the errors.

Exception has occurred: AttributeError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
'HybridTaskCascadeRoIHead' object has no attribute 'onnx_export'
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1131, in getattr
type(self).name, name))
File "/CBNetV2/mmdet/models/detectors/two_stage.py", line 201, in onnx_export
return self.roi_head.onnx_export(x, proposals, img_metas)
File "/CBNetV2/mmdet/models/detectors/base.py", line 168, in forward
return self.onnx_export(img[0], img_metas[0])
File "/envs/cbnet_v2/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
return old_func(*args, **kwargs)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1039, in _slow_forward
result = self.forward(*input, **kwargs)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/jit/_trace.py", line 118, in wrapper
outs.append(self.inner(*trace_inputs))
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/jit/_trace.py", line 132, in forward
self._force_outplace,
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/jit/_trace.py", line 1160, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/utils.py", line 373, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/utils.py", line 422, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/utils.py", line 459, in _model_to_graph
_retain_param_name)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/utils.py", line 695, in _export
dynamic_axes=dynamic_axes)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/utils.py", line 94, in export
use_external_data_format=use_external_data_format)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/init.py", line 280, in export
custom_opsets, enable_onnx_checker, use_external_data_format)
File "/CBNetV2/tools/deployment/pytorch2onnx.py", line 78, in pytorch2onnx
dynamic_axes=dynamic_axes)
File "/CBNetV2/tools/deployment/pytorch2onnx.py", line 312, in
dynamic_export=args.dynamic_export)
File "/envs/cbnet_v2/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/envs/cbnet_v2/lib/python3.7/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/envs/cbnet_v2/lib/python3.7/runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "/envs/cbnet_v2/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/envs/cbnet_v2/lib/python3.7/runpy.py", line 193, in _run_module_as_main (Current frame)
"main", mod_spec)

FileNotFoundError: [Errno 2] No such file or directory: 'data/coco/stuffthingmaps/train2017/xyz.png'

I have searched related issues but cannot get the expected help.

I want to train on my custom dataset for instance segmentation using Improved HTC with DB-Swin-L as the backbone, but I am facing the above error. Since it is an instance segmentation dataset, I don't have stuffthingmaps. Kindly advise how I should go about it.
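One hedged way around this (a sketch, not the authors' recommendation): stop loading semantic maps in the training pipeline and remove the semantic branch from the model, since stuffthingmaps only feed HTC's auxiliary semantic head:

    # sketch: instance-only training pipeline, no gt_semantic_seg produced
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True, with_mask=True,
             with_seg=False),
        dict(type='Resize', img_scale=[(1600, 400), (1600, 1400)],
             multiscale_mode='range', keep_ratio=True),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(type='Normalize', mean=[123.675, 116.28, 103.53],
             std=[58.395, 57.12, 57.375], to_rgb=True),
        dict(type='Pad', size_divisor=32),
        # 'SegRescale' removed together with the semantic targets
        dict(type='DefaultFormatBundle'),
        dict(type='Collect',
             keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
    ]

The semantic_roi_extractor and semantic_head would also need to be dropped from the model (see the roi_head question above).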

I get the following upon training on Google colab:

2021-08-03 18:28:25,774 - mmdet - INFO - Environment info:

sys.platform: linux
Python: 3.7.11 (default, Jul 3 2021, 18:01:19) [GCC 7.5.0]
CUDA available: True
GPU 0: Tesla T4
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.0_bu.TC445_37.28845127_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0+cu102
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
  • CuDNN 7.6.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.0+cu102
OpenCV: 4.1.2
MMCV: 1.3.9
MMCV Compiler: GCC 7.5
MMCV CUDA Compiler: 11.0
MMDetection: 2.14.0+900f7bd

2021-08-03 18:28:26,338 - mmdet - INFO - Distributed training: False
2021-08-03 18:28:26,893 - mmdet - INFO - Config:
model = dict(
type='HybridTaskCascade',
pretrained=None,
backbone=dict(
type='CBSwinTransformer',
embed_dim=192,
depths=[2, 2, 18, 2],
num_heads=[6, 12, 24, 48],
window_size=7,
mlp_ratio=4.0,
qkv_bias=True,
qk_scale=None,
drop_rate=0.0,
attn_drop_rate=0.0,
drop_path_rate=0.2,
ape=False,
patch_norm=True,
out_indices=(0, 1, 2, 3),
use_checkpoint=False),
neck=dict(
type='CBFPN',
in_channels=[192, 384, 768, 1536],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[8],
ratios=[0.5, 1.0, 2.0],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(
type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
roi_head=dict(
type='HybridTaskCascadeRoIHead',
interleaved=True,
mask_info_flow=True,
num_stages=3,
stage_loss_weights=[1, 0.5, 0.25],
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=[
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=3,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
loss_weight=1.0)),
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=3,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.05, 0.05, 0.1, 0.1]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
loss_weight=1.0)),
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=3,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.033, 0.033, 0.067, 0.067]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
],
mask_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
mask_head=[
dict(
type='HTCMaskHead',
with_conv_res=False,
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=3,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
dict(
type='HTCMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=3,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
dict(
type='HTCMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=3,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))
],
semantic_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[8]),
semantic_head=dict(
type='FusedSemanticHead',
num_ins=5,
fusion_level=1,
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=183,
ignore_label=255,
loss_weight=0.2)),
train_cfg=dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=0,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_pre=2000,
max_per_img=2000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=[
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.6,
neg_iou_thr=0.6,
min_pos_iou=0.6,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.7,
min_pos_iou=0.7,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False)
]),
test_cfg=dict(
rpn=dict(
nms_pre=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
score_thr=0.001,
nms=dict(type='soft_nms', iou_threshold=0.5),
max_per_img=100,
mask_thr_binary=0.5)))
dataset_type = 'COCODataset'
data_root = 'data/coco/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
dict(
type='Resize',
img_scale=[(1600, 400), (1600, 1400)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='SegRescale', scale_factor=0.125),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_semantic_seg'])
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1600, 1400),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]
data = dict(
samples_per_gpu=1,
workers_per_gpu=1,
train=dict(
type='CocoDataset',
ann_file='data/trainval.json',
img_prefix='data/images/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='LoadAnnotations',
with_bbox=True,
with_mask=True,
with_seg=True),
dict(
type='Resize',
img_scale=[(1600, 400), (1600, 1400)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='SegRescale', scale_factor=0.125),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=[
'img', 'gt_bboxes', 'gt_labels', 'gt_masks',
'gt_semantic_seg'
])
],
seg_prefix='data/coco/stuffthingmaps/train2017/',
classes=('date', 'fig', 'hazelnut')),
val=dict(
type='CocoDataset',
ann_file='data/trainval.json',
img_prefix='data/images/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1600, 1400),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
classes=('date', 'fig', 'hazelnut')),
test=dict(
type='CocoDataset',
ann_file='data/trainval.json',
img_prefix='data/images/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1600, 1400),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
classes=('date', 'fig', 'hazelnut')))
evaluation = dict(metric=['bbox', 'segm'])
optimizer = dict(
type='AdamW',
lr=5e-05,
betas=(0.9, 0.999),
weight_decay=0.05,
paramwise_cfg=dict(
custom_keys=dict(
absolute_pos_embed=dict(decay_mult=0.0),
relative_position_bias_table=dict(decay_mult=0.0),
norm=dict(decay_mult=0.0))))
optimizer_config = dict(grad_clip=None)
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001,
step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = 'htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth'
resume_from = None
workflow = [('train', 1)]
samples_per_gpu = 1
classes = ('date', 'fig', 'hazelnut')
work_dir = './work_dirs/nuts'
gpu_ids = range(0, 1)

/content/CBNetV2/mmdet/core/anchor/builder.py:16: UserWarning: build_anchor_generator would be deprecated soon, please use build_prior_generator
'build_anchor_generator would be deprecated soon, please use '
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2021-08-03 18:28:38,797 - mmdet - INFO - load checkpoint from htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth
2021-08-03 18:28:38,798 - mmdet - INFO - Use load_from_local loader
2021-08-03 18:29:29,361 - mmdet - WARNING - The model and loaded state dict do not match exactly

size mismatch for roi_head.bbox_head.0.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([4, 1024]).
size mismatch for roi_head.bbox_head.0.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for roi_head.bbox_head.1.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([4, 1024]).
size mismatch for roi_head.bbox_head.1.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for roi_head.bbox_head.2.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([4, 1024]).
size mismatch for roi_head.bbox_head.2.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for roi_head.mask_head.0.conv_logits.weight: copying a param with shape torch.Size([80, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
size mismatch for roi_head.mask_head.0.conv_logits.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for roi_head.mask_head.1.conv_logits.weight: copying a param with shape torch.Size([80, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
size mismatch for roi_head.mask_head.1.conv_logits.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for roi_head.mask_head.2.conv_logits.weight: copying a param with shape torch.Size([80, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
size mismatch for roi_head.mask_head.2.conv_logits.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([3]).
unexpected key in source state_dict: roi_head.bbox_head.0.shared_convs.0.conv.weight, roi_head.bbox_head.0.shared_convs.0.bn.weight, roi_head.bbox_head.0.shared_convs.0.bn.bias, roi_head.bbox_head.0.shared_convs.0.bn.running_mean, roi_head.bbox_head.0.shared_convs.0.bn.running_var, roi_head.bbox_head.0.shared_convs.0.bn.num_batches_tracked, roi_head.bbox_head.0.shared_convs.1.conv.weight, roi_head.bbox_head.0.shared_convs.1.bn.weight, roi_head.bbox_head.0.shared_convs.1.bn.bias, roi_head.bbox_head.0.shared_convs.1.bn.running_mean, roi_head.bbox_head.0.shared_convs.1.bn.running_var, roi_head.bbox_head.0.shared_convs.1.bn.num_batches_tracked, roi_head.bbox_head.0.shared_convs.2.conv.weight, roi_head.bbox_head.0.shared_convs.2.bn.weight, roi_head.bbox_head.0.shared_convs.2.bn.bias, roi_head.bbox_head.0.shared_convs.2.bn.running_mean, roi_head.bbox_head.0.shared_convs.2.bn.running_var, roi_head.bbox_head.0.shared_convs.2.bn.num_batches_tracked, roi_head.bbox_head.0.shared_convs.3.conv.weight, roi_head.bbox_head.0.shared_convs.3.bn.weight, roi_head.bbox_head.0.shared_convs.3.bn.bias, roi_head.bbox_head.0.shared_convs.3.bn.running_mean, roi_head.bbox_head.0.shared_convs.3.bn.running_var, roi_head.bbox_head.0.shared_convs.3.bn.num_batches_tracked, roi_head.bbox_head.1.shared_convs.0.conv.weight, roi_head.bbox_head.1.shared_convs.0.bn.weight, roi_head.bbox_head.1.shared_convs.0.bn.bias, roi_head.bbox_head.1.shared_convs.0.bn.running_mean, roi_head.bbox_head.1.shared_convs.0.bn.running_var, roi_head.bbox_head.1.shared_convs.0.bn.num_batches_tracked, roi_head.bbox_head.1.shared_convs.1.conv.weight, roi_head.bbox_head.1.shared_convs.1.bn.weight, roi_head.bbox_head.1.shared_convs.1.bn.bias, roi_head.bbox_head.1.shared_convs.1.bn.running_mean, roi_head.bbox_head.1.shared_convs.1.bn.running_var, roi_head.bbox_head.1.shared_convs.1.bn.num_batches_tracked, roi_head.bbox_head.1.shared_convs.2.conv.weight, roi_head.bbox_head.1.shared_convs.2.bn.weight, roi_head.bbox_head.1.shared_convs.2.bn.bias, roi_head.bbox_head.1.shared_convs.2.bn.running_mean, roi_head.bbox_head.1.shared_convs.2.bn.running_var, roi_head.bbox_head.1.shared_convs.2.bn.num_batches_tracked, roi_head.bbox_head.1.shared_convs.3.conv.weight, roi_head.bbox_head.1.shared_convs.3.bn.weight, roi_head.bbox_head.1.shared_convs.3.bn.bias, roi_head.bbox_head.1.shared_convs.3.bn.running_mean, roi_head.bbox_head.1.shared_convs.3.bn.running_var, roi_head.bbox_head.1.shared_convs.3.bn.num_batches_tracked, roi_head.bbox_head.2.shared_convs.0.conv.weight, roi_head.bbox_head.2.shared_convs.0.bn.weight, roi_head.bbox_head.2.shared_convs.0.bn.bias, roi_head.bbox_head.2.shared_convs.0.bn.running_mean, roi_head.bbox_head.2.shared_convs.0.bn.running_var, roi_head.bbox_head.2.shared_convs.0.bn.num_batches_tracked, roi_head.bbox_head.2.shared_convs.1.conv.weight, roi_head.bbox_head.2.shared_convs.1.bn.weight, roi_head.bbox_head.2.shared_convs.1.bn.bias, roi_head.bbox_head.2.shared_convs.1.bn.running_mean, roi_head.bbox_head.2.shared_convs.1.bn.running_var, roi_head.bbox_head.2.shared_convs.1.bn.num_batches_tracked, roi_head.bbox_head.2.shared_convs.2.conv.weight, roi_head.bbox_head.2.shared_convs.2.bn.weight, roi_head.bbox_head.2.shared_convs.2.bn.bias, roi_head.bbox_head.2.shared_convs.2.bn.running_mean, roi_head.bbox_head.2.shared_convs.2.bn.running_var, roi_head.bbox_head.2.shared_convs.2.bn.num_batches_tracked, roi_head.bbox_head.2.shared_convs.3.conv.weight, roi_head.bbox_head.2.shared_convs.3.bn.weight, roi_head.bbox_head.2.shared_convs.3.bn.bias, 
roi_head.bbox_head.2.shared_convs.3.bn.running_mean, roi_head.bbox_head.2.shared_convs.3.bn.running_var, roi_head.bbox_head.2.shared_convs.3.bn.num_batches_tracked

missing keys in source state_dict: roi_head.bbox_head.0.shared_fcs.1.weight, roi_head.bbox_head.0.shared_fcs.1.bias, roi_head.bbox_head.1.shared_fcs.1.weight, roi_head.bbox_head.1.shared_fcs.1.bias, roi_head.bbox_head.2.shared_fcs.1.weight, roi_head.bbox_head.2.shared_fcs.1.bias

2021-08-03 18:29:29,409 - mmdet - INFO - Start running, host: root@d8f8e57ec13b, work_dir: /content/CBNetV2/work_dirs/nuts
2021-08-03 18:29:29,409 - mmdet - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) CheckpointHook
(NORMAL ) EvalHook
(VERY_LOW ) TextLoggerHook

before_train_epoch:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) EvalHook
(NORMAL ) NumClassCheckHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

before_train_iter:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) EvalHook
(LOW ) IterTimerHook

after_train_iter:
(ABOVE_NORMAL) OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) EvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

after_train_epoch:
(NORMAL ) CheckpointHook
(NORMAL ) EvalHook
(VERY_LOW ) TextLoggerHook

before_val_epoch:
(NORMAL ) NumClassCheckHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

before_val_iter:
(LOW ) IterTimerHook

after_val_iter:
(LOW ) IterTimerHook

after_val_epoch:
(VERY_LOW ) TextLoggerHook

2021-08-03 18:29:29,409 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
Traceback (most recent call last):
File "tools/train.py", line 188, in
main()
File "tools/train.py", line 184, in main
meta=meta)
File "/content/CBNetV2/mmdet/apis/train.py", line 185, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
for i, data_batch in enumerate(self.data_loader):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 521, in next
data = self._next_data()
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
data.reraise()
File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 425, in reraise
raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/content/CBNetV2/mmdet/datasets/custom.py", line 194, in getitem
data = self.prepare_train_img(idx)
File "/content/CBNetV2/mmdet/datasets/custom.py", line 217, in prepare_train_img
return self.pipeline(results)
File "/content/CBNetV2/mmdet/datasets/pipelines/compose.py", line 40, in call
data = t(data)
File "/content/CBNetV2/mmdet/datasets/pipelines/loading.py", line 373, in call
results = self._load_semantic_seg(results)
File "/content/CBNetV2/mmdet/datasets/pipelines/loading.py", line 347, in _load_semantic_seg
img_bytes = self.file_client.get(filename)
File "/usr/local/lib/python3.7/dist-packages/mmcv/fileio/file_client.py", line 306, in get
return self.client.get(filepath)
File "/usr/local/lib/python3.7/dist-packages/mmcv/fileio/file_client.py", line 184, in get
with open(filepath, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'data/coco/stuffthingmaps/train2017/11.png'

My config file:
_base_ = '../cbnet/htc_cbv2_swin_large_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.py'

model = dict(
    roi_head=dict(
        bbox_head=[
            dict(
                type='ConvFCBBoxHead',
                num_shared_convs=4,
                num_shared_fcs=1,
                in_channels=256,
                conv_out_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.1, 0.1, 0.2, 0.2]),
                reg_class_agnostic=True,
                reg_decoded_bbox=True,
                norm_cfg=dict(type='SyncBN', requires_grad=True),
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                loss_bbox=dict(type='GIoULoss', loss_weight=10.0)),
            dict(
                type='ConvFCBBoxHead',
                num_shared_convs=4,
                num_shared_fcs=1,
                in_channels=256,
                conv_out_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.05, 0.05, 0.1, 0.1]),
                reg_class_agnostic=True,
                reg_decoded_bbox=True,
                norm_cfg=dict(type='SyncBN', requires_grad=True),
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                loss_bbox=dict(type='GIoULoss', loss_weight=10.0)),
            dict(
                type='ConvFCBBoxHead',
                num_shared_convs=4,
                num_shared_fcs=1,
                in_channels=256,
                conv_out_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.033, 0.033, 0.067, 0.067]),
                reg_class_agnostic=True,
                reg_decoded_bbox=True,
                norm_cfg=dict(type='SyncBN', requires_grad=True),
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                loss_bbox=dict(type='GIoULoss', loss_weight=10.0))
        ]))

model = dict(
    type='HybridTaskCascade',
    pretrained=None,
    roi_head=dict(
        type='HybridTaskCascadeRoIHead',
        interleaved=True,
        mask_info_flow=True,
        num_stages=3,
        stage_loss_weights=[1, 0.5, 0.25],
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=[
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.1, 0.1, 0.2, 0.2]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.05, 0.05, 0.1, 0.1]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.033, 0.033, 0.067, 0.067]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
        ],
        mask_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        mask_head=[
            dict(
                type='HTCMaskHead',
                with_conv_res=False,
                num_convs=4,
                in_channels=256,
                conv_out_channels=256,
                num_classes=3,
                loss_mask=dict(
                    type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
            dict(
                type='HTCMaskHead',
                num_convs=4,
                in_channels=256,
                conv_out_channels=256,
                num_classes=3,
                loss_mask=dict(
                    type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
            dict(
                type='HTCMaskHead',
                num_convs=4,
                in_channels=256,
                conv_out_channels=256,
                num_classes=3,
                loss_mask=dict(
                    type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))
        ]))

dataset_type = 'COCODataset'
classes = ('date', 'fig', 'hazelnut')
data = dict(
    train=dict(
        img_prefix='data/images/',
        classes=classes,
        ann_file='data/trainval.json'),
    val=dict(
        img_prefix='data/images/',
        classes=classes,
        ann_file='data/trainval.json'),
    test=dict(
        img_prefix='data/images/',
        classes=classes,
        ann_file='data/trainval.json'))

load_from = 'htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth'

Single GPU training

Hi, thanks for sharing your model. Is it possible to train this model on a custom dataset with a single GPU? Whenever I try to do that, I get this error (I'm using the tools/train.py script):

Traceback (most recent call last):
  File "CBNetV2/tools/train.py", line 188, in <module>
    main()
  File "CBNetV2/tools/train.py", line 184, in main
    meta=meta)
  File "/content/CBNetV2/mmdet/apis/train.py", line 185, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/content/CBNetV2/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/fp16_utils.py", line 128, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/content/CBNetV2/mmdet/models/detectors/base.py", line 171, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/content/CBNetV2/mmdet/models/detectors/two_stage.py", line 266, in forward_train
    **kwargs)
  File "/content/CBNetV2/mmdet/models/roi_heads/cascade_roi_head.py", line 248, in forward_train
    rcnn_train_cfg)
  File "/content/CBNetV2/mmdet/models/roi_heads/cascade_roi_head.py", line 146, in _bbox_forward_train
    bbox_results = self._bbox_forward(stage, x, rois)
  File "/content/CBNetV2/mmdet/models/roi_heads/cascade_roi_head.py", line 136, in _bbox_forward
    cls_score, bbox_pred = bbox_head(bbox_feats)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/CBNetV2/mmdet/models/roi_heads/bbox_heads/convfc_bbox_head.py", line 155, in forward
    x = conv(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/cnn/bricks/conv_module.py", line 201, in forward
    x = self.norm(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/batchnorm.py", line 731, in forward
    world_size = torch.distributed.get_world_size(process_group)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 748, in get_world_size
    return _get_group_size(group)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 274, in _get_group_size
    default_pg = _get_default_group()
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 358, in _get_default_group
    raise RuntimeError("Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
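A hedged note on the traceback above: the failure comes from the SyncBN layers (norm_cfg=dict(type='SyncBN', ...) in the ConvFCBBoxHead heads), which call torch.distributed. Two common workarounds, neither confirmed by the authors here: launch even a single GPU through the distributed script, e.g. tools/dist_train.sh <CONFIG_FILE> 1, or switch the heads to plain BN in the config:

    # Sketch for non-distributed runs: restate norm_cfg as plain BN in every
    # ConvFCBBoxHead entry of model.roi_head.bbox_head (the whole list has to
    # be restated, since list-valued config keys are replaced wholesale on
    # config merge, not merged element-wise).
    norm_cfg = dict(type='BN', requires_grad=True)  # instead of 'SyncBN'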

How to disable semantic mask since I only want to do object detection?

When I simply run "python tools/train.py configs/cbnet/htc_cbv2_swin_large_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.py", I get the error:

Traceback (most recent call last):
  File "tools/train.py", line 188, in <module>
    main()
  File "tools/train.py", line 184, in main
    meta=meta)
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/apis/train.py", line 185, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 128, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/models/detectors/base.py", line 171, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/models/detectors/two_stage.py", line 266, in forward_train
    **kwargs)
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/models/roi_heads/htc_roi_head.py", line 245, in forward_train
    loss_seg = self.semantic_head.loss(semantic_pred, gt_semantic_seg)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/models/roi_heads/mask_heads/fused_semantic_head.py", line 103, in loss
    labels = labels.squeeze(1).long()
AttributeError: 'NoneType' object has no attribute 'squeeze'

It seems to be caused by the semantic mask. How can I fix it?

About the DB-Swin-B model: I tried to reimplement it and got 55.2 mAP

I re-trained the DB-Swin-B model on the COCO dataset. The config file I use is configs/cbnet/htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_20e_coco.py, and the pretrained model I use is the Swin-B model trained on ImageNet-22K with 384 input size. But the mAP I finally get is 55.2, lower than the result you provided in your readme.md (58.4).
What's wrong with my reimplementation?

Slow inference speed with default config

Hi, there.

I have tested the configs/cbnet/htc_cbv2_swin_large_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.py config with weights/htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth on my T4 GPU, and inference takes much longer than I expected: 13600 s (about 3.8 hours) for a single pass over the COCO minival set. Is this normal?

I also got 58.7 AP, lower than the reported one (59.1 AP).

CBNet Paper Layers composition

Hi! I was reading the paper and did not understand one part of the network.
Which operation is performed between the layers of the second backbone?
Let me explain better: does the + inside the circle denote an element-wise sum? And how are these operations performed?
(model figure from the paper)

Thank you.
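For what it's worth, here is my reading of the figure sketched in code (the channel sizes and the 1x1 projection are illustrative, not the authors' exact implementation): the circled + is an element-wise sum; the assisting backbone's feature is projected to the right channel count, resized to the lead stage's resolution, and added to that stage's input.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    proj = nn.Conv2d(512, 256, kernel_size=1)  # channel match (illustrative sizes)

    def composite(lead_stage_in, assist_feat):
        # project, resize, then element-wise sum with the lead backbone's input
        a = proj(assist_feat)
        a = F.interpolate(a, size=lead_stage_in.shape[2:], mode='nearest')
        return lead_stage_in + a

    out = composite(torch.randn(1, 256, 64, 64), torch.randn(1, 512, 32, 32))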

Segmentation for HTC

Thanks for your work.

I was wondering whether there are some bugs in your configs.

  1. You are using semantic_roi_extractor and semantic_head in your config for HTC.
    https://github.com/VDIGPKU/CBNetV2/blob/main/configs/cbnet/htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_adamw_20e_coco.py#L24
  2. And you've used the original '../_base_/datasets/coco_instance.py', which doesn't provide the path of the segmentation PNGs.
    https://github.com/VDIGPKU/CBNetV2/blob/main/configs/_base_/datasets/coco_instance.py#L37

according to this issue: open-mmlab/mmdetection#3767
with_seg should be True. But once you set it to True, you must provide a segmentation PNG path for the training pipeline,
so how could you have started the training?

Possible bug in mmdet modification?

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug
In mmdet/models/detectors/two_stage.py line 244, there is this snippet of code -

        if self.with_rpn:
            proposal_cfg = self.train_cfg.get('rpn_proposal',
                                              self.test_cfg.rpn)
            for i,x in enumerate(xs):
                rpn_losses, proposal_list = self.rpn_head.forward_train(
                    x,
                    img_metas,
                    gt_bboxes,
                    gt_labels=None,
                    gt_bboxes_ignore=gt_bboxes_ignore,
                    proposal_cfg=proposal_cfg)
                if len(xs) > 1:
                    rpn_losses = upd_loss(rpn_losses, idx=i, weight=loss_weights[i])
                losses.update(rpn_losses)
        else:
            proposal_list = proposals

        for i,x in enumerate(xs):
            roi_losses = self.roi_head.forward_train(x, img_metas, proposal_list,
                                                    gt_bboxes, gt_labels,
                                                    gt_bboxes_ignore, gt_masks,
                                                    **kwargs)
            if len(xs) > 1:
                roi_losses = upd_loss(roi_losses, idx=i, weight=loss_weights[i])                            
            losses.update(roi_losses)

So when self.with_rpn is True and len(xs) > 1, only the proposal list produced for the last element of xs is passed to the roi_head repeatedly, whereas it should be each branch's own proposal list respectively.

Bug fix
Please correct me if I am wrong. I think the proposal lists should be collected across the loop rather than overwritten (see the sketch below). As the code stands, it only behaves correctly when xs has a single element.
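A sketch of the fix the reporter describes, based only on the snippet quoted above (collect one proposal list per element of xs and pass the matching one to the roi_head):

    proposal_lists = []
    for i, x in enumerate(xs):
        rpn_losses, proposal_list = self.rpn_head.forward_train(
            x,
            img_metas,
            gt_bboxes,
            gt_labels=None,
            gt_bboxes_ignore=gt_bboxes_ignore,
            proposal_cfg=proposal_cfg)
        proposal_lists.append(proposal_list)  # keep per-branch proposals
        if len(xs) > 1:
            rpn_losses = upd_loss(rpn_losses, idx=i, weight=loss_weights[i])
        losses.update(rpn_losses)

    for i, x in enumerate(xs):
        roi_losses = self.roi_head.forward_train(
            x, img_metas, proposal_lists[i],  # branch-matched proposals
            gt_bboxes, gt_labels, gt_bboxes_ignore, gt_masks, **kwargs)
        if len(xs) > 1:
            roi_losses = upd_loss(roi_losses, idx=i, weight=loss_weights[i])
        losses.update(roi_losses)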

Can pre-trained COCO models be made available?

Since this model is currently cited as SOTA for COCO minival instance segmentation, it would be very helpful if pretrained models were made available to further research, especially for those of us without the GPU resources to train from scratch. Thank you.

CBSwinTransformer is not in the models registry

When running demo/image_demo.py, I get the error in the title.

config file: "../configs/cbnet/htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_20e_coco.py"

checkpoint: "../htc_cbv2_swin_base22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_20e_coco.pth.zip"

How to train on a custom dataset

Hi! I've edited the coco.py file inside datasets to add my custom dataset, which is in COCO format. I have not understood how I can train the network (in my case, Mask R-CNN) on my custom dataset.
Is there a config file to complete/edit, or do I need to pass all the information on the command line?
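A minimal override config is the usual route in mmdetection; here is a hedged sketch for a COCO-format dataset (the class names, paths, and num_classes values are placeholders to adapt):

    _base_ = 'configs/cbnet/mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_480-800_adamw_3x_coco.py'

    classes = ('class_a', 'class_b')  # your category names
    data = dict(
        train=dict(classes=classes, ann_file='path/to/train.json',
                   img_prefix='path/to/train_imgs/'),
        val=dict(classes=classes, ann_file='path/to/val.json',
                 img_prefix='path/to/val_imgs/'),
        test=dict(classes=classes, ann_file='path/to/val.json',
                  img_prefix='path/to/val_imgs/'))
    # Mask R-CNN has single bbox/mask heads, so num_classes is a plain override
    model = dict(roi_head=dict(
        bbox_head=dict(num_classes=2),
        mask_head=dict(num_classes=2)))

The override file is then passed to tools/train.py like any other config.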

About inference speed

Hello,

I notice that the FPS of Swin-T, Swin-S, and Swin-B are 7.8, 7.0, and 5.9 in Table 5, respectively. However, the FPS reported in the original paper are 15.3, 12.0, and 11.6, respectively. Could you explain what makes the difference between these two sets of reported FPS? Thanks.

KeyError: 'CBSwinTransformer is not in the models registry'

During handling of the above exception, another exception occurred....
KeyError: "MaskRCNN: 'CBSwinTransformer is not in the models registry'"

I am assuming it has something to do with the mmdet/mmcv installation for CBNetV2? I encountered a similar problem with another model and solved it by installing that model's own mmdet, but I couldn't figure out what's wrong here. Any help?

Thank you in anticipation!
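A hedged diagnostic: CBSwinTransformer is registered by this repository's bundled mmdet, not by a stock pip-installed mmdet, so the usual cause is that Python resolves the wrong package. Checking which mmdet is imported (and reinstalling the repo with pip install -v -e . if needed) narrows it down:

    import mmdet
    from mmdet.models.builder import BACKBONES

    print(mmdet.__file__)  # should resolve inside the CBNetV2 checkout
    print('CBSwinTransformer' in BACKBONES.module_dict)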

How to work with Custom & Pretrained

Hi,
Can you tell me (@anurag1paul @hachreak @qwe12369 @dave-andersen @DXist) how to make this work with a custom dataset?

Secondly, if I want to use DB-Swin-T, will the pretrained weights be downloaded and loaded automatically, or do I have to download them from somewhere before running:

tools/dist_train.sh configs/cbnet/mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_480-800_adamw_3x_coco.py 8 --cfg-options model.pretrained=<PRETRAIN_MODEL>

Reproduce CBNetV2 with other models

Hi!
Recently I wanted to use CBNetV2 in PaddleDetection. After reading your paper, I connected two ResNet50s together directly, but I have some questions about the "assistant supervision" part. You mention that the detection head and neck are weight-sharing, so I simply compute the losses after forwarding the features from the two backbones through the same neck and head, then backpropagate total_loss (= lead_loss + assist_loss) to train the entire network.
I want to know whether this is right and the same as your CBNetV2 in mmdetection. If my method has mistakes, I hope you can give me some advice.
Hoping to receive your reply!
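To make the question concrete, here is a toy sketch of my understanding of the described setup (all modules are stand-ins, and the 0.5 weight is illustrative): the same neck and head instances consume both backbones' features, and the assisting loss is summed into the total.

    import torch
    import torch.nn as nn

    neck = nn.Conv2d(8, 4, 1)   # stand-in for the shared neck
    head = nn.Conv2d(4, 2, 1)   # stand-in for the shared detection head
    criterion = nn.MSELoss()    # stand-in for the detection losses

    lead_feat = torch.randn(1, 8, 16, 16)    # lead backbone output
    assist_feat = torch.randn(1, 8, 16, 16)  # assisting backbone output
    target = torch.randn(1, 2, 16, 16)

    lead_loss = criterion(head(neck(lead_feat)), target)
    assist_loss = criterion(head(neck(assist_feat)), target)  # same modules: weight sharing
    total_loss = lead_loss + 0.5 * assist_loss  # assist term optionally down-weighted
    total_loss.backward()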

How to show the result images?

Hi,

Thanks for your excellent work and codes.

I have reimplemented your whole process on the balloon dataset.

I have a little question. How can I draw the segmentation results on the images? Could you provide an example or an explanation of the variable "outputs" in test.py?

Thanks for your time.

Rainbow
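For what it's worth, mmdet 2.x ships inference helpers that draw boxes and masks; a sketch with placeholder config and checkpoint paths follows. tools/test.py also accepts --show / --show-dir, and as far as I can tell its outputs variable is a per-image list of (bbox_results, segm_results) pairs.

    from mmdet.apis import init_detector, inference_detector, show_result_pyplot

    model = init_detector('path/to/config.py', 'path/to/latest.pth',
                          device='cuda:0')
    result = inference_detector(model, 'demo.jpg')
    # renders boxes and masks above the given score threshold on the image
    show_result_pyplot(model, 'demo.jpg', result, score_thr=0.3)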

Inference phase without labels

Hi! I need to run inference with the weights of my trained model.
I was wondering how I can do it. The only possibility seems to be running the test.py or test.sh file, but that needs a .json file with annotations, which I do not have.
Any suggestions?
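A hedged alternative to test.py: the Python inference API needs no annotation file at all (paths below are placeholders):

    import glob
    from mmdet.apis import init_detector, inference_detector

    model = init_detector('path/to/config.py', 'path/to/trained_weights.pth',
                          device='cuda:0')
    for img_path in glob.glob('my_images/*.jpg'):
        result = inference_detector(model, img_path)  # per-class detections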

Batchsize and lr of HTC+Swin-L

Thank you for your outstanding work. I only have 8 A100 cards; can I train with a batch size of 1 per card and adjust the learning rate to 0.0001 / 2? Does this have a negative impact on the accuracy of a relatively large model like HTC+Swin-L?
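For context, the usual linear scaling rule in mmdetection (a rule of thumb, not a guarantee for this model): keep the learning rate proportional to the total batch size. If the reference schedule uses 16 images per iteration at lr=1e-4, then 8 GPUs x 1 image suggests exactly the halving proposed here:

    # sketch: 8 GPUs x samples_per_gpu=1 -> total batch 8 (reference: 16)
    data = dict(samples_per_gpu=1)
    optimizer = dict(lr=0.0001 / 2)  # 5e-05, per the linear scaling rule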

how to use multi-scale test?

Thank you again!
I use img_scale=[(1600, 1400), (800, 700), (1200, 1050), (2000, 1750), (2400, 2100), (2800, 2450)], but the mAP is lower than with img_scale=(1600, 1400). Can you tell me the reason?
@tingtingliangvs
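For reference, multi-scale testing is driven by MultiScaleFlipAug; a hedged sketch with a small set of scales near the training resolution (a possible explanation for the drop, not a confirmed one: extreme scales such as (800, 700) or (2800, 2450) can pull the aggregate mAP down):

    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=[(1400, 1200), (1600, 1400), (1800, 1600)],  # illustrative
            flip=True,
            transforms=[
                dict(type='Resize', keep_ratio=True),
                dict(type='RandomFlip'),
                dict(type='Normalize', mean=[123.675, 116.28, 103.53],
                     std=[58.395, 57.12, 57.375], to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'])
            ])
    ]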

Loss goes nan when training dual swin-base

Hi, I've transferred your code to my own codebase, since your modifications of the original mmdetection lie in 3 files (please correct me if I'm wrong):

  1. CBNetV2/mmdet/models/backbones/cbnet.py
  2. CBNetV2/mmdet/models/necks/cbnet_fpn.py
  3. CBNetV2/mmdet/models/detectors/two_stage.py

I directly copy-pasted your code into my own version of mmdetection; however, the loss goes to NaN from epoch 17 on. The reason is that the gradients overflow, so the AMP loss scaler keeps shrinking to a small number until a division by zero occurs, and the loss becomes NaN.

I use the same default AMP settings as you and can't figure out what's wrong. Could you help me?
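Not the authors' answer, but a common mitigation when AMP loss scaling collapses is gradient clipping, e.g.:

    # sketch: clip the global grad norm so overflows stop shrinking the scale
    optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))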

Reproduce DB-Swin-Large models

Hi, thanks for the great work.

You have released the config for DB-Swin-Large, which is marked as test-only. However, it seems the config can also be used for training. Just to make sure: could I train a detector to achieve 59.6 AP on COCO val with this config?

Thanks!

CascadeRCNN: init_weights() got an unexpected keyword argument 'pretrained'

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug
I was trying to run "cascade_mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_400-1400_adamw_3x_coco.py" using the pretrained model "cascade_mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_400-1400_adamw_3x_coco.pth" as:

pretrained='./pretrain/cascade_mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_400-1400_adamw_3x_coco.pth',
backbone=dict(
    type='CBSwinTransformer',
),

But a bug appears:


2021-07-09 06:58:51,846 - mmdet - INFO - load model from: ./pretrain/cascade_mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_400-1400_adamw_3x_coco.pth
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
return obj_cls(**args)
File "CBNetV2/mmdet/models/detectors/cascade_rcnn.py", line 25, in init
pretrained=pretrained)
File "CBNetV2/mmdet/models/detectors/two_stage.py", line 48, in init
self.init_weights(pretrained=pretrained)
File "CBNetV2/mmdet/models/detectors/two_stage.py", line 68, in init_weights
self.backbone.init_weights(pretrained=pretrained)
TypeError: init_weights() got an unexpected keyword argument 'pretrained'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "tools/train.py", line 190, in
main()
File "tools/train.py", line 164, in main
test_cfg=cfg.get('test_cfg'))
File "CBNetV2/mmdet/models/builder.py", line 77, in build_detector
return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
File "CBNetV2/mmdet/models/builder.py", line 34, in build
return build_from_cfg(cfg, registry, default_args)
File "/opt/conda/lib/python3.7/site-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: CascadeRCNN: init_weights() got an unexpected keyword argument 'pretrained'


How can I solve this problem? Thank you!
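A hedged reading of this error: model.pretrained is meant for backbone-only weights (and the CBSwinTransformer backbone's init_weights does not take a pretrained argument), while a full-detector .pth is normally loaded through the top-level load_from key instead:

    # sketch: load a whole-detector checkpoint via load_from, not model.pretrained
    model = dict(
        pretrained=None,
        backbone=dict(type='CBSwinTransformer'))
    load_from = './pretrain/cascade_mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_400-1400_adamw_3x_coco.pth'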

Reproduction

  1. What command or script did you run?
     A placeholder for the command.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
  3. What dataset did you use?

Environment

  1. Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.
  2. You may add addition that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

A placeholder for the traceback.

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

Questions about fused_semantic_head

I tried to train my model as follows:
python tools/train.py configs/cbnet/htc_cbv2_swin_large_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.py

But when it calculated the fused semantic loss, the following error occurred:
only batches of spatial targets supported (3D tensors) but got targets of size: : [1, 100, 148, 3]

Your code is written as: loss_semantic_seg = self.criterion(mask_pred, labels)
I notice you use cross-entropy loss in your code, but mask_pred and labels are both of size B×C×H×W. Is this reasonable?
What should I do to solve this problem?
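A hedged interpretation of the [1, 100, 148, 3] shape: labels should be a B x H x W integer map, but the loaded PNG has three channels. If the semantic maps were saved as RGB with identical channels, collapsing them to a single channel would restore the expected shape (path below taken from the error in an earlier issue, as an example):

    import numpy as np
    from PIL import Image

    path = 'data/coco/stuffthingmaps/train2017/11.png'  # example map
    arr = np.array(Image.open(path))
    if arr.ndim == 3:                            # H x W x 3 instead of H x W
        Image.fromarray(arr[..., 0]).save(path)  # keep one label channel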
