
cbnetv2's People

Contributors

aemikachow, aronlin, chrisfsj2051, daavoo, erotemic, hellock, hhaandroid, impiga, innerlee, johnson-wang, jshilong, melikovk, mxbonn, myownskyw7, oceanpang, rangilyu, runningleon, ryanxli, shinya7y, thangvubk, tianyuandu, tingtingliangvs, v-qjqs, wangruohui, wswday, xvjiarui, yeliudev, yhcao6, yuzhj, zwwwayne


cbnetv2's Issues

model loading issues

RuntimeError: [enforce fail at inline_container.cc:108] . file in archive is not in a subdirectory: htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth
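A hedged diagnostic for this error (my reading, not a confirmed fix): the downloaded file may be a plain zip that wraps the real .pth, which makes torch.load fail at inline_container.cc. Listing the archive contents shows whether there is an inner .pth to extract and load instead:

    import zipfile

    ckpt = 'htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth'

    # Checkpoints saved with the newer PyTorch serialization are zip archives
    # too, so inspect the entries: a top-level *.pth entry suggests the download
    # is a wrapper zip whose inner file should be extracted and loaded instead.
    if zipfile.is_zipfile(ckpt):
        with zipfile.ZipFile(ckpt) as zf:
            print(zf.namelist())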

roi_head question

Hi, thank you very much for your great work.
I have a question: htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_adamw_20e.py inherits from htc_without_semantic_swin_fpn.py, which, as the name suggests, does not use semantic segmentation. If I comment out the semantic_roi_extractor and semantic_head in roi_head, does it affect the final accuracy?
My dataset has mask annotation info, but I didn't convert the masks to .pngs; are they necessary?
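Not an authoritative answer, but a sketch of the kind of override that drops the semantic branch; whether accuracy changes is exactly the open question here. In mmcv-style configs, setting the keys to None in a child config (file placement hypothetical) disables them:

    # hypothetical child config inheriting the HTC Swin config discussed above
    _base_ = './htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_adamw_20e_coco.py'

    model = dict(
        roi_head=dict(
            semantic_roi_extractor=None,  # drop the auxiliary semantic branch
            semantic_head=None))
    # the data pipeline must then stop producing 'gt_semantic_seg' as well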

how to train DB-Swin-L?

If I want to train DB-Swin-L, do I just need to change the schedule to 20 epochs, like for DB-Swin-B?
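For reference, a sketch of the standard mmdetection 20-epoch schedule override (assuming the usual 20e step points, which is what the DB-Swin-B configs appear to use):

    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=500,
        warmup_ratio=0.001,
        step=[16, 19])
    runner = dict(type='EpochBasedRunner', max_epochs=20)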

Errors while converting model to onnx

Hi,

I am trying to convert the pretrained model into ONNX format and failing to convert it. Please find below the errors.

Exception has occurred: AttributeError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
'HybridTaskCascadeRoIHead' object has no attribute 'onnx_export'
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1131, in getattr
type(self).name, name))
File "/CBNetV2/mmdet/models/detectors/two_stage.py", line 201, in onnx_export
return self.roi_head.onnx_export(x, proposals, img_metas)
File "/CBNetV2/mmdet/models/detectors/base.py", line 168, in forward
return self.onnx_export(img[0], img_metas[0])
File "/envs/cbnet_v2/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
return old_func(*args, **kwargs)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1039, in _slow_forward
result = self.forward(*input, **kwargs)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/jit/_trace.py", line 118, in wrapper
outs.append(self.inner(*trace_inputs))
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/jit/_trace.py", line 132, in forward
self._force_outplace,
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/jit/_trace.py", line 1160, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/utils.py", line 373, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/utils.py", line 422, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/utils.py", line 459, in _model_to_graph
_retain_param_name)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/utils.py", line 695, in _export
dynamic_axes=dynamic_axes)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/utils.py", line 94, in export
use_external_data_format=use_external_data_format)
File "/envs/cbnet_v2/lib/python3.7/site-packages/torch/onnx/init.py", line 280, in export
custom_opsets, enable_onnx_checker, use_external_data_format)
File "/CBNetV2/tools/deployment/pytorch2onnx.py", line 78, in pytorch2onnx
dynamic_axes=dynamic_axes)
File "/CBNetV2/tools/deployment/pytorch2onnx.py", line 312, in
dynamic_export=args.dynamic_export)
File "/envs/cbnet_v2/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/envs/cbnet_v2/lib/python3.7/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/envs/cbnet_v2/lib/python3.7/runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "/envs/cbnet_v2/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/envs/cbnet_v2/lib/python3.7/runpy.py", line 193, in _run_module_as_main (Current frame)
"main", mod_spec)

FileNotFoundError: [Errno 2] No such file or directory: 'data/coco/stuffthingmaps/train2017/xyz.png'

I have searched related issues but cannot get the expected help.

I want to train on my custom dataset for instance segmentation using Improved HTC with DB-Swin-L as the backbone, but I am facing the above error. Since it is an instance segmentation dataset, I don't have stuffthingmaps. Kindly advise how I should go about it.
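One hedged way around this (a sketch, not the authors' recommendation): stop loading semantic maps in the training pipeline and remove the semantic branch from the model, since stuffthingmaps only feed HTC's auxiliary semantic head:

    # sketch: instance-only training pipeline, no gt_semantic_seg produced
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True, with_mask=True,
             with_seg=False),
        dict(type='Resize', img_scale=[(1600, 400), (1600, 1400)],
             multiscale_mode='range', keep_ratio=True),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(type='Normalize', mean=[123.675, 116.28, 103.53],
             std=[58.395, 57.12, 57.375], to_rgb=True),
        dict(type='Pad', size_divisor=32),
        # 'SegRescale' removed together with the semantic targets
        dict(type='DefaultFormatBundle'),
        dict(type='Collect',
             keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
    ]

The semantic_roi_extractor and semantic_head would also need to be dropped from the model (see the roi_head question above).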

I get the following upon training on Google colab:

2021-08-03 18:28:25,774 - mmdet - INFO - Environment info:

sys.platform: linux
Python: 3.7.11 (default, Jul 3 2021, 18:01:19) [GCC 7.5.0]
CUDA available: True
GPU 0: Tesla T4
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.0_bu.TC445_37.28845127_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0+cu102
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
  • CuDNN 7.6.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.0+cu102
OpenCV: 4.1.2
MMCV: 1.3.9
MMCV Compiler: GCC 7.5
MMCV CUDA Compiler: 11.0
MMDetection: 2.14.0+900f7bd

2021-08-03 18:28:26,338 - mmdet - INFO - Distributed training: False
2021-08-03 18:28:26,893 - mmdet - INFO - Config:
model = dict(
type='HybridTaskCascade',
pretrained=None,
backbone=dict(
type='CBSwinTransformer',
embed_dim=192,
depths=[2, 2, 18, 2],
num_heads=[6, 12, 24, 48],
window_size=7,
mlp_ratio=4.0,
qkv_bias=True,
qk_scale=None,
drop_rate=0.0,
attn_drop_rate=0.0,
drop_path_rate=0.2,
ape=False,
patch_norm=True,
out_indices=(0, 1, 2, 3),
use_checkpoint=False),
neck=dict(
type='CBFPN',
in_channels=[192, 384, 768, 1536],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[8],
ratios=[0.5, 1.0, 2.0],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(
type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
roi_head=dict(
type='HybridTaskCascadeRoIHead',
interleaved=True,
mask_info_flow=True,
num_stages=3,
stage_loss_weights=[1, 0.5, 0.25],
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=[
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=3,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
loss_weight=1.0)),
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=3,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.05, 0.05, 0.1, 0.1]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
loss_weight=1.0)),
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=3,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.033, 0.033, 0.067, 0.067]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
],
mask_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
mask_head=[
dict(
type='HTCMaskHead',
with_conv_res=False,
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=3,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
dict(
type='HTCMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=3,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
dict(
type='HTCMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=3,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))
],
semantic_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[8]),
semantic_head=dict(
type='FusedSemanticHead',
num_ins=5,
fusion_level=1,
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=183,
ignore_label=255,
loss_weight=0.2)),
train_cfg=dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=0,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_pre=2000,
max_per_img=2000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=[
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.6,
neg_iou_thr=0.6,
min_pos_iou=0.6,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.7,
min_pos_iou=0.7,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False)
]),
test_cfg=dict(
rpn=dict(
nms_pre=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
score_thr=0.001,
nms=dict(type='soft_nms', iou_threshold=0.5),
max_per_img=100,
mask_thr_binary=0.5)))
dataset_type = 'COCODataset'
data_root = 'data/coco/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
dict(
type='Resize',
img_scale=[(1600, 400), (1600, 1400)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='SegRescale', scale_factor=0.125),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_semantic_seg'])
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1600, 1400),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]
data = dict(
samples_per_gpu=1,
workers_per_gpu=1,
train=dict(
type='CocoDataset',
ann_file='data/trainval.json',
img_prefix='data/images/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='LoadAnnotations',
with_bbox=True,
with_mask=True,
with_seg=True),
dict(
type='Resize',
img_scale=[(1600, 400), (1600, 1400)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='SegRescale', scale_factor=0.125),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=[
'img', 'gt_bboxes', 'gt_labels', 'gt_masks',
'gt_semantic_seg'
])
],
seg_prefix='data/coco/stuffthingmaps/train2017/',
classes=('date', 'fig', 'hazelnut')),
val=dict(
type='CocoDataset',
ann_file='data/trainval.json',
img_prefix='data/images/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1600, 1400),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
classes=('date', 'fig', 'hazelnut')),
test=dict(
type='CocoDataset',
ann_file='data/trainval.json',
img_prefix='data/images/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1600, 1400),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
classes=('date', 'fig', 'hazelnut')))
evaluation = dict(metric=['bbox', 'segm'])
optimizer = dict(
type='AdamW',
lr=5e-05,
betas=(0.9, 0.999),
weight_decay=0.05,
paramwise_cfg=dict(
custom_keys=dict(
absolute_pos_embed=dict(decay_mult=0.0),
relative_position_bias_table=dict(decay_mult=0.0),
norm=dict(decay_mult=0.0))))
optimizer_config = dict(grad_clip=None)
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001,
step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = 'htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth'
resume_from = None
workflow = [('train', 1)]
samples_per_gpu = 1
classes = ('date', 'fig', 'hazelnut')
work_dir = './work_dirs/nuts'
gpu_ids = range(0, 1)

/content/CBNetV2/mmdet/core/anchor/builder.py:16: UserWarning: build_anchor_generator would be deprecated soon, please use build_prior_generator
'build_anchor_generator would be deprecated soon, please use '
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2021-08-03 18:28:38,797 - mmdet - INFO - load checkpoint from htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth
2021-08-03 18:28:38,798 - mmdet - INFO - Use load_from_local loader
2021-08-03 18:29:29,361 - mmdet - WARNING - The model and loaded state dict do not match exactly

size mismatch for roi_head.bbox_head.0.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([4, 1024]).
size mismatch for roi_head.bbox_head.0.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for roi_head.bbox_head.1.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([4, 1024]).
size mismatch for roi_head.bbox_head.1.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for roi_head.bbox_head.2.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([4, 1024]).
size mismatch for roi_head.bbox_head.2.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for roi_head.mask_head.0.conv_logits.weight: copying a param with shape torch.Size([80, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
size mismatch for roi_head.mask_head.0.conv_logits.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for roi_head.mask_head.1.conv_logits.weight: copying a param with shape torch.Size([80, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
size mismatch for roi_head.mask_head.1.conv_logits.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for roi_head.mask_head.2.conv_logits.weight: copying a param with shape torch.Size([80, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
size mismatch for roi_head.mask_head.2.conv_logits.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([3]).
unexpected key in source state_dict: roi_head.bbox_head.0.shared_convs.0.conv.weight, roi_head.bbox_head.0.shared_convs.0.bn.weight, roi_head.bbox_head.0.shared_convs.0.bn.bias, roi_head.bbox_head.0.shared_convs.0.bn.running_mean, roi_head.bbox_head.0.shared_convs.0.bn.running_var, roi_head.bbox_head.0.shared_convs.0.bn.num_batches_tracked, roi_head.bbox_head.0.shared_convs.1.conv.weight, roi_head.bbox_head.0.shared_convs.1.bn.weight, roi_head.bbox_head.0.shared_convs.1.bn.bias, roi_head.bbox_head.0.shared_convs.1.bn.running_mean, roi_head.bbox_head.0.shared_convs.1.bn.running_var, roi_head.bbox_head.0.shared_convs.1.bn.num_batches_tracked, roi_head.bbox_head.0.shared_convs.2.conv.weight, roi_head.bbox_head.0.shared_convs.2.bn.weight, roi_head.bbox_head.0.shared_convs.2.bn.bias, roi_head.bbox_head.0.shared_convs.2.bn.running_mean, roi_head.bbox_head.0.shared_convs.2.bn.running_var, roi_head.bbox_head.0.shared_convs.2.bn.num_batches_tracked, roi_head.bbox_head.0.shared_convs.3.conv.weight, roi_head.bbox_head.0.shared_convs.3.bn.weight, roi_head.bbox_head.0.shared_convs.3.bn.bias, roi_head.bbox_head.0.shared_convs.3.bn.running_mean, roi_head.bbox_head.0.shared_convs.3.bn.running_var, roi_head.bbox_head.0.shared_convs.3.bn.num_batches_tracked, roi_head.bbox_head.1.shared_convs.0.conv.weight, roi_head.bbox_head.1.shared_convs.0.bn.weight, roi_head.bbox_head.1.shared_convs.0.bn.bias, roi_head.bbox_head.1.shared_convs.0.bn.running_mean, roi_head.bbox_head.1.shared_convs.0.bn.running_var, roi_head.bbox_head.1.shared_convs.0.bn.num_batches_tracked, roi_head.bbox_head.1.shared_convs.1.conv.weight, roi_head.bbox_head.1.shared_convs.1.bn.weight, roi_head.bbox_head.1.shared_convs.1.bn.bias, roi_head.bbox_head.1.shared_convs.1.bn.running_mean, roi_head.bbox_head.1.shared_convs.1.bn.running_var, roi_head.bbox_head.1.shared_convs.1.bn.num_batches_tracked, roi_head.bbox_head.1.shared_convs.2.conv.weight, roi_head.bbox_head.1.shared_convs.2.bn.weight, roi_head.bbox_head.1.shared_convs.2.bn.bias, roi_head.bbox_head.1.shared_convs.2.bn.running_mean, roi_head.bbox_head.1.shared_convs.2.bn.running_var, roi_head.bbox_head.1.shared_convs.2.bn.num_batches_tracked, roi_head.bbox_head.1.shared_convs.3.conv.weight, roi_head.bbox_head.1.shared_convs.3.bn.weight, roi_head.bbox_head.1.shared_convs.3.bn.bias, roi_head.bbox_head.1.shared_convs.3.bn.running_mean, roi_head.bbox_head.1.shared_convs.3.bn.running_var, roi_head.bbox_head.1.shared_convs.3.bn.num_batches_tracked, roi_head.bbox_head.2.shared_convs.0.conv.weight, roi_head.bbox_head.2.shared_convs.0.bn.weight, roi_head.bbox_head.2.shared_convs.0.bn.bias, roi_head.bbox_head.2.shared_convs.0.bn.running_mean, roi_head.bbox_head.2.shared_convs.0.bn.running_var, roi_head.bbox_head.2.shared_convs.0.bn.num_batches_tracked, roi_head.bbox_head.2.shared_convs.1.conv.weight, roi_head.bbox_head.2.shared_convs.1.bn.weight, roi_head.bbox_head.2.shared_convs.1.bn.bias, roi_head.bbox_head.2.shared_convs.1.bn.running_mean, roi_head.bbox_head.2.shared_convs.1.bn.running_var, roi_head.bbox_head.2.shared_convs.1.bn.num_batches_tracked, roi_head.bbox_head.2.shared_convs.2.conv.weight, roi_head.bbox_head.2.shared_convs.2.bn.weight, roi_head.bbox_head.2.shared_convs.2.bn.bias, roi_head.bbox_head.2.shared_convs.2.bn.running_mean, roi_head.bbox_head.2.shared_convs.2.bn.running_var, roi_head.bbox_head.2.shared_convs.2.bn.num_batches_tracked, roi_head.bbox_head.2.shared_convs.3.conv.weight, roi_head.bbox_head.2.shared_convs.3.bn.weight, roi_head.bbox_head.2.shared_convs.3.bn.bias, 
roi_head.bbox_head.2.shared_convs.3.bn.running_mean, roi_head.bbox_head.2.shared_convs.3.bn.running_var, roi_head.bbox_head.2.shared_convs.3.bn.num_batches_tracked

missing keys in source state_dict: roi_head.bbox_head.0.shared_fcs.1.weight, roi_head.bbox_head.0.shared_fcs.1.bias, roi_head.bbox_head.1.shared_fcs.1.weight, roi_head.bbox_head.1.shared_fcs.1.bias, roi_head.bbox_head.2.shared_fcs.1.weight, roi_head.bbox_head.2.shared_fcs.1.bias

2021-08-03 18:29:29,409 - mmdet - INFO - Start running, host: root@d8f8e57ec13b, work_dir: /content/CBNetV2/work_dirs/nuts
2021-08-03 18:29:29,409 - mmdet - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) CheckpointHook
(NORMAL ) EvalHook
(VERY_LOW ) TextLoggerHook

before_train_epoch:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) EvalHook
(NORMAL ) NumClassCheckHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

before_train_iter:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) EvalHook
(LOW ) IterTimerHook

after_train_iter:
(ABOVE_NORMAL) OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) EvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

after_train_epoch:
(NORMAL ) CheckpointHook
(NORMAL ) EvalHook
(VERY_LOW ) TextLoggerHook

before_val_epoch:
(NORMAL ) NumClassCheckHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

before_val_iter:
(LOW ) IterTimerHook

after_val_iter:
(LOW ) IterTimerHook

after_val_epoch:
(VERY_LOW ) TextLoggerHook

2021-08-03 18:29:29,409 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
Traceback (most recent call last):
File "tools/train.py", line 188, in
main()
File "tools/train.py", line 184, in main
meta=meta)
File "/content/CBNetV2/mmdet/apis/train.py", line 185, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
for i, data_batch in enumerate(self.data_loader):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 521, in next
data = self._next_data()
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
data.reraise()
File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 425, in reraise
raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/content/CBNetV2/mmdet/datasets/custom.py", line 194, in getitem
data = self.prepare_train_img(idx)
File "/content/CBNetV2/mmdet/datasets/custom.py", line 217, in prepare_train_img
return self.pipeline(results)
File "/content/CBNetV2/mmdet/datasets/pipelines/compose.py", line 40, in call
data = t(data)
File "/content/CBNetV2/mmdet/datasets/pipelines/loading.py", line 373, in call
results = self._load_semantic_seg(results)
File "/content/CBNetV2/mmdet/datasets/pipelines/loading.py", line 347, in _load_semantic_seg
img_bytes = self.file_client.get(filename)
File "/usr/local/lib/python3.7/dist-packages/mmcv/fileio/file_client.py", line 306, in get
return self.client.get(filepath)
File "/usr/local/lib/python3.7/dist-packages/mmcv/fileio/file_client.py", line 184, in get
with open(filepath, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'data/coco/stuffthingmaps/train2017/11.png'

My config file:
_base_ = '../cbnet/htc_cbv2_swin_large_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.py'

model = dict(
    roi_head=dict(
        bbox_head=[
            dict(
                type='ConvFCBBoxHead',
                num_shared_convs=4,
                num_shared_fcs=1,
                in_channels=256,
                conv_out_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.1, 0.1, 0.2, 0.2]),
                reg_class_agnostic=True,
                reg_decoded_bbox=True,
                norm_cfg=dict(type='SyncBN', requires_grad=True),
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                loss_bbox=dict(type='GIoULoss', loss_weight=10.0)),
            dict(
                type='ConvFCBBoxHead',
                num_shared_convs=4,
                num_shared_fcs=1,
                in_channels=256,
                conv_out_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.05, 0.05, 0.1, 0.1]),
                reg_class_agnostic=True,
                reg_decoded_bbox=True,
                norm_cfg=dict(type='SyncBN', requires_grad=True),
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                loss_bbox=dict(type='GIoULoss', loss_weight=10.0)),
            dict(
                type='ConvFCBBoxHead',
                num_shared_convs=4,
                num_shared_fcs=1,
                in_channels=256,
                conv_out_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.033, 0.033, 0.067, 0.067]),
                reg_class_agnostic=True,
                reg_decoded_bbox=True,
                norm_cfg=dict(type='SyncBN', requires_grad=True),
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                loss_bbox=dict(type='GIoULoss', loss_weight=10.0))
        ]))

model = dict(
    type='HybridTaskCascade',
    pretrained=None,
    roi_head=dict(
        type='HybridTaskCascadeRoIHead',
        interleaved=True,
        mask_info_flow=True,
        num_stages=3,
        stage_loss_weights=[1, 0.5, 0.25],
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=[
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.1, 0.1, 0.2, 0.2]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.05, 0.05, 0.1, 0.1]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.033, 0.033, 0.067, 0.067]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
        ],
        mask_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        mask_head=[
            dict(
                type='HTCMaskHead',
                with_conv_res=False,
                num_convs=4,
                in_channels=256,
                conv_out_channels=256,
                num_classes=3,
                loss_mask=dict(
                    type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
            dict(
                type='HTCMaskHead',
                num_convs=4,
                in_channels=256,
                conv_out_channels=256,
                num_classes=3,
                loss_mask=dict(
                    type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
            dict(
                type='HTCMaskHead',
                num_convs=4,
                in_channels=256,
                conv_out_channels=256,
                num_classes=3,
                loss_mask=dict(
                    type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))
        ]))

dataset_type = 'COCODataset'
classes = ('date', 'fig', 'hazelnut')
data = dict(
    train=dict(
        img_prefix='data/images/',
        classes=classes,
        ann_file='data/trainval.json'),
    val=dict(
        img_prefix='data/images/',
        classes=classes,
        ann_file='data/trainval.json'),
    test=dict(
        img_prefix='data/images/',
        classes=classes,
        ann_file='data/trainval.json'))

load_from = 'htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth'

Single GPU training

Hi, thanks for sharing your model. Is it possible to train this model on a custom dataset with a single GPU? Whenever I try to do that, I get this error (I'm using the tools/train.py script):

Traceback (most recent call last):
  File "CBNetV2/tools/train.py", line 188, in <module>
    main()
  File "CBNetV2/tools/train.py", line 184, in main
    meta=meta)
  File "/content/CBNetV2/mmdet/apis/train.py", line 185, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/content/CBNetV2/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/fp16_utils.py", line 128, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/content/CBNetV2/mmdet/models/detectors/base.py", line 171, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/content/CBNetV2/mmdet/models/detectors/two_stage.py", line 266, in forward_train
    **kwargs)
  File "/content/CBNetV2/mmdet/models/roi_heads/cascade_roi_head.py", line 248, in forward_train
    rcnn_train_cfg)
  File "/content/CBNetV2/mmdet/models/roi_heads/cascade_roi_head.py", line 146, in _bbox_forward_train
    bbox_results = self._bbox_forward(stage, x, rois)
  File "/content/CBNetV2/mmdet/models/roi_heads/cascade_roi_head.py", line 136, in _bbox_forward
    cls_score, bbox_pred = bbox_head(bbox_feats)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/CBNetV2/mmdet/models/roi_heads/bbox_heads/convfc_bbox_head.py", line 155, in forward
    x = conv(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/cnn/bricks/conv_module.py", line 201, in forward
    x = self.norm(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/batchnorm.py", line 731, in forward
    world_size = torch.distributed.get_world_size(process_group)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 748, in get_world_size
    return _get_group_size(group)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 274, in _get_group_size
    default_pg = _get_default_group()
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 358, in _get_default_group
    raise RuntimeError("Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
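A hedged note on the traceback above: the failure comes from the SyncBN layers (norm_cfg=dict(type='SyncBN', ...) in the ConvFCBBoxHead heads), which call torch.distributed. Two common workarounds, neither confirmed by the authors here: launch even a single GPU through the distributed script, e.g. tools/dist_train.sh <CONFIG_FILE> 1, or switch the heads to plain BN in the config:

    # Sketch for non-distributed runs: restate norm_cfg as plain BN in every
    # ConvFCBBoxHead entry of model.roi_head.bbox_head (the whole list has to
    # be restated, since list-valued config keys are replaced wholesale on
    # config merge, not merged element-wise).
    norm_cfg = dict(type='BN', requires_grad=True)  # instead of 'SyncBN'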

How to disable semantic mask since I only want to do object detection?

When I simply run "python tools/train.py configs/cbnet/htc_cbv2_swin_large_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.py", I get the error:

Traceback (most recent call last):
  File "tools/train.py", line 188, in <module>
    main()
  File "tools/train.py", line 184, in main
    meta=meta)
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/apis/train.py", line 185, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 128, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/models/detectors/base.py", line 171, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/models/detectors/two_stage.py", line 266, in forward_train
    **kwargs)
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/models/roi_heads/htc_roi_head.py", line 245, in forward_train
    loss_seg = self.semantic_head.loss(semantic_pred, gt_semantic_seg)
  File "/home/sist/tqzouustc/.conda/envs/CBNet_a100/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/gpfs/home/sist/tqzouustc/code/CBNetV2/mmdet/models/roi_heads/mask_heads/fused_semantic_head.py", line 103, in loss
    labels = labels.squeeze(1).long()
AttributeError: 'NoneType' object has no attribute 'squeeze'

It seems to be caused by the semantic mask. How can I fix it?

About the DB-Swin-B model: I tried to reimplement it and got 55.2 mAP

I re-trained the DB-Swin-B model on the COCO dataset. The config file I use is configs/cbnet/htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_20e_coco.py, and the pretrained model I use is the Swin-B model trained on ImageNet-22K with 384 input size. But the mAP I finally get is 55.2, lower than the result you provided in your readme.md (58.4).
What's wrong with my reimplementation?

Slow inference speed with default config

Hi, there.

I have tested the configs/cbnet/htc_cbv2_swin_large_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.py config with weights/htc_cbv2_swin_large22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.pth on my T4 GPU, and inference takes much longer than I expected: 13600 s (about 3.8 hours) for a single pass over the COCO minival set. Is this normal?

I also got 58.7 AP, lower than the reported one (59.1 AP).

CBNet Paper Layers composition

Hi! I was reading the paper and did not understand one part of the network.
Which operation is performed between the layers of the second backbone?
Let me explain better: does the + inside the circle denote an element-wise sum? And how are these operations performed?
(model figure from the paper)

Thank you.
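For what it's worth, here is my reading of the figure sketched in code (the channel sizes and the 1x1 projection are illustrative, not the authors' exact implementation): the circled + is an element-wise sum; the assisting backbone's feature is projected to the right channel count, resized to the lead stage's resolution, and added to that stage's input.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    proj = nn.Conv2d(512, 256, kernel_size=1)  # channel match (illustrative sizes)

    def composite(lead_stage_in, assist_feat):
        # project, resize, then element-wise sum with the lead backbone's input
        a = proj(assist_feat)
        a = F.interpolate(a, size=lead_stage_in.shape[2:], mode='nearest')
        return lead_stage_in + a

    out = composite(torch.randn(1, 256, 64, 64), torch.randn(1, 512, 32, 32))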

Segmentation for HTC

Thanks for your work.

I was wondering whether there are some bugs in your configs.

  1. You are using semantic_roi_extractor and semantic_head in your config for HTC.
    https://github.com/VDIGPKU/CBNetV2/blob/main/configs/cbnet/htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_adamw_20e_coco.py#L24
  2. And you've used the original '../_base_/datasets/coco_instance.py', which doesn't provide the path of the segmentation PNGs.
    https://github.com/VDIGPKU/CBNetV2/blob/main/configs/_base_/datasets/coco_instance.py#L37

according to this issue: open-mmlab/mmdetection#3767
with_seg should be True. But once you set it to True, you must provide a segmentation PNG path for the training pipeline,
so how could you have started the training?

Possible bug in mmdet modification?

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug
In mmdet/models/detectors/two_stage.py line 244, there is this snippet of code -

        if self.with_rpn:
            proposal_cfg = self.train_cfg.get('rpn_proposal',
                                              self.test_cfg.rpn)
            for i,x in enumerate(xs):
                rpn_losses, proposal_list = self.rpn_head.forward_train(
                    x,
                    img_metas,
                    gt_bboxes,
                    gt_labels=None,
                    gt_bboxes_ignore=gt_bboxes_ignore,
                    proposal_cfg=proposal_cfg)
                if len(xs) > 1:
                    rpn_losses = upd_loss(rpn_losses, idx=i, weight=loss_weights[i])
                losses.update(rpn_losses)
        else:
            proposal_list = proposals

        for i,x in enumerate(xs):
            roi_losses = self.roi_head.forward_train(x, img_metas, proposal_list,
                                                    gt_bboxes, gt_labels,
                                                    gt_bboxes_ignore, gt_masks,
                                                    **kwargs)
            if len(xs) > 1:
                roi_losses = upd_loss(roi_losses, idx=i, weight=loss_weights[i])                            
            losses.update(roi_losses)

So when self.with_rpn is True and len(xs) > 1, only the proposal list produced for the last element of xs is passed to the roi_head repeatedly, whereas it should be each branch's own proposal list respectively.

Bug fix
Please correct me if I am wrong. I think the proposal lists should be collected across the loop rather than overwritten (see the sketch below). As the code stands, it only behaves correctly when xs has a single element.
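A sketch of the fix the reporter describes, based only on the snippet quoted above (collect one proposal list per element of xs and pass the matching one to the roi_head):

    proposal_lists = []
    for i, x in enumerate(xs):
        rpn_losses, proposal_list = self.rpn_head.forward_train(
            x,
            img_metas,
            gt_bboxes,
            gt_labels=None,
            gt_bboxes_ignore=gt_bboxes_ignore,
            proposal_cfg=proposal_cfg)
        proposal_lists.append(proposal_list)  # keep per-branch proposals
        if len(xs) > 1:
            rpn_losses = upd_loss(rpn_losses, idx=i, weight=loss_weights[i])
        losses.update(rpn_losses)

    for i, x in enumerate(xs):
        roi_losses = self.roi_head.forward_train(
            x, img_metas, proposal_lists[i],  # branch-matched proposals
            gt_bboxes, gt_labels, gt_bboxes_ignore, gt_masks, **kwargs)
        if len(xs) > 1:
            roi_losses = upd_loss(roi_losses, idx=i, weight=loss_weights[i])
        losses.update(roi_losses)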

Can pre-trained COCO models be made available?

Since this model is currently cited as SOTA for COCO minival instance segmentation, it would be very helpful if pretrained models were made available to further research, especially for those of us without the GPU resources to train from scratch. Thank you.

CBSwinTransformer is not in the models registry

When running demo/image_demo.py, I get the error in the title.

config file: "../configs/cbnet/htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_20e_coco.py"

checkpoint: "../htc_cbv2_swin_base22k_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_20e_coco.pth.zip"

How to train on a custom dataset

Hi! I've edited the coco.py file inside datasets to add my custom dataset, which is in COCO format. I have not understood how I can train the network (in my case, Mask R-CNN) on my custom dataset.
Is there a config file to complete/edit, or do I need to pass all the information on the command line?
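A minimal override config is the usual route in mmdetection; here is a hedged sketch for a COCO-format dataset (the class names, paths, and num_classes values are placeholders to adapt):

    _base_ = 'configs/cbnet/mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_480-800_adamw_3x_coco.py'

    classes = ('class_a', 'class_b')  # your category names
    data = dict(
        train=dict(classes=classes, ann_file='path/to/train.json',
                   img_prefix='path/to/train_imgs/'),
        val=dict(classes=classes, ann_file='path/to/val.json',
                 img_prefix='path/to/val_imgs/'),
        test=dict(classes=classes, ann_file='path/to/val.json',
                  img_prefix='path/to/val_imgs/'))
    # Mask R-CNN has single bbox/mask heads, so num_classes is a plain override
    model = dict(roi_head=dict(
        bbox_head=dict(num_classes=2),
        mask_head=dict(num_classes=2)))

The override file is then passed to tools/train.py like any other config.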

About inference speed

Hello,

I notice that the FPS of Swin-T, Swin-S, and Swin-B are 7.8, 7.0, and 5.9 in Table 5, respectively. However, the FPS reported in the original paper are 15.3, 12.0, and 11.6, respectively. Could you explain what makes the difference between these two sets of reported FPS? Thanks.

KeyError: 'CBSwinTransformer is not in the models registry'

During handling of the above exception, another exception occurred....
KeyError: "MaskRCNN: 'CBSwinTransformer is not in the models registry'"

I am assuming it has something to do with the mmdet/mmcv installation for CBNetV2? I encountered a similar problem with another model and solved it by installing that model's own mmdet, but I couldn't figure out what's wrong here. Any help?

Thank you in anticipation!
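A hedged diagnostic: CBSwinTransformer is registered by this repository's bundled mmdet, not by a stock pip-installed mmdet, so the usual cause is that Python resolves the wrong package. Checking which mmdet is imported (and reinstalling the repo with pip install -v -e . if needed) narrows it down:

    import mmdet
    from mmdet.models.builder import BACKBONES

    print(mmdet.__file__)  # should resolve inside the CBNetV2 checkout
    print('CBSwinTransformer' in BACKBONES.module_dict)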

How to work with Custom & Pretrained

Hi,
Can you tell me (@anurag1paul @hachreak @qwe12369 @dave-andersen @DXist) how to make this work with a custom dataset?

Secondly, if I want to use DB-Swin-T, will the pretrained weights be downloaded and loaded automatically, or do I have to download them from somewhere before running:

tools/dist_train.sh configs/cbnet/mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_480-800_adamw_3x_coco.py 8 --cfg-options model.pretrained=<PRETRAIN_MODEL>

Reproduce CBNetV2 with other models

Hi!
Recently I wanted to use CBNetV2 in PaddleDetection. After reading your paper, I connected two ResNet50s together directly, but I have some questions about the "assistant supervision" part. You mention that the detection head and neck are weight-sharing, so I simply compute the losses after forwarding the features from the two backbones through the same neck and head, then backpropagate total_loss (= lead_loss + assist_loss) to train the entire network.
I want to know whether this is right and the same as your CBNetV2 in mmdetection. If my method has mistakes, I hope you can give me some advice.
Hoping to receive your reply!
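To make the question concrete, here is a toy sketch of my understanding of the described setup (all modules are stand-ins, and the 0.5 weight is illustrative): the same neck and head instances consume both backbones' features, and the assisting loss is summed into the total.

    import torch
    import torch.nn as nn

    neck = nn.Conv2d(8, 4, 1)   # stand-in for the shared neck
    head = nn.Conv2d(4, 2, 1)   # stand-in for the shared detection head
    criterion = nn.MSELoss()    # stand-in for the detection losses

    lead_feat = torch.randn(1, 8, 16, 16)    # lead backbone output
    assist_feat = torch.randn(1, 8, 16, 16)  # assisting backbone output
    target = torch.randn(1, 2, 16, 16)

    lead_loss = criterion(head(neck(lead_feat)), target)
    assist_loss = criterion(head(neck(assist_feat)), target)  # same modules: weight sharing
    total_loss = lead_loss + 0.5 * assist_loss  # assist term optionally down-weighted
    total_loss.backward()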

How to show the result images?

Hi,

Thanks for your excellent work and codes.

I have reimplemented your whole process on the balloon dataset.

I have a little question. How can I draw the segmentation results on the images? Could you provide an example or an explanation of the variable "outputs" in test.py?

Thanks for your time.

Rainbow
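For what it's worth, mmdet 2.x ships inference helpers that draw boxes and masks; a sketch with placeholder config and checkpoint paths follows. tools/test.py also accepts --show / --show-dir, and as far as I can tell its outputs variable is a per-image list of (bbox_results, segm_results) pairs.

    from mmdet.apis import init_detector, inference_detector, show_result_pyplot

    model = init_detector('path/to/config.py', 'path/to/latest.pth',
                          device='cuda:0')
    result = inference_detector(model, 'demo.jpg')
    # renders boxes and masks above the given score threshold on the image
    show_result_pyplot(model, 'demo.jpg', result, score_thr=0.3)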

Inference phase without labels

Hi! I need to run inference with the weights of my trained model.
I was wondering how I can do it. The only possibility seems to be running the test.py or test.sh file, but that needs a .json file with annotations, which I do not have.
Any suggestions?
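A hedged alternative to test.py: the Python inference API needs no annotation file at all (paths below are placeholders):

    import glob
    from mmdet.apis import init_detector, inference_detector

    model = init_detector('path/to/config.py', 'path/to/trained_weights.pth',
                          device='cuda:0')
    for img_path in glob.glob('my_images/*.jpg'):
        result = inference_detector(model, img_path)  # per-class detections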

Batchsize and lr of HTC+Swin-L

Thank you for your outstanding work. I only have 8 A100 cards; can I train with a batch size of 1 per card and adjust the learning rate to 0.0001 / 2? Does this have a negative impact on the accuracy of a relatively large model like HTC+Swin-L?
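For context, the usual linear scaling rule in mmdetection (a rule of thumb, not a guarantee for this model): keep the learning rate proportional to the total batch size. If the reference schedule uses 16 images per iteration at lr=1e-4, then 8 GPUs x 1 image suggests exactly the halving proposed here:

    # sketch: 8 GPUs x samples_per_gpu=1 -> total batch 8 (reference: 16)
    data = dict(samples_per_gpu=1)
    optimizer = dict(lr=0.0001 / 2)  # 5e-05, per the linear scaling rule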

how to use multi-scale test?

Thank you again!
I use img_scale=[(1600, 1400), (800, 700), (1200, 1050), (2000, 1750), (2400, 2100), (2800, 2450)], but the mAP is lower than with img_scale=(1600, 1400). Can you tell me the reason?
@tingtingliangvs
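For reference, multi-scale testing is driven by MultiScaleFlipAug; a hedged sketch with a small set of scales near the training resolution (a possible explanation for the drop, not a confirmed one: extreme scales such as (800, 700) or (2800, 2450) can pull the aggregate mAP down):

    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=[(1400, 1200), (1600, 1400), (1800, 1600)],  # illustrative
            flip=True,
            transforms=[
                dict(type='Resize', keep_ratio=True),
                dict(type='RandomFlip'),
                dict(type='Normalize', mean=[123.675, 116.28, 103.53],
                     std=[58.395, 57.12, 57.375], to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'])
            ])
    ]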

Loss goes nan when training dual swin-base

Hi, I've transferred your code to my own codebase, since your modifications of the original mmdetection lie in 3 files (please correct me if I'm wrong):

  1. CBNetV2/mmdet/models/backbones/cbnet.py
  2. CBNetV2/mmdet/models/necks/cbnet_fpn.py
  3. CBNetV2/mmdet/models/detectors/two_stage.py

I directly copy-pasted your code into my own version of mmdetection; however, the loss goes to NaN from epoch 17 on. The reason is that the gradients overflow, so the AMP loss scaler keeps shrinking to a small number until a division by zero occurs, and the loss becomes NaN.

I use the same default AMP settings as you and can't figure out what's wrong. Could you help me?
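Not the authors' answer, but a common mitigation when AMP loss scaling collapses is gradient clipping, e.g.:

    # sketch: clip the global grad norm so overflows stop shrinking the scale
    optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))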

Reproduce DB-Swin-Large models

Hi, thanks for the great work.

You have released the config for DB-Swin-Large, which is marked as test-only. However, it seems the config can also be used for training. Just to make sure: could I train a detector to achieve 59.6 AP on COCO val with this config?

Thanks!

CascadeRCNN: init_weights() got an unexpected keyword argument 'pretrained'

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug
I was trying to run "cascade_mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_400-1400_adamw_3x_coco.py" using the pretrained model "cascade_mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_400-1400_adamw_3x_coco.pth" as:

pretrained='./pretrain/cascade_mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_400-1400_adamw_3x_coco.pth',
backbone=dict(
    type='CBSwinTransformer',
),

But a bug appears:


2021-07-09 06:58:51,846 - mmdet - INFO - load model from: ./pretrain/cascade_mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_400-1400_adamw_3x_coco.pth
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
return obj_cls(**args)
File "CBNetV2/mmdet/models/detectors/cascade_rcnn.py", line 25, in init
pretrained=pretrained)
File "CBNetV2/mmdet/models/detectors/two_stage.py", line 48, in init
self.init_weights(pretrained=pretrained)
File "CBNetV2/mmdet/models/detectors/two_stage.py", line 68, in init_weights
self.backbone.init_weights(pretrained=pretrained)
TypeError: init_weights() got an unexpected keyword argument 'pretrained'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "tools/train.py", line 190, in
main()
File "tools/train.py", line 164, in main
test_cfg=cfg.get('test_cfg'))
File "CBNetV2/mmdet/models/builder.py", line 77, in build_detector
return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
File "CBNetV2/mmdet/models/builder.py", line 34, in build
return build_from_cfg(cfg, registry, default_args)
File "/opt/conda/lib/python3.7/site-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: CascadeRCNN: init_weights() got an unexpected keyword argument 'pretrained'


How can I solve this problem? Thank you!
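A hedged reading of this error: model.pretrained is meant for backbone-only weights (and the CBSwinTransformer backbone's init_weights does not take a pretrained argument), while a full-detector .pth is normally loaded through the top-level load_from key instead:

    # sketch: load a whole-detector checkpoint via load_from, not model.pretrained
    model = dict(
        pretrained=None,
        backbone=dict(type='CBSwinTransformer'))
    load_from = './pretrain/cascade_mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_400-1400_adamw_3x_coco.pth'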

Reproduction

  1. What command or script did you run?
     A placeholder for the command.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
  3. What dataset did you use?

Environment

  1. Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.
  2. You may add addition that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

A placeholder for the traceback.

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

Questions about fused_semantic_head

I tried to train my model as follows:
python tools/train.py configs/cbnet/htc_cbv2_swin_large_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_1x_coco.py

But when it calculated the fused semantic loss, the following error occurred:
only batches of spatial targets supported (3D tensors) but got targets of size: : [1, 100, 148, 3]

Your code is written as: loss_semantic_seg = self.criterion(mask_pred, labels)
I notice you use cross-entropy loss in your code, but mask_pred and labels are both of size B×C×H×W. Is this reasonable?
What should I do to solve this problem?
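A hedged interpretation of the [1, 100, 148, 3] shape: labels should be a B x H x W integer map, but the loaded PNG has three channels. If the semantic maps were saved as RGB with identical channels, collapsing them to a single channel would restore the expected shape (path below taken from the error in an earlier issue, as an example):

    import numpy as np
    from PIL import Image

    path = 'data/coco/stuffthingmaps/train2017/11.png'  # example map
    arr = np.array(Image.open(path))
    if arr.ndim == 3:                            # H x W x 3 instead of H x W
        Image.fromarray(arr[..., 0]).save(path)  # keep one label channel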
