opengvlab / internimage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

Home Page: https://arxiv.org/abs/2211.05778

License: MIT License

Python 62.09% Shell 0.24% C++ 1.20% Cuda 6.39% Jupyter Notebook 30.08%

backbone deformable-convolution foundation-model object-detection semantic-segmentation

internimage's Issues

Is there any simple way to segment a single image?

Thank you for your awesome repo. I am looking for a way to segment a single image. However, it seems the code only provides evaluation on the entire ADE20K dataset. Is there any way for me to segment a single image? I looked into the code but it is quite complicated :D

Mismatch of configuration files in Cityscapes Segmentation

There seems to be a configuration mismatch in the Cityscapes segmentation folder. I am getting a ResNet backbone from the InternImage-XL model, and there are some missing keys in the checkpoint file. Please check it.

Release tensorrt inference code?

Hi,
Can you release TensorRT inference code for semantic segmentation? Does it support inference with a batch size larger than one?
Thanks

Maybe better to keep the feature layout without permute ops?

Use a 1x1 conv to replace the MLP equivalently ...
Use GroupNorm(1, channels) to replace LayerNorm(channels) equivalently ...
Then the feature layout could always stay (b, c, h, w), with no need for permute ops, to reduce memory usage?
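
A note on the second substitution, as a minimal pure-Python sketch with the affine parameters left out. LayerNorm(channels) on a channels-last tensor normalizes each pixel over its own channel values, whereas GroupNorm(1, channels) computes one mean and variance over all channels and all spatial positions of a sample. The two therefore coincide for a 1x1 feature map but generally differ otherwise. The helper names below are hypothetical and only illustrate the statistics; they are not the repository's code.

```python
from statistics import fmean, pvariance

EPS = 1e-5

def normalize(values):
    """Zero-mean, unit-variance normalization of a flat list (no affine)."""
    mu = fmean(values)
    var = pvariance(values, mu)
    return [(v - mu) / (var + EPS) ** 0.5 for v in values]

def layer_norm_channels(x):
    """LayerNorm(C) on channels-last data: every pixel is normalized
    over its own C channel values. x is a list of pixels, each a list of C."""
    return [normalize(pixel) for pixel in x]

def group_norm_one_group(x):
    """GroupNorm(1, C): one set of statistics over all channels and pixels."""
    c = len(x[0])
    flat = normalize([v for pixel in x for v in pixel])
    return [flat[i:i + c] for i in range(0, len(flat), c)]

# Two pixels with two channels each, at very different scales.
x = [[1.0, 3.0], [10.0, 30.0]]
ln = layer_norm_channels(x)
gn = group_norm_one_group(x)

# With a single pixel the two normalizations agree.
single = [[1.0, 3.0]]
same_for_one_pixel = layer_norm_channels(single) == group_norm_one_group(single)
```

Here `ln` maps both pixels to roughly (-1, 1), while `gn` keeps the scale difference between them, so the swap changes the model's behaviour unless it is retrained with the new normalization.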

ATen/OpMathType.h: no such file or directory.

I am trying to compile DCNv3 with PyTorch 1.9.0 and the compiler gives me this error. After checking the PyTorch code on GitHub, it appears that OpMathType.h was added after PyTorch 1.10, but the README.md in the detection folder says pytorch >= 1.8.0. Is there a solution for this error? I'm not sure.
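
A pre-build version check could catch this earlier. The sketch below is only an illustration: the 1.10 cutoff is taken from the observation in this issue rather than from any official requirement, and `can_build_dcnv3` is a hypothetical helper name.

```python
import re

# Assumption from the issue text: ATen/OpMathType.h is only available
# from PyTorch 1.10 onward. Adjust if the real cutoff differs.
MIN_TORCH_FOR_DCNV3 = (1, 10)

def parse_version(version):
    """Turn '1.9.0+cu111' into (1, 9, 0), ignoring local/build suffixes."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups(default="0"))

def can_build_dcnv3(torch_version):
    """Compare only (major, minor) against the assumed minimum."""
    return parse_version(torch_version)[:2] >= MIN_TORCH_FOR_DCNV3
```

In a real environment one would call `can_build_dcnv3(torch.__version__)` before running the build script.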

I don't understand this shared weight in the article, can you explain it in detail, please?

To remedy this problem, we borrow the idea from the separable convolution [56] and detach the original convolution weights wk into depth-wise and point-wise parts, where the depth-wise part is responsible by the original location-aware modulation scalar mk, and the point-wise part is the shared projection weights w among sampling points.

Can you provide some information? Thank you very much!
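
For reference, the quoted passage can be read against the two operator definitions. The formulas below are my transcription of the paper's notation, so treat them as a sketch rather than an authoritative restatement. With $K$ sampling points, a DCNv2-style operator gives every sampling point $k$ its own projection weight $\mathbf{w}_k$, while the DCNv3 form splits channels into $G$ groups and uses a single projection weight $\mathbf{w}_g$ per group:

```latex
% DCNv2-style: one projection weight w_k per sampling point
\mathbf{y}(p_0) = \sum_{k=1}^{K} \mathbf{w}_k \, \mathbf{m}_k \,
                  \mathbf{x}(p_0 + p_k + \Delta p_k)

% DCNv3: w_g has no index k, so it is shared by all K sampling points of group g
\mathbf{y}(p_0) = \sum_{g=1}^{G} \sum_{k=1}^{K} \mathbf{w}_g \, \mathbf{m}_{gk} \,
                  \mathbf{x}_g(p_0 + p_k + \Delta p_{gk})
```

In the second formula the only quantity that still depends on the sampling location $k$ is the scalar $\mathbf{m}_{gk}$ (the "depth-wise" part), whereas $\mathbf{w}_g$ (the "point-wise" part) is the same for every $k$. That is the sense in which the weight is shared among sampling points.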

How is the code cleanup coming along?

Pure Python implementation of DCNv3?

Could you provide a pure Python/PyTorch implementation of DCNv3? It would be easier to migrate and run on a non-CUDA device.

Error encountered in dcnv3

I tried to check whether the model could run, and GPU 1 is free, but an error occurs in the forward function:

output image size in segmentation

Hello,

thanks for the awesome repo! I am trying to adopt this code into a segmentation task for our lab. I needed to strip the model part out and insert it into our script, but I encountered a small issue here.

I started by copying the entire segmentation/mmseg_custom/models/backbones/intern_image.py into the notebook, and initialized the network as shown in your training script with config in upernet_internimage_l_640_160k_ade20k.py

However, when I tried to run inference on a (1, 3, 640, 640) image using the code below

model = dict(
    backbone=dict(
        _delete_=True,
        type="InternImage",
        core_op="DCNv3",
        channels=160,
        depths=[5, 5, 22, 5],
        groups=[10, 20, 40, 80],
        mlp_ratio=4.0,
        drop_path_rate=0.4,
        norm_layer="LN",
        layer_scale=1.0,
        offset_scale=2.0,
        post_norm=True,
        with_cp=False,
        out_indices=(0, 1, 2, 3),
        # init_cfg=dict(type="Pretrained", checkpoint=pretrained),
    ),
    decode_head=dict(num_classes=150, in_channels=[160, 320, 640, 1280]),
    auxiliary_head=dict(num_classes=150, in_channels=640),
    test_cfg=dict(mode="whole"),
)
model['type'] = 'InternImage'

net = build_segmentor(model).cuda()
data = torch.rand(1, 3, 640, 640).cuda()
out = net(data)
[each.shape for each in out]

I got these shapes:

[torch.Size([1, 160, 160, 160]),
 torch.Size([1, 320, 80, 80]),
 torch.Size([1, 640, 40, 40]),
 torch.Size([1, 1280, 20, 20])]

I understand this is because of the stem at the beginning of the network, and that the model is designed to output a result at each resolution scale.

My question is: how do I get a segmented result with the same x and y shape as the input, e.g. 640 by 640?

I couldn't figure out how to build my own decode_head as specified in upernet_internimage_l_640_160k_ade20k.py using the mmseg package.

many thanks,
Michael
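
The shapes reported above follow from plain stride arithmetic, sketched here under the assumption that the four stages run at strides 4, 8, 16 and 32 and that the channel width doubles per stage (both are consistent with the printed shapes). `pyramid_shapes` is a hypothetical helper, not part of the repository.

```python
def pyramid_shapes(height, width, base_channels, strides=(4, 8, 16, 32)):
    """Return (channels, h, w) for each backbone stage.

    Assumes the channel width doubles at every stage and that the
    spatial size is the input size divided by the stage stride.
    """
    shapes = []
    for level, stride in enumerate(strides):
        channels = base_channels * 2 ** level
        shapes.append((channels, height // stride, width // stride))
    return shapes

shapes = pyramid_shapes(640, 640, base_channels=160)
# Upsampling factor needed to bring the finest map back to input size.
factor = 640 // shapes[0][1]
```

These are backbone features, not class predictions. As far as I understand mmseg, the per-pixel result normally comes from the full segmentor (backbone plus decode head), which resizes the logits to the input size before taking the argmax, so building the complete segmentor from the config rather than the backbone alone is probably the simpler route.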

Using InternImage for Object Detection without Segmentation

Hello,

I hope you are doing well. I am working on a project where I would like to use InternImage solely for object detection, without involving segmentation. I attempted to use it with Cascade R-CNN, but I encountered an error during the process.

Here is the error message I received:

2023-03-15 13:39:35,378 - mmdet - INFO - workflow: [('train', 1)], max: 36 epochs
2023-03-15 13:39:35,422 - mmdet - INFO - Checkpoints will be saved to /content/drive/MyDrive/FETP/HealthSit/Phase_02_1/InternImage/detection/work_dirs/mod_cascade_internimage_l_fpn_3x_coco by HardDiskBackend.
Traceback (most recent call last):
  File "/content/drive/MyDrive/FETP/HealthSit/Phase_02_1/InternImage/detection/./train.py", line 247, in <module>
    main()
  File "/content/drive/MyDrive/FETP/HealthSit/Phase_02_1/InternImage/detection/./train.py", line 237, in main
    train_detector(model,
  File "/usr/local/lib/python3.9/dist-packages/mmdet/apis/train.py", line 246, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/usr/local/lib/python3.9/dist-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
    for i, data_batch in enumerate(self.data_loader):
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 1224, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.9/dist-packages/torch/_utils.py", line 457, in reraise
    raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.9/dist-packages/mmdet/datasets/custom.py", line 220, in __getitem__
    data = self.prepare_train_img(idx)
  File "/usr/local/lib/python3.9/dist-packages/mmdet/datasets/custom.py", line 243, in prepare_train_img
    return self.pipeline(results)
  File "/usr/local/lib/python3.9/dist-packages/mmdet/datasets/pipelines/compose.py", line 41, in __call__
    data = t(data)
  File "/usr/local/lib/python3.9/dist-packages/mmdet/datasets/pipelines/loading.py", line 398, in __call__
    results = self._load_masks(results)
  File "/usr/local/lib/python3.9/dist-packages/mmdet/datasets/pipelines/loading.py", line 350, in _load_masks
    [self._poly2mask(mask, h, w) for mask in gt_masks], h, w)
  File "/usr/local/lib/python3.9/dist-packages/mmdet/datasets/pipelines/loading.py", line 350, in <listcomp>
    [self._poly2mask(mask, h, w) for mask in gt_masks], h, w)
  File "/usr/local/lib/python3.9/dist-packages/mmdet/datasets/pipelines/loading.py", line 308, in _poly2mask
    elif isinstance(mask_ann['counts'], list):
TypeError: 'NoneType' object is not subscriptable

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 4737) of binary: /usr/bin/python3
Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.9/dist-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/usr/local/lib/python3.9/dist-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/usr/local/lib/python3.9/dist-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/usr/local/lib/python3.9/dist-packages/torch/distributed/run.py", line 715, in run
    elastic_launch(
  File "/usr/local/lib/python3.9/dist-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.9/dist-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
./train.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-03-15_13:39:43
  host      : 9677141c6259
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 4737)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

From my understanding, it seems that the CascadeRoIHead might require segmentation annotations. I tried using Faster RCNN with InternImage as well but was unsuccessful. I believe that being able to use InternImage for object detection without segmentation could potentially improve performance in certain scenarios.

Could you please provide any guidance or suggestions on how to achieve this? I would really appreciate your help in resolving this issue.

Thank you very much for your time and assistance.

Best regards,
Suppasit Srisaeng
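
The traceback ends in `_load_masks`, which suggests the data pipeline is asking for mask annotations that the dataset does not contain. One possible direction, sketched on plain dicts in the style of an mmdet 2.x config: turn off mask loading and stop collecting `gt_masks`. `make_box_only` and the sample pipeline are hypothetical, nested wrappers such as multi-scale test transforms are not handled, and the model config would also need its mask head removed for box-only training.

```python
import copy

def make_box_only(pipeline):
    """Return a copy of an mmdet-style pipeline with mask loading removed."""
    result = []
    for step in copy.deepcopy(pipeline):
        if step.get("type") == "LoadAnnotations":
            # Do not try to decode polygon/RLE masks.
            step["with_mask"] = False
        if step.get("type") == "Collect":
            # Do not hand gt_masks to the detector.
            step["keys"] = [k for k in step["keys"] if k != "gt_masks"]
        result.append(step)
    return result

# Illustrative pipeline; the real one in the config has more steps.
train_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(type="LoadAnnotations", with_bbox=True, with_mask=True),
    dict(type="DefaultFormatBundle"),
    dict(type="Collect", keys=["img", "gt_bboxes", "gt_labels", "gt_masks"]),
]
box_only = make_box_only(train_pipeline)
```

The original list is left untouched, so the same base config can still be used for instance segmentation.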

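How to try out or test the "image-text retrieval" capability

Hello, among the capabilities released so far, only the image detection, classification, and segmentation tasks are provided.
The introduction also lists an "image-text retrieval" capability. How can I test or try it out?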

_pickle.UnpicklingError: invalid load key, '\xda'.

Hello, thanks for your reply. My last question has been solved, but I have run into another problem. When I tried to load the pretrained model, I got an error: _pickle.UnpicklingError: invalid load key, '\xda'. It seems that the pretrained model file is corrupted, isn't it?

Traceback (most recent call last):
File "/home/zyp/下载/pytorch/InternImage-master/segmentation/mytest.py", line 3, in <module>
model = torch.load(r'checkpoint_dir/upernet_internimage_t_512_160k_ade20k.pth')
File "/home/zyp/anaconda3/envs/pytorch_cp37/lib/python3.7/site-packages/torch/serialization.py", line 713, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/zyp/anaconda3/envs/pytorch_cp37/lib/python3.7/site-packages/torch/serialization.py", line 920, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\xda'.

it's my mistake
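
For anyone hitting the same error: the traceback goes through `_legacy_load`, the path `torch.load` takes when the file is not a zip archive, so a quick look at the file header can help tell a damaged or mis-downloaded file from a real checkpoint. This is a rough stdlib-only sketch; `describe_checkpoint_file` is a hypothetical helper and the categories are heuristics, not an exhaustive format check.

```python
import zipfile

def describe_checkpoint_file(path):
    """Guess what kind of file sits at `path` from its first bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if not head:
        return "empty file"
    if zipfile.is_zipfile(path):
        # Recent torch.save versions write a zip archive by default.
        return "zip archive (format written by recent torch.save)"
    if head[:1] == b"\x80":
        # Pickle protocol 2+ streams start with the PROTO opcode 0x80.
        return "pickle stream (legacy torch.save format)"
    if head.lstrip()[:1] == b"<":
        return "looks like HTML/XML, probably a failed download"
    return f"unknown header {head!r}"
```

Comparing the file size with the size listed on the download page is another cheap check for an interrupted download.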

I can't import DCNv3 in the file dcnv3_func.py. Can you tell me how to compile the operator?

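Problems installing DCNv3

Hello, I ran into the following problems.
1. Running sh ./make.sh directly produces the following error:

2. I looked at make.sh and found that it only runs the following Python command, so I ran that command directly, but then got the following error:

My CUDA path is indeed this one.

Judging from where the error is reported, there seems to be an extra ":" at the front of the path, but I am not familiar with the relevant code and cannot find where it goes wrong.
Is there a good way to fix this?
Thanks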

ImageNet pretrained cfg not found error

Classification training failed: no such file meta_data/train.txt

Training the classification model as described in the docs fails:

Traceback (most recent call last):
File "/home/liuzhe/github/InternImage/classification/main.py", line 661, in <module>
main(config)
File "/home/liuzhe/github/InternImage/classification/main.py", line 170, in main
data_loader_val, data_loader_test, mixup_fn = build_loader(config)
File "/home/liuzhe/github/InternImage/classification/dataset/build.py", line 58, in build_loader
dataset_train, config.MODEL.NUM_CLASSES = build_dataset('train',
File "/home/liuzhe/github/InternImage/classification/dataset/build.py", line 158, in build_dataset
dataset = ImageCephDataset(root,
File "/home/liuzhe/github/InternImage/classification/dataset/cached_image_folder.py", line 310, in __init__
parser = ParserCephImage(root=root,
File "/home/liuzhe/github/InternImage/classification/dataset/cached_image_folder.py", line 383, in __init__
with open(osp.join(annotation_root, f'{split}.txt'), 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'meta_data/train.txt'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 846478) of binary: /home/miniconda3/envs/lz_ray/bin/python

Won't this create a circular reference?

dcnv3.py needs to import DCNv3Function (which includes DCNv3), while dcnv3_func.py needs to import DCNv3. Won't this create a circular reference?

Asking about AI infrastructure details for InternImage

Hi, I couldn't find information about the training infrastructure in the paper. Can you provide the following?

the number of GPUs
the type of GPU
the training framework (PyTorch, TensorFlow, etc.)
the time required for full training

Any chance for the requirements.txt file?

Cannot export to TensorRT

Exporting to ONNX works, but exporting to TensorRT fails as follows:

[03/15/2023-15:00:34] [TRT] [E] ModelImporter.cpp:773: While parsing node number 78 [TRTDCNv3 -> "onnx::MatMul_732"]:
[03/15/2023-15:00:34] [TRT] [E] ModelImporter.cpp:774: --- Begin node ---
[03/15/2023-15:00:34] [TRT] [E] ModelImporter.cpp:775: input: "mmdeploy::TRTDCNv3_685"
input: "mmdeploy::TRTDCNv3_710"
input: "mmdeploy::TRTDCNv3_731"
output: "onnx::MatMul_732"
name: "TRTDCNv3_78"
op_type: "TRTDCNv3"
attribute {
  name: "dilation_h"
  i: 1
  type: INT
}
attribute {
  name: "dilation_w"
  i: 1
  type: INT
}
attribute {
  name: "group_channels"
  i: 16
  type: INT
}
attribute {
  name: "group"
  i: 4
  type: INT
}
attribute {
  name: "im2col_step"
  i: 256
  type: INT
}
attribute {
  name: "kernel_h"
  i: 3
  type: INT
}
attribute {
  name: "kernel_w"
  i: 3
  type: INT
}
attribute {
  name: "offset_scale"
  f: 1
  type: FLOAT
}
attribute {
  name: "pad_h"
  i: 1
  type: INT
}
attribute {
  name: "pad_w"
  i: 1
  type: INT
}
attribute {
  name: "stride_h"
  i: 1
  type: INT
}
attribute {
  name: "stride_w"
  i: 1
  type: INT
}
domain: "mmdeploy"

[03/15/2023-15:00:34] [TRT] [E] ModelImporter.cpp:776: --- End node ---
[03/15/2023-15:00:34] [TRT] [E] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:4870 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
Traceback (most recent call last):
  File "export.py", line 123, in <module>
    main()
  File "export.py", line 118, in main
    onnx2trt(args)
  File "export.py", line 85, in onnx2trt
    max_workspace_size=2**30,
  File "****/mmdeploy/mmdeploy/backend/tensorrt/utils.py", line 177, in from_onnx
    raise RuntimeError(f'Failed to parse onnx, {error_msgs}')
RuntimeError: Failed to parse onnx, In node 78 (importFallbackPluginImporter): UNSUPPORTED_NODE: Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

DCN builds successfully; python test.py gives the following result:

foward time cost: 0.04141324281692505
>>> time cost: im2col_step 256; input torch.Size([512, 64, 64, 64]); points 9 
foward time cost: 0.042035584449768064
>>> time cost: im2col_step 512; input torch.Size([512, 64, 64, 64]); points 9 
foward time cost: 0.042629106044769285

mmdeploy 0.13 was built from source. python tools/check_env.py gives the following result:

2023-03-15 14:57:51,724 - mmdeploy - INFO - 

2023-03-15 14:57:51,725 - mmdeploy - INFO - **********Environmental information**********
2023-03-15 14:57:52,004 - mmdeploy - INFO - sys.platform: linux
2023-03-15 14:57:52,004 - mmdeploy - INFO - Python: 3.7.16 (default, Jan 17 2023, 22:20:44) [GCC 11.2.0]
2023-03-15 14:57:52,004 - mmdeploy - INFO - CUDA available: True
2023-03-15 14:57:52,004 - mmdeploy - INFO - GPU 0: Tesla T4
2023-03-15 14:57:52,004 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2023-03-15 14:57:52,004 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.58
2023-03-15 14:57:52,004 - mmdeploy - INFO - GCC: gcc (GCC) 7.5.0
2023-03-15 14:57:52,004 - mmdeploy - INFO - PyTorch: 1.11.0
2023-03-15 14:57:52,004 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

2023-03-15 14:57:52,004 - mmdeploy - INFO - TorchVision: 0.12.0
2023-03-15 14:57:52,004 - mmdeploy - INFO - OpenCV: 4.5.4
2023-03-15 14:57:52,004 - mmdeploy - INFO - MMCV: 1.5.0
2023-03-15 14:57:52,005 - mmdeploy - INFO - MMCV Compiler: GCC 7.3
2023-03-15 14:57:52,005 - mmdeploy - INFO - MMCV CUDA Compiler: 11.3
2023-03-15 14:57:52,005 - mmdeploy - INFO - MMDeploy: 0.13.0+02d5a09
2023-03-15 14:57:52,005 - mmdeploy - INFO - 

2023-03-15 14:57:52,005 - mmdeploy - INFO - **********Backend information**********
2023-03-15 14:57:52,065 - mmdeploy - INFO - tensorrt:   8.2.4.2
2023-03-15 14:57:52,065 - mmdeploy - INFO - tensorrt custom ops:        Available
2023-03-15 14:57:52,100 - mmdeploy - INFO - ONNXRuntime:        1.14.1
2023-03-15 14:57:52,100 - mmdeploy - INFO - ONNXRuntime-gpu:    None
2023-03-15 14:57:52,100 - mmdeploy - INFO - ONNXRuntime custom ops:     NotAvailable
2023-03-15 14:57:52,100 - mmdeploy - INFO - pplnn:      None
2023-03-15 14:57:52,101 - mmdeploy - INFO - ncnn:       None
2023-03-15 14:57:52,103 - mmdeploy - INFO - snpe:       None
2023-03-15 14:57:52,104 - mmdeploy - INFO - openvino:   2022.3.0
2023-03-15 14:57:52,105 - mmdeploy - INFO - torchscript:        1.11.0
2023-03-15 14:57:52,105 - mmdeploy - INFO - torchscript custom ops:     NotAvailable
2023-03-15 14:57:52,139 - mmdeploy - INFO - rknn-toolkit:       None
2023-03-15 14:57:52,139 - mmdeploy - INFO - rknn2-toolkit:      None
2023-03-15 14:57:52,140 - mmdeploy - INFO - ascend:     None
2023-03-15 14:57:52,140 - mmdeploy - INFO - coreml:     None
2023-03-15 14:57:52,141 - mmdeploy - INFO - tvm:        None
2023-03-15 14:57:52,141 - mmdeploy - INFO - 

2023-03-15 14:57:52,141 - mmdeploy - INFO - **********Codebase information**********
2023-03-15 14:57:52,143 - mmdeploy - INFO - mmdet:      2.20.0
2023-03-15 14:57:52,143 - mmdeploy - INFO - mmseg:      0.30.0
2023-03-15 14:57:52,143 - mmdeploy - INFO - mmcls:      0.23.0
2023-03-15 14:57:52,143 - mmdeploy - INFO - mmocr:      0.4.1
2023-03-15 14:57:52,143 - mmdeploy - INFO - mmedit:     0.16.1
2023-03-15 14:57:52,143 - mmdeploy - INFO - mmdet3d:    None
2023-03-15 14:57:52,143 - mmdeploy - INFO - mmpose:     0.25.1
2023-03-15 14:57:52,143 - mmdeploy - INFO - mmrotate:   None
2023-03-15 14:57:52,143 - mmdeploy - INFO - mmaction:   None

Has the detection inference code been released?

When will InternImage-H and the best COCO detection model be released?

cuda error when running dcnv3

What does this error mean, and how can I solve this problem? Thanks!

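Training results differ from those reported in the paper

Dear authors, hello!
I tried running your code on my university's server, using InternImage-T | Mask R-CNN | 1x with two A100s. I trained for 12 epochs starting from the pretrained model and did not change any parameters in the config file, but the final results differ considerably from the experimental results in the paper.
Your results: box mAP 47.2 | mask mAP 42.5
My output:

What could be the reason for this?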

tensorrt and FPS

Hi all, quite impressed with your great work!
I noticed that the Main Results table reports FPS. Could you provide the code used for the time measurement?
Also, do you plan to upload models for edge applications (TensorRT, ONNX, ...)?
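
Until the authors share their protocol, a generic throughput harness along these lines could be used for rough comparisons. It is a sketch, not the measurement code behind the paper's FPS table; `measure_throughput` and its parameters are hypothetical names.

```python
import time

def measure_throughput(fn, batch_size=1, warmup=10, iters=50, sync=None):
    """Return items processed per second for repeated calls to `fn`.

    `sync` is an optional callable that blocks until queued work is done;
    asynchronous backends such as CUDA need it for meaningful timings.
    """
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the timing
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start
    return batch_size * iters / max(elapsed, 1e-12)
```

With PyTorch on a GPU this would be called roughly as `measure_throughput(lambda: model(x), batch_size=x.shape[0], sync=torch.cuda.synchronize)`, with the model in eval mode and gradients disabled.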

Interest in adapting InternImage for 3D segmentation

Hi, I am very interested in your work! I am wondering if it'd be possible to adapt this backbone for 3D segmentation tasks? Any advice would be great. Thank you!

About hardware

For InternImage-XL and InternImage-H, how many A100s did you use?
And how long does it take to complete the full pre-training on the large-scale joint dataset (e.g. how long with 8 × A100 or 32 × A100)?

Also, how much RAM and hard drive storage is required to handle such a large dataset?

Thanks

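When will the InternImage-H object detection model weights be released?

I want to reproduce some results but do not have enough GPU memory

Hello, dear authors. I currently have eight 3090 GPUs, but each has only 22 GB of memory, and I found that the parameters of InternImage-H far exceed the memory of a single card. How should I set the parameters in order to reproduce your results?

How is the code cleanup coming along?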

Check point file

I get a "could not find checkpoint" error when running test.py in the detection folder as described in README.md:
For example, to evaluate the InternImage-T with a single GPU:

python test.py configs/mask_rcnn/mask_rcnn_internimage_t_fpn_1x_coco.py checkpoint_dir/det/mask_rcnn_internimage_t_fpn_1x_coco.pth --eval bbox segm
Error: 
 File "test.py", line 208, in main
    checkpoint = load_checkpoint(model, args.checkpoint, map_location='cpu')
  File "/home/huyen/anaconda3/envs/internimage/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 581, in load_checkpoint
    checkpoint = _load_checkpoint(filename, map_location, logger)
  File "/home/huyen/anaconda3/envs/internimage/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 520, in _load_checkpoint
    return CheckpointLoader.load_checkpoint(filename, map_location, logger)
  File "/home/huyen/anaconda3/envs/internimage/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 285, in load_checkpoint
    return checkpoint_loader(filename, map_location)
  File "/home/huyen/anaconda3/envs/internimage/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 301, in load_from_local
    raise FileNotFoundError(f'{filename} can not be found.')
FileNotFoundError: checkpoint_dir/det/mask_rcnn_internimage_t_fpn_1x_coco.py can not be found.
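
Note that the path in the error message ends in .py, while the README command passes a .pth checkpoint, so the checkpoint argument that was actually run may have been mistyped. A small pre-flight check like the following sketch can surface that before the model is built; `check_test_args` is a hypothetical helper and the accepted suffixes are an assumption.

```python
from pathlib import Path

def check_test_args(config, checkpoint):
    """Return a list of human-readable problems with the two CLI paths."""
    problems = []
    if Path(config).suffix != ".py":
        problems.append(f"config should be a .py file: {config}")
    ckpt = Path(checkpoint)
    if ckpt.suffix not in {".pth", ".pt"}:
        problems.append(f"checkpoint should be a .pth file: {checkpoint}")
    if not ckpt.is_file():
        problems.append(f"checkpoint not found on disk: {checkpoint}")
    return problems
```

An empty list means both paths look plausible; the checkpoint itself still has to be downloaded into checkpoint_dir/det beforehand.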

Problem installing DCNv3

Asking for help: how can I fix this problem when installing DCNv3 on Windows?

error: ATen/OpMathType.h: No such file or directory

export.py for segmentation model

Hello! I'm attempting to build an ONNX model for segmentation inference, and I noticed there's an export.py in the classification folder, but not in the segmentation folder...

Is this a possibility in the future? Or a direct release of an ONNX model?

Thank you!

Code Availability

Could you kindly tell me when the code will be available?

Running code with cpu instead of cuda

I get an error when installing CUDA to run the code. Could I run it on the CPU instead? I found the cpu and cuda folders inside the src folder, but I am not sure how to switch to running the code on the CPU. Thanks so much for your support.

Occluded object detection

How well can the model detect occluded objects from different categories that lie on top of each other?

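Problems exporting an ONNX model and deploying it with C++/ONNXRuntime

I prefer deploying with ORT because it makes it quick to switch between CUDA/TensorRT/DML/OpenVINO as the inference backend. These are some problems I ran into during my own deployment; a solution would be most welcome.

Taking the internimage_t_1k_224 classification model as an example, exporting to ONNX as described in the tutorial first produces a large number of warnings:
WARNING: The shape inference of mmdeploy::TRTDCNv3 type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
The export still succeeds in the end, though.

The export command is the one from the README:
python export.py --model_name internimage_t_1k_224 --ckpt_dir /path/to/ckpt/dir --onnx

But when deploying with C++/ORT, the following error appears:
Fatal error: mmdeploy:TRTDCNv3(-1) is not a registered function/op
This is presumably because DCNv3 is a custom operator that has not been implemented in the C++ format of ONNXRuntime custom operators.

Following issue #41, I changed CORE_OP: 'DCNv3' to CORE_OP: 'DCNv3_pytorch' in "./classification/configs/internimage_t_1k_224.yaml" (I am not sure whether this is allowed 😥) in order to use the pure-PyTorch DCNv3 implementation, but the export still emits many warnings and then fails: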

Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Traceback (most recent call last):
  File ".\export.py", line 122, in <module>
    main()
  File ".\export.py", line 113, in main
    torch2onnx(args, cfg)
  File ".\export.py", line 61, in torch2onnx
    torch.onnx.export(model,
  File "D:\Python\Python38\lib\site-packages\torch\onnx\__init__.py", line 350, in export
    return utils.export(
  File "D:\Python\Python38\lib\site-packages\torch\onnx\utils.py", line 163, in export
    _export(
  File "D:\Python\Python38\lib\site-packages\torch\onnx\utils.py", line 1110, in _export
    ) = graph._export_onnx(  # type: ignore[attr-defined]
RuntimeError: Could not allocate bytes object!

About segmentation inference

Thanks for your great work! Is the inference code on segmentation task available? Thanks!

[Error] Inference with onnxruntime

Hi, thanks for sharing this excellent work. I'm trying to use InternImage in ONNX format. When I export the model from PyTorch to ONNX, the following warning appears: WARNING: The shape inference of mmdeploy::TRTDCNv3 type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.

Besides, when I run inference on the exported ONNX model with onnxruntime, the following error occurs: onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from /data/123/internimage_t_1k_224.onnx failed:Fatal error: mmdeploy:TRTDCNv3(-1) is not a registered function/op. Can you give me some advice on solving this problem?

eval

Hello, I use InternImage as my model backbone. Why does the training loss drop normally, while the mAP on the validation set is all 0?

The dcn_v3 module cannot be installed

Hi! First of all, thank you for your work. The following error occurs when I compile the CUDA operator:

Looking forward to your reply! Thanks!

`CUBLAS_STATUS_INTERNAL_ERROR` on training segmentation

Hi there, firstly thank you very much for your work. Upon trying to use your backbone to train a segmentation model, I run into a CUBLAS_STATUS_INTERNAL_ERROR:

2023-03-10 22:05:40,534 - mmseg - INFO - workflow: [('train', 1)], max: 160000 iters
2023-03-10 22:05:40,534 - mmseg - INFO - Checkpoints will be saved to mmsegmentation/work_dirs/internimage_base_512 by HardDiskBackend.
2023-03-10 22:05:46,860 - mmseg - INFO - Iter [20/160000]       lr: 7.600e-07, eta: 13:43:03, time: 0.309, data_time: 0.014, memory: 6998, decode.loss_ce: nan, decode.acc_seg: 7.1505, aux.loss_ce: nan, aux.acc_seg: 7.1649, loss: nan
Traceback (most recent call last):
  File "mmsegmentation/train.py", line 162, in <module>
    train_segmentor(model, datasets, cfg, distributed=False, validate=True, 
  File "mmsegmentation/mmseg/apis/train.py", line 194, in train_segmentor
    runner.run(data_loaders, cfg.workflow)
  File ".conda/envs/mmlab/lib/python3.9/site-packages/mmcv/runner/iter_based_runner.py", line 138, in run
    iter_runner(iter_loaders[i], **kwargs)
  File ".conda/envs/mmlab/lib/python3.9/site-packages/mmcv/runner/iter_based_runner.py", line 62, in train
    outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
  File ".conda/envs/mmlab/lib/python3.9/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "mmsegmentation/mmseg/models/segmentors/base.py", line 138, in train_step
    losses = self(**data_batch)
  File ".conda/envs/mmlab/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File ".conda/envs/mmlab/lib/python3.9/site-packages/mmcv/runner/fp16_utils.py", line 116, in new_func
    return old_func(*args, **kwargs)
  File "mmsegmentation/mmseg/models/segmentors/base.py", line 108, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 139, in forward_train
    x = self.extract_feat(img)
  File "mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 65, in extract_feat
    x = self.backbone(img)
  File ".conda/envs/mmlab/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "mmsegmentation/mmseg/models/backbones/intern_image.py", line 479, in forward
    x, x_ = level(x, return_wo_downsample=True)
  File ".conda/envs/mmlab/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "mmsegmentation/mmseg/models/backbones/intern_image.py", line 316, in forward
    x = blk(x)
  File ".conda/envs/mmlab/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "mmsegmentation/mmseg/models/backbones/intern_image.py", line 252, in forward
    x = _inner_forward(x)
  File "mmsegmentation/mmseg/models/backbones/intern_image.py", line 242, in _inner_forward
    x = x + self.drop_path(self.gamma1 * self.norm1(self.dcn(x)))
  File ".conda/envs/mmlab/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "mmsegmentation/ops_dcnv3/modules/dcnv3.py", line 276, in forward
    x = self.output_proj(x)
  File ".conda/envs/mmlab/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File ".conda/envs/mmlab/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

I compiled DCNv3 and test.py runs without an error.
CUBLAS_STATUS_INTERNAL_ERROR does not occur with other native mmsegmentation configs/backbones.

Do you know what could be the cause of this issue?
Thank you very much!

CUDA 11.3
pytorch 1.11.0
cudnn 8.2.0
torchvision 0.12.0

All conda packages:

# Name                    Version                   Build  Channel
addict                    2.4.0                    pypi_0    pypi
blas                      1.0                         mkl  
brotlipy                  0.7.0           py39h27cfd23_1003  
bzip2                     1.0.8                h7b6447c_0  
ca-certificates           2022.4.26            h06a4308_0  
certifi                   2022.6.15        py39h06a4308_0  
cffi                      1.15.0           py39hd667e15_1  
charset-normalizer        2.0.12                   pypi_0    pypi
click                     7.1.2                    pypi_0    pypi
colorama                  0.4.5                    pypi_0    pypi
cryptography              37.0.1           py39h9ce1e76_0  
cudatoolkit               11.3.1               h2bc3f7f_2  
cycler                    0.11.0                   pypi_0    pypi
dcnv3                     1.0                      pypi_0    pypi
ffmpeg                    4.3                  hf484d3e_0    pytorch
filelock                  3.9.0                    pypi_0    pypi
fonttools                 4.33.3                   pypi_0    pypi
freetype                  2.11.0               h70c0345_0  
giflib                    5.2.1                h7b6447c_0  
gmp                       6.2.1                h295c915_3  
gnutls                    3.6.15               he1e5248_0  
huggingface-hub           0.13.1                   pypi_0    pypi
idna                      3.3                pyhd3eb1b0_0  
importlib-metadata        4.11.4                   pypi_0    pypi
intel-openmp              2021.4.0          h06a4308_3561  
jpeg                      9e                   h7f8727e_0  
kiwisolver                1.4.3                    pypi_0    pypi
lame                      3.100                h7b6447c_0  
lcms2                     2.12                 h3be6417_0  
ld_impl_linux-64          2.38                 h1181459_1  
libffi                    3.3                  he6710b0_2  
libgcc-ng                 11.2.0               h1234567_1  
libiconv                  1.16                 h7f8727e_2  
libidn2                   2.3.2                h7f8727e_0  
libpng                    1.6.37               hbc83047_0  
libstdcxx-ng              11.2.0               h1234567_1  
libtasn1                  4.16.0               h27cfd23_0  
libtiff                   4.2.0                h2818925_1  
libunistring              0.9.10               h27cfd23_0  
libuv                     1.40.0               h7b6447c_0  
libwebp                   1.2.2                h55f646e_0  
libwebp-base              1.2.2                h7f8727e_0  
lz4-c                     1.9.3                h295c915_1  
markdown                  3.3.7                    pypi_0    pypi
matplotlib                3.5.2                    pypi_0    pypi
mkl                       2021.4.0           h06a4308_640  
mkl-service               2.4.0            py39h7f8727e_0  
mkl_fft                   1.3.1            py39hd3c417c_0  
mkl_random                1.2.2            py39h51133e4_0  
mmcls                     0.23.1                   pypi_0    pypi
mmcv-full                 1.5.3                    pypi_0    pypi
mmdet                     2.28.1                   pypi_0    pypi
mmsegmentation            0.25.0                    dev_0    <develop>
model-index               0.1.11                   pypi_0    pypi
ncurses                   6.3                  h7f8727e_2  
nettle                    3.7.3                hbbd107a_1  
numpy                     1.23.0                   pypi_0    pypi
numpy-base                1.22.3           py39hf524024_0  
opencv-python             4.6.0.66                 pypi_0    pypi
openh264                  2.1.1                h4ff587b_0  
openmim                   0.1.6                    pypi_0    pypi
openssl                   1.1.1o               h7f8727e_0  
ordered-set               4.1.0                    pypi_0    pypi
packaging                 21.3                     pypi_0    pypi
pandas                    1.4.3                    pypi_0    pypi
pillow                    9.1.1                    pypi_0    pypi
pip                       21.2.4           py39h06a4308_0  
prettytable               3.3.0                    pypi_0    pypi
pycocotools               2.0.6                    pypi_0    pypi
pycparser                 2.21               pyhd3eb1b0_0  
pyopenssl                 22.0.0             pyhd3eb1b0_0  
pyparsing                 3.0.9                    pypi_0    pypi
pysocks                   1.7.1            py39h06a4308_0  
python                    3.9.12               h12debd9_1  
python-dateutil           2.8.2                    pypi_0    pypi
pytorch                   1.11.0          py3.9_cuda11.3_cudnn8.2.0_0    pytorch
pytorch-mutex             1.0                        cuda    pytorch
pytz                      2022.1                   pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
readline                  8.1.2                h7f8727e_1  
requests                  2.28.0                   pypi_0    pypi
scipy                     1.10.1                   pypi_0    pypi
setuptools                61.2.0           py39h06a4308_0  
six                       1.16.0             pyhd3eb1b0_1  
sqlite                    3.38.5               hc218d9a_0  
tabulate                  0.8.10                   pypi_0    pypi
termcolor                 2.2.0                    pypi_0    pypi
terminaltables            3.1.10                   pypi_0    pypi
timm                      0.6.11                   pypi_0    pypi
tk                        8.6.12               h1ccaba5_0  
torchaudio                0.11.0               py39_cu113    pytorch
torchvision               0.12.0               py39_cu113    pytorch
tqdm                      4.65.0                   pypi_0    pypi
typing-extensions         4.2.0                    pypi_0    pypi
typing_extensions         4.1.1              pyh06a4308_0  
tzdata                    2022a                hda174b7_0  
urllib3                   1.26.9           py39h06a4308_0  
wcwidth                   0.2.5                    pypi_0    pypi
wheel                     0.37.1             pyhd3eb1b0_0  
xz                        5.2.5                h7f8727e_1  
yacs                      0.1.8                    pypi_0    pypi
yapf                      0.32.0                   pypi_0    pypi
zipp                      3.8.0                    pypi_0    pypi
zlib                      1.2.12               h7f8727e_2  
zstd                      1.5.2                ha4553b6_0
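
One quick sanity check for the environment above is whether the CUDA version baked into the pytorch build matches the installed cudatoolkit. The sketch below parses a conda build string of the shape shown in the pytorch row of the package list; the helper names `parse_pytorch_build` and `same_major_minor` are hypothetical, and a match only rules out a version mismatch, it does not explain the CUBLAS error.

```python
import re


def parse_pytorch_build(build: str) -> dict:
    """Extract python/cuda/cudnn versions from a conda build string.

    Assumes the 'py<ver>_cuda<ver>_cudnn<ver>_<n>' layout seen in the list above.
    """
    pattern = r"py(?P<python>[\d.]+)_cuda(?P<cuda>[\d.]+)_cudnn(?P<cudnn>[\d.]+)_\d+"
    match = re.fullmatch(pattern, build)
    if match is None:
        raise ValueError(f"unrecognized build string: {build!r}")
    return match.groupdict()


def same_major_minor(a: str, b: str) -> bool:
    """Compare two dotted versions on their first two components only."""
    return a.split(".")[:2] == b.split(".")[:2]


# Values copied from the pytorch and cudatoolkit rows of the package list.
info = parse_pytorch_build("py3.9_cuda11.3_cudnn8.2.0_0")
toolkit_matches = same_major_minor(info["cuda"], "11.3.1")
```

For the list above the two CUDA versions agree on major and minor, so the cause is more likely elsewhere.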

Potential error in the segmentation config (base model)?

Hi Authors,

Thanks for your great work! I noticed that the num_layers in the config of the base model, upernet_internimage_b_512_160k_ade20k.py, is inconsistent with the depths in upernet_internimage_b_512_160k_ade20k.py. However, these two parameters are consistent in the other variants (e.g., tiny, small, large). Is this a specific configuration of the base model, or is it a human error?
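
A check like the one described could be sketched as follows. It assumes that "consistent" means num_layers equals the sum of the backbone depths, which is an interpretation and not something the configs state; the function name `check_layer_count` and the numbers in the usage lines are hypothetical and are not copied from any config file.

```python
def check_layer_count(depths, num_layers):
    """Return (is_consistent, expected), where expected is sum(depths).

    Assumption: the layer-wise setting should count every backbone block once.
    """
    expected = sum(depths)
    return num_layers == expected, expected


# Illustrative values only, not taken from the repository configs.
ok, expected = check_layer_count([4, 4, 18, 4], num_layers=30)
mismatch, expected_other = check_layer_count([4, 4, 21, 4], num_layers=30)
```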

Hello, when will the code be released?

Unable to run

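I tried to debug train.py in the segmentation folder on Windows, but ran into the following problem.

Stepping through the code, I found that the bug is triggered when trying to import mmcv, even though I had already installed that library as described in the README.
At the point of failure I tried decoding the string as GBK instead, and it parsed successfully.

I searched the whole disk for that string and found it in cpp_extension.py under the install directory Python\Python39\Lib\site-packages\torch\utils.

I am not sure whether this is an environment problem, or whether it can be fixed on Windows.
Thanks.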
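
The observation that the string decodes as GBK suggests a bytes value produced by a localized Windows tool being decoded with the wrong codec. A minimal sketch of a tolerant decode, assuming the failing value is a bytes object, is shown below; `decode_with_fallback` is a hypothetical helper and not part of torch or mmcv.

```python
def decode_with_fallback(raw: bytes, encodings=("utf-8", "gbk")) -> str:
    """Try each encoding in order and return the first successful decode."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Nothing matched: keep going with replacement characters instead of raising.
    return raw.decode(encodings[0], errors="replace")


# Bytes as a GBK-localized tool might emit them.
gbk_bytes = "用法".encode("gbk")
text = decode_with_fallback(gbk_bytes)
```

Whether patching an installed file is acceptable depends on the setup; switching the console code page or the system locale are other options that might avoid the problem.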

TypeError: forward() missing 1 required positional argument: 'im2col_step'

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import torch
import torch.nn.functional as F
from torch.autograd import Function
from torch.autograd.function import once_differentiable
from torch.cuda.amp import custom_bwd, custom_fwd
from ops_dcnv3.modules import dcnv3

class DCNv3Function(Function):
    @staticmethod
    @custom_fwd
    def forward(
            ctx, input, offset, mask,
            kernel_h, kernel_w, stride_h, stride_w,
            pad_h, pad_w, dilation_h, dilation_w,
            group, group_channels, offset_scale, im2col_step):
        # Keep the convolution hyperparameters on ctx for the backward pass.
        ctx.kernel_h = kernel_h
        ctx.kernel_w = kernel_w
        ctx.stride_h = stride_h
        ctx.stride_w = stride_w
        ctx.pad_h = pad_h
        ctx.pad_w = pad_w
        ctx.dilation_h = dilation_h
        ctx.dilation_w = dilation_w
        ctx.group = group
        ctx.group_channels = group_channels
        ctx.offset_scale = offset_scale
        ctx.im2col_step = im2col_step
        # NOTE: this calls the Python-level forward() itself with 15 arguments
        # and no ctx, so `input` is bound to ctx and im2col_step is left
        # unbound, which matches the TypeError in the title.
        output = DCNv3Function.forward(
            input, offset, mask, kernel_h, kernel_w, stride_h, stride_w,
            pad_h, pad_w, dilation_h, dilation_w,
            group, group_channels, offset_scale, im2col_step)
        ctx.save_for_backward(input, offset, mask)

        return output
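
The error in the title can be reproduced without torch. The sketch below uses a hypothetical class `Op` whose static `forward` expects a context object first, then calls it with the context left out, which is the same shape as the call inside the code above; it illustrates the argument shift only and is not the DCNv3 operator.

```python
class Op:
    @staticmethod
    def forward(ctx, x, step):
        # ctx stands in for the autograd context that normally comes first.
        return x * step


def call_without_ctx():
    """Call forward() one argument short and return the error text."""
    try:
        Op.forward(2, 3)  # 2 is bound to ctx, 3 to x, nothing is left for step
    except TypeError as exc:
        return str(exc)
    return None


message = call_without_ctx()
```

The reported name is always the last parameter, because every passed value shifts one slot to the left.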

No module named 'DCNv3'

ModuleNotFoundError: No module named 'DCNv3'
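
Before debugging further, it may help to confirm whether the compiled extension is visible to the interpreter that runs training. The sketch below uses only the standard library; `extension_available` is a hypothetical helper, and the assumption is that building the operator under ops_dcnv3 is what installs a top-level module named DCNv3.

```python
import importlib.util


def extension_available(name: str = "DCNv3") -> bool:
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(name) is not None


# False here usually means the extension was not built, or was built for a
# different environment than the one running this check.
dcnv3_found = extension_available()
```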

How to use the pre-trained or self-trained detection model on a COCO image?

Is there an example of using the detection model?
E.g., how can it be encapsulated as a function?
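
One possible shape for such a wrapper, assuming an mmdet 2.x install, is sketched below. `init_detector` and `inference_detector` are the standard mmdet 2.x entry points; the function names `load_detector`, `filter_detections` and `detect` are hypothetical, the config and checkpoint paths are placeholders, and registering the InternImage backbone (presumably by importing the repository's own detection modules first) is left out here.

```python
def load_detector(config_path, checkpoint_path, device="cuda:0"):
    """Build a detector; mmdet is imported lazily so this file loads without it."""
    from mmdet.apis import init_detector
    return init_detector(config_path, checkpoint_path, device=device)


def filter_detections(bbox_results, score_thr=0.3):
    """Flatten per-class rows of [x1, y1, x2, y2, score] into a list of tuples."""
    detections = []
    for class_id, boxes in enumerate(bbox_results):
        for x1, y1, x2, y2, score in boxes:
            if score >= score_thr:
                box = (float(x1), float(y1), float(x2), float(y2))
                detections.append((class_id, float(score), box))
    return detections


def detect(model, image_path, score_thr=0.3):
    """Run inference on one image and return (class_id, score, box) tuples."""
    from mmdet.apis import inference_detector
    result = inference_detector(model, image_path)
    # Mask models in mmdet 2.x return (bbox_results, segm_results).
    bbox_results = result[0] if isinstance(result, tuple) else result
    return filter_detections(bbox_results, score_thr)
```

Usage would be `model = load_detector(cfg, ckpt)` followed by `detect(model, "image.jpg")`, with class ids following the order of the dataset's class list.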