megvii-research / revcol

Official Code of Paper "Reversible Column Networks" "RevColv2"

License: Apache License 2.0

Python 99.70% Shell 0.30%

cnn computer-vision pytorch transformer iclr2023 mae vit

revcol's Introduction

Reversible Column Networks

This repo is the official implementation of:

Reversible Column Networks

Yuxuan Cai, Yizhuang Zhou, Qi Han, Jianjian Sun, Xiangwen Kong, Jun Li, Xiangyu Zhang
MEGVII Technology
International Conference on Learning Representations (ICLR) 2023
[arxiv]

RevColV2: Exploring Disentangled Representations in Masked Image Modeling

Qi Han, Yuxuan Cai, Xiangyu Zhang
MEGVII Technology
[arxiv]

Updates

9/06/2023
RevColv2 will be released soon!

3/15/2023
RevCol Huge checkpoint for segmentation released! Add visualization tools.

3/9/2023
Detection, Segmentation Code and Model Weights Released.

2/10/2023
RevCol model weights released.

1/21/2023
RevCol was accepted by ICLR 2023!

12/23/2022
Initial commits: codes for ImageNet-1k and ImageNet-22k classification are released.

To Do List

ImageNet-1K and 22k Training Code
ImageNet-1K and 22k Model Weights
Cascade Mask R-CNN COCO Object Detection Code & Model Weights
ADE20k Semantic Segmentation Code & Model Weights

Introduction

RevCol is composed of multiple copies of a subnetwork, called columns, connected by multi-level reversible connections. RevCol can serve as a foundation-model backbone for various computer vision tasks, including classification, detection, and segmentation.
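
To make "reversible" concrete, here is a toy scalar sketch of an additive reversible connection, one common form of reversible update. It is an illustration under that assumption, not the repo's implementation: `forward_step`, `inverse_step`, `gamma`, and the use of `math.tanh` as the transformation are all hypothetical stand-ins.

```python
import math


def forward_step(x_prev, x_old, f, gamma):
    """Additive reversible update: combine f(latest state) with a scaled older state."""
    return f(x_prev) + gamma * x_old


def inverse_step(x_new, x_prev, f, gamma):
    """Recover the older state from the new one, so it does not have to be stored."""
    return (x_new - f(x_prev)) / gamma


f = math.tanh                      # stand-in for a learned transformation
x_old, x_prev, gamma = 1.5, -0.25, 0.5

x_new = forward_step(x_prev, x_old, f, gamma)
recovered = inverse_step(x_new, x_prev, f, gamma)

print(math.isclose(recovered, x_old))
```

Because the older state can be recomputed from later ones, intermediate activations need not be kept in memory during training, which is the usual motivation for reversible designs.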

Main Results on ImageNet with Pre-trained Models

name	pretrain	resolution	#params	FLOPs	acc@1	pretrained model	finetuned model
RevCol-T	ImageNet-1K	224x224	30M	4.5G	82.2	baidu/github	-
RevCol-S	ImageNet-1K	224x224	60M	9.0G	83.5	baidu/github	-
RevCol-B	ImageNet-1K	224x224	138M	16.6G	84.1	baidu/github	-
RevCol-B^*	ImageNet-22K	224x224	138M	16.6G	85.6	baidu/github	baidu/github
RevCol-B^*	ImageNet-22K	384x384	138M	48.9G	86.7	baidu/github	baidu/github
RevCol-L^*	ImageNet-22K	224x224	273M	39G	86.6	baidu/github	baidu/github
RevCol-L^*	ImageNet-22K	384x384	273M	116G	87.6	baidu/github	baidu/github
RevCol-H^*+	Megdata-168M	pretrain 224 / finetune 640	2.1B	2537	90.0	huggingface	huggingface

[+]: Note that RevCol-H uses a slightly different model, with one more branch from the bottom level to the top one. Later experiments showed that this connection is unnecessary; however, considering RevCol-H's training cost, we did not retrain it.

Getting Started

Please refer to INSTRUCTIONS.md for setting up, training and evaluation details.

Acknowledgement

This repo was inspired by several open source projects. We are grateful for these excellent projects and list them as follows:

License

RevCol is released under the Apache 2.0 license.

Contact Us

If you have any questions about this repo or the original paper, please contact Yuxuan at [email protected].

Citation

@inproceedings{cai2022reversible,
  title={Reversible Column Networks},
  author={Cai, Yuxuan and Zhou, Yizhuang and Han, Qi and Sun, Jianjian and Kong, Xiangwen and Li, Jun and Zhang, Xiangyu},
  booktitle={International Conference on Learning Representations},
  year={2023},
  url={https://openreview.net/forum?id=Oc2vlWU0jFY}
}

revcol's People

mldl dl-cnn gazelei jacktesla onepiec1 zhengfd123 sunmingyang1987 fourthm ssmae99 rottk

revcol's Issues

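Question about the semantic segmentation part

Hello, I have read your paper. The semantic segmentation part includes multi-scale experiments, but if the input image size is certain odd numbers, such as 229 or 223, the upsampled and downsampled feature maps no longer match in x = self.up(c_up) + self.down(c_down). I went through the data processing part repeatedly (the ADE config file in mmsegmentation) and could not find any handling for this error, so I would like to ask how the authors deal with it.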
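
The mismatch can be reproduced with plain size arithmetic. The sketch below assumes a stride-4 stem followed by three stride-2 downsampling steps (each flooring the size) and 2x upsampling between adjacent levels; these strides and the helper names `level_sizes` and `fusion_mismatches` are assumptions for illustration, not taken from the repo.

```python
def level_sizes(size, stem_stride=4, num_levels=4):
    """Spatial size per level, assuming a stride-4 stem and floor-halving between levels."""
    sizes = [size // stem_stride]
    for _ in range(num_levels - 1):
        sizes.append(sizes[-1] // 2)
    return sizes


def fusion_mismatches(size):
    """List (level, upsampled_size, expected_size) where 2x upsampling misses the target."""
    sizes = level_sizes(size)
    mismatches = []
    for level in range(len(sizes) - 1):
        upsampled = sizes[level + 1] * 2  # 2x upsampling of the coarser level
        if upsampled != sizes[level]:
            mismatches.append((level, upsampled, sizes[level]))
    return mismatches


for size in (224, 229, 223):
    print(size, level_sizes(size), fusion_mismatches(size))
```

Under these assumptions, 224 produces no mismatch, while 229 and 223 do, because flooring at one level loses a pixel that doubling cannot restore.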

TypeError: main() takes 1 positional argument but 3 were given

When I run the following command:

python main.py  --cfg configs/revcol_base_1k_384_finetune.yaml --batch-size 4 \
    --data-path ../data/classification --finetune revcol_base_22k.pth

It outputs this:

=> merge config from configs/revcol_base_1k_384_finetune.yaml
Traceback (most recent call last):
  File "main.py", line 422, in <module>
    main(None, config, ngpus_per_node)
TypeError: main() takes 1 positional argument but 3 were given

When will COCO model be released?

Hi! Thanks for the great codebase!

Do you have any timeframe for the release of the COCO object detection model?

code issue

https://github.com/megvii-research/RevCol/blob/2c36531d52b985dee43f68d0c5a9735a9cba5bf5/models/modules.py#LL27C1-L27C38

    x = x = x.permute(0, 3, 1, 2)

I do not know why?
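
For what it is worth, the doubled `x = x =` is valid Python: it is a chained assignment that evaluates the right-hand side once and binds the result to the same name twice, so it behaves like a single `x = ...`. The sketch below illustrates this with a shape tuple instead of a tensor; `permute` here is a hypothetical helper that mimics how `Tensor.permute` reorders dimensions, and the shape values are made up.

```python
def permute(shape, order):
    """Reorder a shape tuple the way Tensor.permute reorders dimensions."""
    return tuple(shape[i] for i in order)


nhwc = (8, 56, 56, 64)  # (batch, height, width, channels), example values only

x = nhwc
x = x = permute(x, (0, 3, 1, 2))  # chained assignment, right-hand side evaluated once

y = permute(nhwc, (0, 3, 1, 2))   # the plain single assignment

print(x, x == y)
```

The order (0, 3, 1, 2) moves the last dimension to position 1, that is, channels-last to channels-first.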

How to export onnx model in save_memory=True?

We are trying to convert RevCol to TensorRT format, but when converting to ONNX, we found that the conversion does not work properly with save_memory=True.
Here is our conversion test code:

import torch
from models.revcol import *
model = revcol_tiny(save_memory=True, inter_supv=False, drop_path = 0.1, num_classes=10, kernel_size = 3)

for i in range(model.num_subnet):
    getattr(model, f'subnet{str(i)}').save_memory = False

x = torch.zeros(1, 3, 224, 224)
torch.onnx.export(model, x, './weights/revcol_tiny.onnx', verbose=False, opset_version=17,
                        training=torch.onnx.TrainingMode.EVAL,
                        do_constant_folding=True,
                        input_names=['images'],
                        output_names=['output'],
                        dynamic_axes=None)

When save_memory=True, the following error occurs:

File [d:\SoftWare\anaconda3\envs\torch\lib\site-packages\torch\onnx\utils.py:506](file:///D:/SoftWare/anaconda3/envs/torch/lib/site-packages/torch/onnx/utils.py:506), in export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, opset_version, do_constant_folding, dynamic_axes, keep_initializers_as_inputs, custom_opsets, export_modules_as_functions)
    188 @_beartype.beartype
    189 def export(
    190     model: Union[torch.nn.Module, torch.jit.ScriptModule, torch.jit.ScriptFunction],
   (...)
    206     export_modules_as_functions: Union[bool, Collection[Type[torch.nn.Module]]] = False,
    207 ) -> None:
    208     r"""Exports a model into ONNX format.
    209 
    210     If ``model`` is not a :class:`torch.jit.ScriptModule` nor a
   (...)
    503             All errors are subclasses of :class:`errors.OnnxExporterError`.
...
    511         '(vmap, grad, jvp, jacrev, ...), it must override the setup_context '
    512         'staticmethod. For more details, please see '
    513         'https://pytorch.org/docs/master/notes/extending.func.html')

RuntimeError: invalid unordered_map<K, T> key

If you add the following code, the export works, but you can no longer take advantage of the low memory footprint of the reversible network.

for i in range(model.num_subnet):
    getattr(model, f'subnet{str(i)}').save_memory = False

Is there any relevant solution?

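Hello, what is the full name of STEM in the model architecture?

Hello, what does STEM in the model architecture stand for, and what does it mean? Thanks.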

Best checkpoint saving problem

At L204 in main.py, the checkpoint is saved, and the best checkpoint is checked for, only when training hits a SAVE_FREQ epoch. The problem is that if the model reaches its best accuracy on an epoch outside SAVE_FREQ, the best weights are not saved.

Since most of our team members run the code on relatively small datasets, we tend to set a large SAVE_FREQ to save hard drive space. As a result, training always misses the chance to save best.pth.

Is this intended behavior? Could you please add an argument to control whether best.pth is saved every time the model reaches its highest accuracy, even if that is not on a SAVE_FREQ epoch?
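
The requested behavior could look roughly like the sketch below, which only plans which files would be written per epoch. It is a hypothetical illustration, not the repo's main.py: the function `checkpoints_to_save`, the flag `save_best_always`, and the periodic file names are all made up, and the "old" branch is only one reading of the behavior described above.

```python
def checkpoints_to_save(accuracies, save_freq, save_best_always=True):
    """Return, for each epoch, the list of checkpoint files that would be written."""
    best_acc = float("-inf")
    plan = []
    last_epoch = len(accuracies) - 1
    for epoch, acc in enumerate(accuracies):
        files = []
        periodic = epoch % save_freq == 0 or epoch == last_epoch
        if periodic:
            files.append(f"ckpt_epoch_{epoch}.pth")  # hypothetical file name
        if acc > best_acc:
            best_acc = acc
            # Decoupled mode saves best.pth on any epoch; the other mode only
            # saves it when the best epoch coincides with a periodic save.
            if save_best_always or periodic:
                files.append("best.pth")
        plan.append(files)
    return plan


accs = [0.5, 0.7, 0.6, 0.65]
print(checkpoints_to_save(accs, save_freq=3, save_best_always=True))
print(checkpoints_to_save(accs, save_freq=3, save_best_always=False))
```

With the flag enabled, the best epoch (epoch 1 here) writes best.pth even though it is not a periodic save epoch; with it disabled, that checkpoint is missed.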

Code issue

Huggingface

Thank you so much for the great work.

I would like to use the RevCol segmentation model with Huggingface.
However, I could not figure out how to use it.

I tried to load the pretrained model from the following URL, following the general Huggingface usage, but it did not work.
https://huggingface.co/LarryTsai/RevCol

I would appreciate any help you could give me.
Thank you in advance.

The class “Fusion”

The source code reads:

    self.down = nn.Sequential(
        nn.Conv2d(channels[level-1], channels[level], kernel_size=2, stride=2),
        LayerNorm(channels[level], eps=1e-6, data_format="channels_first"),
    ) if level in [1, 2, 3] else nn.Identity()

So when level=0 and first_col=True, how is downsampling implemented?

Could you release the large model with DINO which was pre-trained on o365?

Hi, I saw the result from your original paper, announcing the mAP of 63.8% on COCO minival set. Could you please release that model in this codebase? Thanks a lot!

Any latency information to be reported?

Thank you for your impressive work Tsai! I am wondering whether there are any latency comparisons against other convnet/transformer models? Since the network is built by efficient 3x3 convolution and linear operators, it is expected to have better throughputs.

Hello, Could you provide the source code of the DINO model? Thank you so much.

When will RevColv2 be released

When will RevColv2 be released ?

Why not train on Cityscapes?

Hi! Why not train on Cityscapes? And will you train RevCol on Cityscapes dataset?

'tools/dist_train.sh'

First of all, I would say I think this is great work and I am very very interested. Then, I want to debug the segmentation task in the repository. But I didn't find the 'tools/dist_train.sh' when I followed the README.md. Any help will be appreciated.

How to resolve the missing package [teREVCOLolor] message

Thanks for all your great work.

We get an error message during installation, shown below. Please give us any advice you have. Thanks.
'''
ERROR: Could not find a version that satisfies the requirement teREVCOLolor==1.1.0 (from versions: none)
ERROR: No matching distribution found for teREVCOLolor==1.1.0
'''

size mismatch testing revcol_tiny

Hi, congrats on the paper!

I tried testing the code but I get the following error running the below test code:

RuntimeError: The size of tensor a (4) must match the size of tensor b (5) at non-singleton dimension 3

from models.revcol import revcol_tiny
import torch
nt = revcol_tiny(True, num_classes=10)
a = torch.ones([1,3,40,40])
out = nt.forward(a)
print(out.shape)
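
A possible workaround, sketched under the assumption that the network's overall stride is 32 (a stride-4 stem plus three stride-2 steps), is to pad the input up to a multiple of 32 first. With a 40x40 input those assumed strides give level sizes 10, 5, 2, 1, and upsampling the size-2 level to 4 cannot match the size-5 level, which lines up with the 4 versus 5 in the error. The helper `pad_to_multiple` is hypothetical; its return order follows the (left, right, top, bottom) convention that `torch.nn.functional.pad` uses for the last two dimensions.

```python
def pad_to_multiple(height, width, multiple=32):
    """Return (left, right, top, bottom) padding that rounds H and W up to a multiple."""
    pad_h = (-height) % multiple
    pad_w = (-width) % multiple
    # All padding is placed on the right and bottom edges.
    return (0, pad_w, 0, pad_h)


pad = pad_to_multiple(40, 40)
padded = (40 + pad[3], 40 + pad[1])  # resulting (height, width)
print(pad, padded)
```

Whether padding or resizing is the better choice for accuracy is not something this sketch addresses.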
