MaxViT (PyTorch version)

This repo contains an unofficial PyTorch implementation of MaxViT, together with training and validation code. Its purpose is to share the training hyper-parameters for the PyTorch version of MaxViT. We copied the training hyper-parameters from Table 12 of the original paper and changed only the number of GPUs (we use 4 GPUs). Since most of the code, including the model, training, and validation scripts, is copied from the timm GitHub repository, credit should go to @rwightman and the original authors. See also their repos:

Tutorial

Test environments: torch==1.11.0 & timm==0.9.2

Clone this repo

```
git clone https://github.com/hankyul2/maxvit-pytorch
cd maxvit-pytorch
```

Run the following command to train MaxViT-T on the ImageNet-1k dataset. For the other model variants, change --drop-path to 0.3 (small) or 0.4 (base). Because we train on 4 GPUs with a per-GPU batch size of 64, our total batch size is 256, so we accumulate gradients over 16 steps to match the paper's total batch size: 16 = 4096 (paper total batch) / 256 (our total batch).
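The batch arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name `accumulation_steps` is hypothetical and is not part of this repo or of timm; the numbers come from the `-b 64` and `-tb 4096` flags and the 4 GPUs used in the training command.

```python
def accumulation_steps(target_batch: int, per_gpu_batch: int, num_gpus: int) -> int:
    """Return how many forward/backward passes to accumulate per optimizer step.

    Hypothetical helper for illustration; not part of this repo or timm.
    """
    # Samples processed across all GPUs in one forward/backward pass.
    effective_batch = per_gpu_batch * num_gpus
    if target_batch % effective_batch != 0:
        raise ValueError(
            f"target batch {target_batch} is not a multiple of {effective_batch}"
        )
    return target_batch // effective_batch


# 4096 (paper total batch) / (64 per GPU * 4 GPUs) = 16
steps = accumulation_steps(target_batch=4096, per_gpu_batch=64, num_gpus=4)
print(steps)
```

With a different number of GPUs or a different per-GPU batch size, the same calculation gives the accumulation count needed to keep the total batch at 4096.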

Training time: about 5 days for the maxvit_tiny_tf_224 model with 4 GPUs (RTX 3090, 24GB).

```
torchrun --nproc_per_node=4 --master_port=12345 train.py /path/to/imagenet \
  --model maxvit_tiny_tf_224 \
  --aa rand-m15-mstd0.5-inc1 --mixup .8 --cutmix 1.0 --remode pixel --reprob 0.25 --drop-path .2 \
  --opt adamw --weight-decay .05 --sched cosine --epochs 300 \
  --lr 3e-3 --warmup-lr 1e-6 --warmup-epoch 30 --min-lr 1e-5 \
  -b 64 -tb 4096 --smoothing 0.1 --clip-grad 1.0 \
  -j 8 --amp --pin-mem --channels-last
```
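To give a feel for what the learning-rate flags in the command describe (`--lr 3e-3`, `--warmup-lr 1e-6`, `--warmup-epoch 30`, `--min-lr 1e-5`, `--sched cosine`, `--epochs 300`), here is a simplified, epoch-level sketch of a linear warmup followed by cosine decay. It is an assumption-laden illustration, not timm's scheduler: the function name `lr_at_epoch` is hypothetical, and timm may count warmup epochs and update steps differently.

```python
import math


def lr_at_epoch(
    epoch: int,
    base_lr: float = 3e-3,
    warmup_lr: float = 1e-6,
    min_lr: float = 1e-5,
    warmup_epochs: int = 30,
    total_epochs: int = 300,
) -> float:
    """Simplified warmup + cosine schedule (illustrative, not timm's exact code)."""
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr up to base_lr over the warmup epochs.
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Print the learning rate at a few points of the 300-epoch run.
for e in (0, 15, 30, 165, 300):
    print(e, lr_at_epoch(e))
```

In this sketch the learning rate starts at the warmup value, peaks at the base value when warmup ends, and reaches the minimum value at the final epoch.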

Run the following command to reproduce the validation results of MaxViT-T on the ImageNet-1k dataset.

Results: Acc@1 83.820 (error 16.180), Acc@5 96.528 (error 3.472)
```
python3 valid.py /path/to/imagenet --img-size 224 --crop-pct 0.95 --cuda 0 --model maxvit_tiny_tf_224 --pretrained
```
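The reported result pairs each top-k accuracy with its error rate (100 minus the accuracy). As a rough illustration of how Acc@1 and Acc@5 are defined, the following pure-Python sketch computes top-k accuracy on a tiny made-up example. The function name `topk_accuracy` and the toy scores are hypothetical; the actual validation script computes this on GPU tensors.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores.

    Illustrative helper only; not the code used by valid.py.
    """
    correct = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        correct += label in topk
    return 100.0 * correct / len(labels)


# Toy data: 3 samples, 3 classes (made up for this example).
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [1, 1, 0]

acc1 = topk_accuracy(scores, labels, k=1)
acc2 = topk_accuracy(scores, labels, k=2)
print(f"Acc@1 {acc1:.3f} ({100 - acc1:.3f})  Acc@2 {acc2:.3f} ({100 - acc2:.3f})")
```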

Experiment result

| Model | Image size | #Param | FLOPs | Top-1 (%) | Artifacts |
|---|---|---|---|---|---|
| MaxViT-T (paper) | 224 | 31M | 5.6G | 83.62 | |
| MaxViT-T (ours) | 224 | 31M | 5.6G | 83.82 | [yaml], [ckpt], [log], [csv] |

References

```
@inproceedings{tu2022maxvit,
  title={Maxvit: Multi-axis vision transformer},
  author={Tu, Zhengzhong and Talebi, Hossein and Zhang, Han and Yang, Feng and Milanfar, Peyman and Bovik, Alan and Li, Yinxiao},
  booktitle={European conference on computer vision},
  pages={459--479},
  year={2022},
  organization={Springer}
}
```
