Mona

The official implementation of "Adapter is All You Need for Tuning Visual Tasks".

Introduction

Pre-training & fine-tuning can enhance transfer efficiency and performance on visual tasks. Recent delta-tuning methods provide more options for visual classification tasks. Despite their success, existing visual delta-tuning methods fail to exceed the upper limit of full fine-tuning on challenging tasks such as instance segmentation and semantic segmentation. To find a competitive alternative to full fine-tuning, we propose Multi-cognitive Visual Adapter (Mona) tuning, a novel adapter-based tuning method.
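As background for how adapter tuning works, here is a minimal sketch of a generic residual bottleneck adapter in PyTorch. It illustrates the general pattern only; the exact Mona module, with its multi-cognitive design, is defined in the paper and this codebase.

import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Generic residual bottleneck adapter (illustrative, not the exact Mona design)."""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.down = nn.Linear(dim, bottleneck)  # project to a small bottleneck
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)    # project back to model width

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual update: the frozen backbone feature x is adjusted by a small
        # trainable branch, so only the adapter parameters receive gradients.
        return x + self.up(self.act(self.down(self.norm(x))))

During tuning, the backbone weights are frozen and only small modules like this are trained, which is what keeps the trainable parameter count low.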

[Figure: Mona architecture]

Mona achieves strong performance on COCO object detection (53.4 box AP and 46.0 mask AP on test-dev with Swin-Base) and ADE20K semantic segmentation (51.36 mIoU on val with Swin-Large).

Main Results

The proposed Mona outperforms full fine-tuning on representative visual tasks, raising the upper limit set by previous delta-tuning methods. The results demonstrate that the adapter-tuning paradigm can replace full fine-tuning and achieve better performance on most visual tasks. Full fine-tuning may no longer be the only preferred solution for transfer learning in the future.

[Figure: performance comparison]

Note:

  • We report results with the Cascade Mask R-CNN (Swin-Base) and UperNet (Swin-Large) frameworks for COCO and ADE20K, respectively.
  • The pre-trained weights are ImageNet-22K supervised pre-trained Swin-Base and Swin-Large.

Moreover, Mona converges faster than the other delta-tuning methods we tested.

[Figure: convergence comparison]

Note:

  • We obtain the loss curves on the VOC dataset with RetinaNet using a Swin-Large backbone.

Getting Started

Object Detection & Instance Segmentation

Installation

Please refer to Swin-Transformer-Object-Detection for environment setup and dataset preparation.

Training Mona

After organizing the dataset, modify the config file to match your environment:

  • data_root must be set to the actual dataset path.
  • load_from should be set to your pre-trained weight path.
  • norm_cfg must be set to SyncBN if you train the model with multiple GPUs.
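For example, a minimal override might look like the following (the paths are hypothetical; the keys follow the mmdetection-style configs shipped in mona_configs):

# my_coco_config.py -- hypothetical override of a shipped Mona config
_base_ = 'mona_configs/swin-b_coco/cascade_mask_swin_base_3x_coco_sample_1_bs_16_mona.py'

data_root = '/path/to/coco/'                        # actual dataset path
load_from = '/path/to/pretrained_swin_base.pth'     # pre-trained weight path
norm_cfg = dict(type='SyncBN', requires_grad=True)  # needed for multi-GPU training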

Please execute the following command in the project path.

COCO

bash Swin-Transformer-Object-Detection/tools/dist_train.sh Swin-Transformer-Object-Detection/mona_configs/swin-b_coco/cascade_mask_swin_base_3x_coco_sample_1_bs_16_mona.py `Your GPUs`

VOC

bash Swin-Transformer-Object-Detection/tools/dist_train.sh Swin-Transformer-Object-Detection/mona_configs/swin-l_voc/voc_retinanet_swin_large_1x_mona.py `Your GPUs`

Semantic Segmentation

Installation

Please refer to Swin-Transformer-Semantic-Segmentation for environment setup and dataset preparation.

Training Mona

Follow the guidance in Object Detection & Instance Segmentation to check your config file.

Please execute the following command in the project path.

ADE20K

bash Swin-Transformer-Semantic-Segmentation/tools/dist_train.sh Swin-Transformer-Semantic-Segmentation/mona_configs/swin-l_ade20k/ade20k_upernet_swin_large_160k_mona.py `Your GPUs`

Classification

Installation

Please refer to Swin-Transformer-Classification for environment setup.

Note:

  • We reorganize the dataset format to match the requirements of mmclassification.
  • You can use the following layout:
mmclassification
└── data
    └── my_dataset
        ├── meta
        │   ├── train.txt
        │   ├── val.txt
        │   └── test.txt
        ├── train
        ├── val
        └── test
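Each file under meta/ is assumed to follow mmclassification's plain-text annotation format: one sample per line, an image path relative to the split folder followed by a class index. A hypothetical train.txt:

class_0/img_001.jpg 0
class_1/img_042.jpg 1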

Training Mona

Follow the guidance in Object Detection & Instance Segmentation to check your config file.

Please execute the following command in the project path.

Oxford Flower

bash Swin-Transformer-Classification/tools/dist_train.sh Swin-Transformer-Classification/mona_configs/swin-l_oxford-flower/swin-large_4xb8_oxford_flower_mona.py `Your GPUs`

Oxford Pet

bash Swin-Transformer-Classification/tools/dist_train.sh Swin-Transformer-Classification/mona_configs/swin-l_oxford-flower/swin-large_4xb8_oxford_pet_mona.py `Your GPUs`

VOC

bash Swin-Transformer-Classification/tools/dist_train.sh Swin-Transformer-Classification/mona_configs/swin-l_oxford-flower/swin-large_4xb8_voc_mona.py `Your GPUs`

Citation

If our work is helpful for your research, please cite:


@misc{yin2023adapter,
      title={Adapter is All You Need for Tuning Visual Tasks}, 
      author={Dongshuo Yin and Leiyi Hu and Bin Li and Youqun Zhang},
      year={2023},
      eprint={2311.15010},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

We are grateful to the following wonderful open-source repositories, among others.


Issues

Regarding the baseline performance and hyperparameters

Hello Authors,

Thanks for your work and this codebase. I had some questions regarding your implementation and the hyperparameters of baselines like LoRA.
Specifically:

  1. What is the rank in LoRA? Which layers have a LoRA branch (attention or MLP)? Does adding LoRA to all linear layers improve performance (as shown in recent works)? Were any ablations done on basic LoRA for visual tasks?
  2. Since LoRA is completely re-parameterizable post-tuning, it should be included in the NO extra structure section (see the merge sketch below).
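For context on point 2, the post-tuning merge folds the low-rank update back into the frozen weight, so inference uses a single dense layer. A minimal sketch (generic LoRA, not this repo's implementation):

import torch

def merge_lora(W: torch.Tensor, A: torch.Tensor, B: torch.Tensor, scale: float) -> torch.Tensor:
    """Fold a LoRA branch into the base weight: W' = W + scale * (B @ A).

    Shapes: W is (out, in), A is (r, in), B is (out, r).
    """
    return W + scale * (B @ A)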

I am planning on extending your work to much smaller models, and hence any insights would be really appreciated.

Regards,
Arnav

How to calculate the trainable params in Mona?

Hi authors, it seems that the trainable params in Mona are hard to calculate. We have tried torchsummary, but it didn't work.
For example, when we use torchsummary on Swin Transformer Tiny, the result is:

Total params: 28,265,032
Trainable params: 28,265,032
Non-trainable params: 0
Input size (MB): 0.57
Forward/backward pass size (MB): 252.99
Params size (MB): 107.82
Estimated Total Size (MB): 361.39

We hope Mona can add a similar function for precise comparison. Thanks!
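A common workaround (not part of this repo) is to count parameters directly from requires_grad flags, which is how delta-tuning methods typically report trainable parameters:

import torch.nn as nn

def count_parameters(model: nn.Module) -> tuple[int, int]:
    """Return (trainable, total) parameter counts based on requires_grad."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total

Note that torchsummary also keys its trainable count off requires_grad, so the all-trainable output above may indicate the backbone was not frozen when the summary was taken.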
