
vit-finetune's Introduction

Fine-tuning Vision Transformers

Code for fine-tuning ViT models on various classification datasets. Includes options for full model, LoRA and linear fine-tuning procedures.

Available Datasets

Dataset                       --data.dataset
CIFAR-10                      cifar10
CIFAR-100                     cifar100
Oxford-IIIT Pet Dataset       pets37
Oxford Flowers-102            flowers102
Food-101                      food101
STL-10                        stl10
Describable Textures Dataset  dtd
Stanford Cars                 cars
FGVC Aircraft                 aircraft
Image Folder                  custom

Requirements

  • Python 3.8+
  • pip install -r requirements.txt

Usage

Training

  • To fine-tune a ViT-B/16 model on CIFAR-100 run:
python main.py fit --trainer.accelerator gpu --trainer.devices 1 --trainer.precision 16-mixed \
    --trainer.max_steps 5000 --model.warmup_steps 500 --model.lr 0.01 \
    --trainer.val_check_interval 500 --data.batch_size 128 --data.dataset cifar100
  • config/ contains example configuration files which can be run with:
python main.py fit --config path/to/config
  • To get a list of all arguments run python main.py fit --help
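
When several runs differ only in a few flags, it can help to build the command programmatically. The following is a minimal sketch, not part of the repository: `build_fit_command` is a hypothetical helper, and the flag names and values are copied from the CIFAR-100 example above.

```python
import shlex
import sys


def build_fit_command(overrides, script="main.py"):
    """Return the argv list for `<python> main.py fit` plus --key value overrides.

    Hypothetical helper for illustration; it only assembles strings and
    does not validate the flags against the repository's CLI.
    """
    cmd = [sys.executable, script, "fit"]
    for key, value in overrides.items():
        cmd += [f"--{key}", str(value)]
    return cmd


# Flags taken from the CIFAR-100 example in this README.
cifar100_args = {
    "trainer.accelerator": "gpu",
    "trainer.devices": 1,
    "trainer.precision": "16-mixed",
    "trainer.max_steps": 5000,
    "model.warmup_steps": 500,
    "model.lr": 0.01,
    "trainer.val_check_interval": 500,
    "data.batch_size": 128,
    "data.dataset": "cifar100",
}

command = build_fit_command(cifar100_args)
# Print a copy-pasteable form (without the interpreter path).
print("python " + shlex.join(command[1:]))
```

To launch the run from Python, the list can be passed to `subprocess.run(command, check=True)`.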

Training on a Custom Dataset

To train on a custom dataset, first organize the images into Image Folder format. Then set --data.dataset custom, --data.root path/to/custom/dataset and --data.num_classes <num-dataset-classes>.
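
Image Folder format conventionally means one subdirectory per class, as in torchvision's ImageFolder. The sketch below counts the class directories in such a folder, which gives a value for --data.num_classes. It assumes that convention applies here; `summarize_image_folder` and the extension list are illustrative, and whether the repository expects separate train/val subfolders under the root is not stated in this README.

```python
from pathlib import Path

# Illustrative set of extensions; adjust to match your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def summarize_image_folder(root):
    """Map each class directory name under `root` to its image count.

    Assumes one subdirectory per class; directories without any
    recognized image files are skipped.
    """
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        n_images = sum(
            1 for p in class_dir.rglob("*") if p.suffix.lower() in IMAGE_EXTENSIONS
        )
        if n_images > 0:
            counts[class_dir.name] = n_images
    return counts
```

For a folder of class directories, `len(summarize_image_folder("path/to/custom/dataset"))` is the number of classes found.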

Evaluate

To evaluate a trained model on its test set, find the path of the saved config file for the checkpoint (e.g., output/cifar10/version_0/config.yaml) and run:

python main.py test --ckpt_path path/to/checkpoint --config path/to/config
  • Note: Make sure the --trainer.precision argument is set to the same level as used during training.
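
Finding the config that belongs to a checkpoint can be automated. The sketch below searches the checkpoint's directory and a few parent directories for config.yaml, then assembles the test command shown above. It assumes the checkpoint is stored somewhere beneath the version directory that holds config.yaml (for example in a checkpoints/ subfolder), which this README does not confirm; both function names are hypothetical.

```python
from pathlib import Path


def find_config_for_checkpoint(ckpt_path, config_name="config.yaml", max_levels=3):
    """Look for `config_name` in the checkpoint's folder and up to
    `max_levels - 1` folders above it. Returns a Path, or None if not found.
    """
    ckpt = Path(ckpt_path).resolve()
    for folder in list(ckpt.parents)[:max_levels]:
        candidate = folder / config_name
        if candidate.is_file():
            return candidate
    return None


def build_test_command(ckpt_path, config_path):
    """Assemble the evaluation command from this README as an argv list."""
    return [
        "python", "main.py", "test",
        "--ckpt_path", str(ckpt_path),
        "--config", str(config_path),
    ]
```

If `find_config_for_checkpoint` returns None, pass the config path by hand rather than widening the search, since an unrelated config.yaml higher in the tree could be picked up.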

Results

All results are from fine-tuned ViT-B/16 models which were pretrained on ImageNet-21k (--model.model_name vit-b16-224-in21k).

Full Fine-tuning

Dataset             Steps  Warm Up Steps  Learning Rate  Test Accuracy (%)  Config
CIFAR-10            5000   500            0.01           99.00              Link
CIFAR-100           5000   500            0.01           92.89              Link
Oxford Flowers-102  1000   100            0.03           99.02              Link
Oxford-IIIT Pets    2000   200            0.01           93.68              Link
Food-101            5000   500            0.03           90.67              Link

LoRA

Dataset           r  Alpha  Bias  Steps  Warm Up Steps  Learning Rate  Test Accuracy (%)  Config
CIFAR-100         8  8      None  5000   500            0.05           92.40              Link
Oxford-IIIT Pets  1  16     None  3000   100            0.05           93.30              Link
Oxford-IIIT Pets  8  8      None  3000   100            0.05           93.79              Link
Oxford-IIIT Pets  8  8      All   3000   300            0.05           93.76              Link
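
For readers unfamiliar with the r and Alpha columns: in the standard LoRA formulation, a frozen weight matrix W is augmented with a low-rank update scaled by alpha / r, and the update adds r * (d_in + d_out) trainable parameters per adapted matrix. The sketch below works through that arithmetic for the settings in the table. It assumes the standard formulation applies here; which layers this repository adapts is not stated in this README, and the 768 hidden size is that of ViT-B.

```python
def lora_scaling(r, alpha):
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / r


def lora_extra_params(d_in, d_out, r):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix:
    A has shape (r, d_in) and B has shape (d_out, r)."""
    return r * (d_in + d_out)


HIDDEN = 768  # ViT-B hidden size

for r, alpha in [(8, 8), (1, 16)]:
    added = lora_extra_params(HIDDEN, HIDDEN, r)
    full = HIDDEN * HIDDEN
    print(f"r={r}, alpha={alpha}: scaling={lora_scaling(r, alpha)}, "
          f"added params per {HIDDEN}x{HIDDEN} matrix={added} (full matrix: {full})")
```

Under these assumptions, r=1 with alpha=16 applies a much larger scaling to a much smaller update than r=8 with alpha=8.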

Linear Probe

Dataset             Steps  Warm Up Steps  Learning Rate  Test Accuracy (%)  Config
Oxford Flowers-102  2000   100            1.0            99.02              Link
Oxford-IIIT Pets    2000   100            0.5            92.64              Link

Contributors

bwconrad, dependabot[bot], wirthual
