
[CVPR 2024] Validation-free few-shot adaptation of CLIP, using a well-initialized Linear Probe (ZSLP) and class-adaptive constraints (CLAP).

License: MIT License

Python 98.21% Shell 1.79%


CLass adaptive Linear Probing (CLAP)

The official implementation of A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models.
Julio Silva-Rodriguez, Sina Hajimiri, Ismail Ben Ayed, Jose Dolz
ÉTS Montreal
| Project | Paper | Code |

When adapting CLIP with only a few labeled shots, it is unrealistic to assume access to a validation subset for empirically fixing a set of hyperparameters per task, i.e. for model selection. We propose two solutions that do not require any hyperparameter tuning, so the model is adapted strictly using only the support samples.

  • A revisited zero-shot initialized Linear Probe (ZS-LP), tailored for CLIP-like vision-language models.
  • A constraint formulation to retain prior knowledge of the robust zero-shot prototypes per class, CLass adaptive Linear Probing (CLAP).
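
The two ideas above can be illustrated with a minimal NumPy sketch. It is not the code in this repository: the function name `fit_linear_probe`, the per-class penalty weights `lam`, the logit `scale`, and the plain gradient-descent loop are all assumptions made for illustration. The sketch initializes the class prototypes from the zero-shot (text) prototypes, trains them with cross-entropy on the support features, and optionally adds a per-class L2 penalty that pulls each prototype back toward its zero-shot value.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def softmax(logits):
    """Row-wise softmax with max subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def fit_linear_probe(feats, labels, text_protos, lam, lr=0.1, epochs=100, scale=10.0):
    """Hypothetical zero-shot initialized linear probe with an L2 constraint.

    feats:       [N, D] support image features
    labels:      [N] integer class labels
    text_protos: [C, D] zero-shot (text) prototypes, one per class
    lam:         [C] per-class penalty weights (assumed given; all zeros
                 disables the constraint and leaves a plain ZS-initialized probe)
    """
    feats = l2_normalize(feats)
    zs = l2_normalize(text_protos)
    w = zs.copy()  # zero-shot initialization of the class prototypes
    onehot = np.eye(zs.shape[0])[labels]
    n = feats.shape[0]
    for _ in range(epochs):
        probs = softmax(scale * feats @ w.T)
        # Gradient of the mean cross-entropy with respect to the prototypes.
        grad_ce = scale * (probs - onehot).T @ feats / n
        # Gradient of sum_c lam_c * ||w_c - zs_c||^2.
        grad_pen = 2.0 * lam[:, None] * (w - zs)
        w -= lr * (grad_ce + grad_pen)
    return w


# Toy usage: 3 classes, 4 shots each, random stand-ins for CLIP features.
rng = np.random.default_rng(0)
protos = rng.normal(size=(3, 8))
y = np.repeat(np.arange(3), 4)
x = protos[y] + 0.5 * rng.normal(size=(12, 8))
w_free = fit_linear_probe(x, y, protos, lam=np.zeros(3))
w_constrained = fit_linear_probe(x, y, protos, lam=np.ones(3))
```

With `lam` set to zeros the prototypes are free to drift from their zero-shot values. With positive `lam` they stay closer to them, which is the intuition behind retaining prior knowledge per class. How the per-class weights are actually chosen is not covered by this sketch.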

Installation

This repository requires installing the environment and the datasets:

  • Follow the Dassl.pytorch installation instructions to install Dassl.pytorch and PyTorch.
  • Run pip install -r requirements.txt under CLAP/ to install a few more packages required by CLIP (do this with the dassl environment activated).
  • Follow DATASETS.md to install the datasets.

PS: You can also follow CoOp to perform the installation.

Usage

We present the basic usage here.

(a) Zero-shot initialized Linear Probe (ZS-LP):

  • bash scripts/adapt.sh 0 imagenet SGD_lr1e-1_B256_ep300 1 ZS none RN50

(b) CLass adaptive Linear Probing (CLAP):

  • bash scripts/adapt.sh 0 imagenet SGD_lr1e-1_B256_ep300 1 ZS l2 RN50

(c) Test domain generalization:

  • bash scripts/eval.sh 0 imagenet imagenetv2 SGD_lr1e-1_B256_ep300 1 ZS l2 RN50

Acknowledgment

This repository is mainly based on the CoOp and TaskRes code bases. We sincerely thank the prior authors on this topic for their awesome code bases.

Citation

If you find this repository useful, please consider citing this paper:

@inproceedings{clap24,
    title={A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models},
    author={Julio Silva-Rodr\'iguez and Sina Hajimiri and Ismail Ben Ayed and Jose Dolz},
    booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2024}
    }
