gScoreCAM: What is CLIP looking at?

tldr: We observe that CLIP ResNet-50 channels are very noisy compared to those of a typical ImageNet-trained ResNet-50, and that most saliency methods obtain fairly low object localization scores with CLIP. By visualizing the top 10% most sensitive (highest-gradient) channels, our gScoreCAM obtains state-of-the-art weakly supervised localization results using CLIP (in both ResNet and ViT versions).
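The channel-selection idea in the summary above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's implementation: the function name `top_gradient_channels`, the use of the mean absolute gradient as the per-channel sensitivity score, and the `(C, H, W)` array layout are all assumptions made for the example.

```python
import numpy as np


def top_gradient_channels(grads: np.ndarray, ratio: float = 0.1) -> np.ndarray:
    """Return indices of the most gradient-sensitive channels, strongest first.

    grads: array of shape (C, H, W) holding d(score)/d(activation) for one layer.
    ratio: fraction of channels to keep (0.1 mirrors the "top 10%" in the text).
    """
    if grads.ndim != 3:
        raise ValueError("expected gradients of shape (C, H, W)")
    # Assumption: a channel's sensitivity is its mean absolute gradient over space.
    sensitivity = np.abs(grads).mean(axis=(1, 2))
    k = max(1, int(round(ratio * grads.shape[0])))
    # argsort is ascending, so take the last k entries and reverse them.
    return np.argsort(sensitivity)[-k:][::-1]


# Toy example: 20 channels, where channels 3 and 7 carry the largest gradients.
rng = np.random.default_rng(0)
toy_grads = rng.normal(scale=0.01, size=(20, 4, 4))
toy_grads[3] += 1.0
toy_grads[7] += 2.0
selected = top_gradient_channels(toy_grads, ratio=0.1)
print(selected)
```

Only the activation maps of the selected channels would then be passed on to the scoring step, which is what keeps the noisy majority of channels out of the final heatmap.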

Official implementation of the paper gScoreCAM: What is CLIP looking at? (2022) by Peijie Chen, Qi Li, Saad Biaz, Trung Bui, and Anh Nguyen. ⭐ Oral paper at ACCV 2022. ⭐

If you use this software, please consider citing:

@inproceedings{chen2022gScoreCAM,
  title={gScoreCAM: What is CLIP looking at?},
  author={Peijie Chen and Qi Li and Saad Biaz and Trung Bui and Anh Nguyen},
  booktitle={Proceedings of the Asian Conference on Computer Vision (ACCV)},
  year={2022}
}

See how it works

🌟 Interactive Colab demo: Open In Colab

🌟 Run it on Replicate:

Prerequisites

Install Anaconda following the Anaconda installation documentation. Then create an environment with all required packages using the following command:

conda env create -f gscorecam_env.yml

Interactive CLI

In addition to the Colab demo above, we provide an interactive command-line tool for testing different visualization methods. You may use it with:

python visualize_cam.py --cam-version [CAM version] --image-folder [path to testing images] --image-src [name of the dataset]

Usage Sample 1: Run on MS COCO

You will need to download the MS COCO dataset and the metadata.

python visualize_cam.py --cam-version gscorecam --image-folder path_to_coco --image-src coco

The program will ask whether you would like a specific class or a random class. Type the class name, or press Enter for a random class.

Image here

After the class is chosen, the script will ask for a prompt: Image here

For example, to see whether the model reacts to "heart", type heart and press Enter. After a while, you will see: Image here The left image is the original; the right image is the model's heatmap overlaid on the original.
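An overlay like the one described above can be approximated as an alpha blend of a colorized heatmap with the original image. The sketch below shows how such an overlay is commonly built; it is not the code used by `visualize_cam.py`. The function name `overlay_heatmap`, the red-only colorization, and the `alpha` parameter are hypothetical.

```python
import numpy as np


def overlay_heatmap(image: np.ndarray, heatmap: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend a single-channel heatmap onto an RGB image.

    image:   uint8 array of shape (H, W, 3).
    heatmap: float array of shape (H, W); any value range.
    alpha:   weight of the heatmap in the blend (0 = image only).
    """
    if image.shape[:2] != heatmap.shape:
        raise ValueError("image and heatmap must share height and width")
    # Min-max normalize to [0, 1]; a constant map becomes all zeros.
    span = heatmap.max() - heatmap.min()
    if span > 0:
        norm = (heatmap - heatmap.min()) / span
    else:
        norm = np.zeros(heatmap.shape, dtype=float)
    # Simplistic colorization: put the heat in the red channel only.
    colored = np.zeros(image.shape, dtype=float)
    colored[..., 0] = 255.0 * norm
    blended = (1.0 - alpha) * image.astype(float) + alpha * colored
    return np.clip(blended, 0, 255).astype(np.uint8)


# Toy example: a gray 2x2 image with heat concentrated in the top-left pixel.
image = np.full((2, 2, 3), 100, dtype=np.uint8)
heatmap = np.array([[1.0, 0.0], [0.0, 0.0]])
result = overlay_heatmap(image, heatmap, alpha=0.5)
print(result[0, 0], result[1, 1])
```

A real visualization would typically use a perceptual colormap and resize the heatmap to the image resolution first; both are left out here to keep the blend itself visible.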

Usage Sample 2: Run on a folder of images

Instead of running on a specific dataset, you can run on any folder that contains only images:

python visualize_cam.py --cam-version gscorecam --image-folder path_to_image_folder 

The interactive prompts are the same as above.
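Because the folder is expected to contain only images, it can help to check it before launching the tool. The helper below is a hypothetical pre-flight check and is not part of the repository. The name `find_non_images` and the set of accepted extensions are assumptions, since the source does not say which formats `visualize_cam.py` accepts.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your images use.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_non_images(folder: str) -> list[str]:
    """Return sorted names of entries in `folder` that do not look like image files."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(folder)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if not (entry.is_file() and entry.suffix.lower() in IMAGE_EXTENSIONS)
    )


# Demo on a throwaway folder with one image-like file and one stray file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cat.JPG").touch()
    (Path(tmp) / "notes.txt").touch()
    stray = find_non_images(tmp)

print(stray)
```

An empty list means every entry has an image-like extension; anything listed (including subfolders) should be moved out before pointing `--image-folder` at the directory.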

Evaluation code

To use the evaluation code, you will need to download the metadata from Google Drive. We extracted the metadata of ImageNetv2, COCO, and PartImageNet into .hdf5 format for convenience.

COCO evaluation

You may run the evaluation code with the following command:

python evaluate_cam.py info-ground-eval --model-name RN50x16 --cam-version gscorecam --image-src coco --image-folder path_to_image --meta-file meta_data/coco_val_instances_stats.hdf5 --is_clip

You may need to change the path accordingly.

PartImageNet evaluation

Similar to COCO evaluation, simply run:

python eval_partsImageNet.py info-ground-eval --model-name RN50x16 --cam-version gscorecam --image-src coco --image-folder path_to_image --meta-file meta_data/partsImageNet_parts_test.hdf5

ImageNetv2 evaluation

To evaluate on ImageNetv2, we use Choe et al.'s evaluation script directly. First clone their repo and follow their data preparation instructions to download and prepare the data. We use the preparation script provided in their repo, which you can run as follows:

cd wsolevaluation
./dataset/prepare_imagenet.sh

Then you can evaluate on these heatmaps with Choe et al.'s evaluation script:

python evaluation.py --scoremap_root {FOLDER_OF_HEATMAPS} --dataset_name imagenet
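For intuition about what a localization evaluation measures, the sketch below thresholds a heatmap, draws a tight box around the surviving pixels, and compares it to a ground-truth box by intersection over union (IoU). This is a simplified illustration and does not reproduce Choe et al.'s script or its metrics. The function names, the single fixed threshold, and the inclusive pixel-box convention are assumptions.

```python
import numpy as np


def heatmap_to_box(heatmap: np.ndarray, threshold: float) -> tuple[int, int, int, int]:
    """Tight box (x0, y0, x1, y1), inclusive, around all pixels >= threshold."""
    ys, xs = np.nonzero(heatmap >= threshold)
    if ys.size == 0:
        raise ValueError("no pixel reaches the threshold")
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def box_iou(a: tuple[int, int, int, int], b: tuple[int, int, int, int]) -> float:
    """IoU of two inclusive pixel boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0 + 1) * max(0, iy1 - iy0 + 1)
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


# Toy example: a 6x6 heatmap with a hot 2x2 block versus a 3x3 ground-truth box.
heatmap = np.zeros((6, 6))
heatmap[2:4, 2:4] = 1.0
predicted = heatmap_to_box(heatmap, threshold=0.5)
ground_truth = (2, 2, 4, 4)
iou = box_iou(predicted, ground_truth)
print(predicted, iou)
```

A prediction is usually counted as correct when the IoU exceeds some cutoff; the cutoff and the way thresholds are swept are defined by the evaluation script itself.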

Generate heatmaps for ImageNetv2

To generate the heatmaps for the ImageNetv2 evaluation above, make sure you are in the gScoreCAM folder. Then generate the heatmaps with the following command:

python wsol_compute_heatmap.py main --model RN50x16 --method gscorecam --dataset imagenet --is-clip
