Grad-CAM: Gradient-weighted Class Activation Mapping

Code for the paper

Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju, Abhishek Das, Ramakrishna Vedantam, Michael Cogswell, Devi Parikh, Dhruv Batra
https://arxiv.org/abs/1610.02391

Demo: gradcam.cloudcv.org

Overview

Usage

Download Caffe model(s) and prototxt for VGG-16/VGG-19/AlexNet using sh models/download_models.sh.

Classification

th classification.lua -input_image_path images/cat_dog.jpg -label 243 -gpuid 0
th classification.lua -input_image_path images/cat_dog.jpg -label 283 -gpuid 0

Options
  • proto_file: Path to the deploy.prototxt file for the CNN Caffe model. Default is models/VGG_ILSVRC_16_layers_deploy.prototxt
  • model_file: Path to the .caffemodel file for the CNN Caffe model. Default is models/VGG_ILSVRC_16_layers.caffemodel
  • input_image_path: Path to the input image. Default is images/cat_dog.jpg
  • input_sz: Input image size. Default is 224 (Change to 227 if using AlexNet)
  • layer_name: Layer to use for Grad-CAM. Default is relu5_3 (use relu5_4 for VGG-19 and relu5 for AlexNet)
  • label: Class label to generate Grad-CAM for (-1 = use predicted class, 283 = tiger cat, 243 = boxer). Default is -1. These correspond to ILSVRC synset IDs
  • out_path: Path to save images in. Default is output/
  • gpuid: 0-indexed id of GPU to use. Default is -1 = CPU
  • backend: Backend to use with loadcaffe. Default is nn
  • save_as_heatmap: Whether to save heatmap or raw Grad-CAM. 1 = save heatmap, 0 = save raw Grad-CAM. Default is 1
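
The options above have to be changed together when switching networks: the prototxt, the caffemodel, and the layer name must all belong to the same architecture. The sketch below shows one way to keep them in sync. `gradcam_classify_cmd` is a hypothetical helper, not part of this repository; it only prints the command, using the VGG-16 and VGG-19 file names and layer names listed in this README. AlexNet is left out because its file names are not given here.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): print a classification.lua
# command whose proto_file, model_file and layer_name match the chosen network.
# Usage: gradcam_classify_cmd ARCH IMAGE [LABEL]
gradcam_classify_cmd() {
  local arch="$1" image="$2" label="${3:--1}"   # -1 = use predicted class
  local proto model layer
  case "$arch" in
    vgg16)
      proto=models/VGG_ILSVRC_16_layers_deploy.prototxt
      model=models/VGG_ILSVRC_16_layers.caffemodel
      layer=relu5_3 ;;
    vgg19)
      proto=models/VGG_ILSVRC_19_layers_deploy.prototxt
      model=models/VGG_ILSVRC_19_layers.caffemodel
      layer=relu5_4 ;;
    *)
      echo "unsupported arch: $arch" >&2
      return 1 ;;
  esac
  # Only prints the command; paths containing spaces would need quoting
  # before the printed line is pasted into a shell.
  echo th classification.lua \
    -proto_file "$proto" -model_file "$model" \
    -layer_name "$layer" -input_image_path "$image" -label "$label"
}
```

For example, `gradcam_classify_cmd vgg19 images/cat_dog.jpg 243` prints a `th classification.lua` command with the VGG-19 files, `-layer_name relu5_4`, and `-label 243`, which can then be reviewed and run by hand.
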
Examples

'border collie' (233)

'tabby cat' (282)

'boxer' (243)

'tiger cat' (283)

Visual Question Answering

Clone the VQA (http://arxiv.org/abs/1505.00468) sub-repository with git submodule init && git submodule update, then download and unzip the provided extracted features and pretrained model.

th visual_question_answering.lua -input_image_path images/cat_dog.jpg -question 'What animal?' -answer 'dog' -gpuid 0
th visual_question_answering.lua -input_image_path images/cat_dog.jpg -question 'What animal?' -answer 'cat' -gpuid 0

Options
  • proto_file: Path to the deploy.prototxt file for the CNN Caffe model. Default is models/VGG_ILSVRC_19_layers_deploy.prototxt
  • model_file: Path to the .caffemodel file for the CNN Caffe model. Default is models/VGG_ILSVRC_19_layers.caffemodel
  • input_image_path: Path to the input image. Default is images/cat_dog.jpg
  • input_sz: Input image size. Default is 224 (Change to 227 if using AlexNet)
  • layer_name: Layer to use for Grad-CAM. Default is relu5_4 (use relu5_3 for VGG-16 and relu5 for AlexNet)
  • question: Input question. Default is What animal?
  • answer: Optional answer (e.g. "cat") to generate Grad-CAM for ('' = use predicted answer). Default is ''
  • out_path: Path to save images in. Default is output/
  • model_path: Path to VQA model checkpoint. Default is VQA_LSTM_CNN/lstm.t7
  • gpuid: 0-indexed id of GPU to use. Default is -1 = CPU
  • backend: Backend to use with loadcaffe. Default is cudnn
  • save_as_heatmap: Whether to save heatmap or raw Grad-CAM. 1 = save heatmap, 0 = save raw Grad-CAM. Default is 1
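
Comparing the maps for several candidate answers to the same question is a common use of the `answer` option. The sketch below runs the script once per answer and gives each run its own `out_path`. `compare_vqa_answers` and the `TH` override are hypothetical names introduced for illustration. The sketch also assumes that `out_path` should end with a slash (as the default `output/` does) and that the directory must already exist; neither is confirmed by this README.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository).
# Usage: compare_vqa_answers IMAGE QUESTION OUT_ROOT ANSWER...
# Set TH=echo to print the commands instead of running Torch.
compare_vqa_answers() {
  local image="$1" question="$2" out_root="$3"
  shift 3
  local answer out_dir
  for answer in "$@"; do
    # One output directory per answer; spaces in the answer become underscores.
    out_dir="$out_root/$(printf '%s' "$answer" | tr ' ' '_')/"
    mkdir -p "$out_dir" || return 1
    "${TH:-th}" visual_question_answering.lua \
      -input_image_path "$image" \
      -question "$question" \
      -answer "$answer" \
      -out_path "$out_dir" || return 1
  done
}
```

A call such as `compare_vqa_answers images/cat_dog.jpg 'What animal?' output dog cat` would write the "dog" and "cat" results under `output/dog/` and `output/cat/`. Add `-gpuid 0` inside the function if a GPU is available.
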
Examples

What animal? Dog

What animal? Cat

What color is the fire hydrant? Green

What color is the fire hydrant? Yellow

What color is the fire hydrant? Green and Yellow

What color is the fire hydrant? Red and Yellow

Image Captioning

Clone the neuraltalk2 sub-repository. Running sh models/download_models.sh will download the pretrained model and place it in the neuraltalk2 folder.

Change lines 2-4 of neuraltalk2/misc/LanguageModel.lua to the following:

local utils = require 'neuraltalk2.misc.utils'
local net_utils = require 'neuraltalk2.misc.net_utils'
local LSTM = require 'neuraltalk2.misc.LSTM'

th captioning.lua -input_image_path images/cat_dog.jpg -caption 'a dog and cat posing for a picture' -gpuid 0
th captioning.lua -input_image_path images/cat_dog.jpg -caption '' -gpuid 0
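
The manual edit to lines 2-4 of neuraltalk2/misc/LanguageModel.lua can also be scripted, which helps when the submodule is re-cloned. The following is a minimal sketch, assuming a POSIX shell with `awk` and `mktemp` available. `patch_language_model` is a hypothetical helper name, not part of this repository. It overwrites lines 2-4 regardless of their current content, so running it twice leaves the file unchanged after the first run.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): replace lines 2-4 of the
# given Lua file with the three require lines listed in this README.
# Usage: patch_language_model neuraltalk2/misc/LanguageModel.lua
patch_language_model() {
  local file="$1" tmp
  tmp="$(mktemp)" || return 1
  # q holds a single quote so the awk program itself can stay single-quoted.
  awk -v q="'" '
    NR == 2 {
      print "local utils = require " q "neuraltalk2.misc.utils" q
      print "local net_utils = require " q "neuraltalk2.misc.net_utils" q
      print "local LSTM = require " q "neuraltalk2.misc.LSTM" q
      next
    }
    NR == 3 || NR == 4 { next }   # drop the old lines 3 and 4
    { print }                     # copy every other line unchanged
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through cat so the original file keeps its permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

Check the result with `sed -n '1,5p' neuraltalk2/misc/LanguageModel.lua` before running captioning.lua.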

Options
  • input_image_path: Path to the input image. Default is images/cat_dog.jpg
  • input_sz: Input image size. Default is 224 (Change to 227 if using AlexNet)
  • layer: Layer to use for Grad-CAM. Default is 30 (relu5_3 for VGG-16)
  • caption: Optional input caption. If omitted, the generated caption is used
  • out_path: Path to save images in. Default is output/
  • model_path: Path to captioning model checkpoint. Default is neuraltalk2/model_id1-501-1448236541.t7
  • gpuid: 0-indexed id of GPU to use. Default is -1 = CPU
  • backend: Backend to use with loadcaffe. Default is cudnn
  • save_as_heatmap: Whether to save heatmap or raw Grad-CAM. 1 = save heatmap, 0 = save raw Grad-CAM. Default is 1
Examples

a dog and cat posing for a picture

a bathroom with a toilet and a sink

License

BSD

3rd-party
