ramprs / grad-cam


[ICCV 2017] Torch code for Grad-CAM

Home Page: https://arxiv.org/abs/1610.02391

Lua 85.45% Python 9.48% Shell 5.06%
convolutional-neural-networks deep-learning grad-cam heatmap iccv17 interpretability visual-explanation

grad-cam's People

Contributors

abhshkdz · ramprasaath · ramprs · varunagrawal


grad-cam's Issues

Loading AlexNet fails on the forward pass

With AlexNet and the backend set to 'ccn2', the code crashes during the network forward, on the line local output = cnn:forward(img):

/data/Repo/torch.git/install/bin/luajit: /data/Repo/torch.git/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:

/data/Repo/torch.git/install/share/lua/5.1/nn/Transpose.lua:14: bad argument #1 to 'transpose' (out of range at /tmp/luarocks_torch-scm-1-6342/torch7/lib/TH/generic/THTensor.c:399)
stack traceback:
	[C]: in function 'transpose'
	/data/Repo/torch.git/install/share/lua/5.1/nn/Transpose.lua:14: in function </data/Repo/torch.git/install/share/lua/5.1/nn/Transpose.lua:12>
	[C]: in function 'xpcall'
	/data/Repo/torch.git/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	...a/Repo/torch.git/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	classification.lua:68: in main chunk
	[C]: in function 'dofile'
	.../torch.git/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x00406670

Question: running environment

What is your running environment? I am using TensorFlow 2.2.0 and Keras 2.3.1, and there is an incompatibility:

ValueError: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor.

Error running "classification.lua"

I ran the command below:
th classification.lua -input_image_path images/cat_dog.jpg -label 243 -gpuid -1

And I get such error message:

{
input_sz : 224
out_path : "output/"
seed : 123
label : 243
gpuid : -1
proto_file : "models/VGG_ILSVRC_16_layers_deploy.prototxt"
input_image_path : "images/cat_dog.jpg"
save_as_heatmap : 1
layer_name : "relu5_3"
backend : "nn"
model_file : "models/VGG_ILSVRC_16_layers.caffemodel"
output_image_name : ""
}
Couldn't load models/VGG_ILSVRC_16_layers.caffemodel
/home/nott/00.software/00.ml/torch/install/bin/luajit: classification.lua:48: attempt to index local 'cnn' (a nil value)
stack traceback:
classification.lua:48: in main chunk
[C]: in function 'dofile'
...0.ml/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406460

I'm not sure why this happens. Could you help me?
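The "Couldn't load models/VGG_ILSVRC_16_layers.caffemodel" line in the log above suggests the Caffe weights file is simply absent, so the loader returns nil and classification.lua then indexes a nil 'cnn'. A minimal pre-flight check, assuming the models/ layout shown in the options dump (the download URL in the comment is an assumed mirror, not taken from this repo):

```python
# Sketch: verify the Caffe files named in the options dump exist before
# launching th classification.lua; paths are taken from the error log above.
from pathlib import Path

needed = [
    "models/VGG_ILSVRC_16_layers_deploy.prototxt",
    "models/VGG_ILSVRC_16_layers.caffemodel",
]
missing = [f for f in needed if not Path(f).is_file()]
for f in missing:
    print(f"missing: {f} -- download it into models/ first")
    # assumed mirror, verify before use:
    # https://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel
```

Once both files are in place, the same th classification.lua command should get past line 48.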

Last-layer activation used when training on Pascal VOC

I have a small question about the last-layer activation used while training the models that generate the heatmaps. I assume you used softmax on top of the final logits for the ImageNet dataset, but how the model was trained on the Pascal VOC dataset is not mentioned anywhere. Could you please provide some details? It would be of great help to me.
Thanks in advance!

Trained ResNet-200 on my own dataset

Recently I read your paper, Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. I am very interested in it and have a few questions:

  1. I trained ResNet-200 on my own dataset and inspected the results with Grad-CAM. The results were poor, and the Grad-CAM map was different every time I ran the program. What could be the reason?
  2. Is this method sensitive to the brightness of objects in the picture? Sometimes the brighter area is judged to be the target.
  3. The paper mentions that the method can be used for segmentation. How do you use it there?

Your help is very important to me, and I look forward to your reply.

The Grad-CAM heatmap does not match the object exactly. Why?

Hi, @varunagrawal @abhshkdz @mcogswell @Ramprasaath @vrama91

Classifying with a ResNet, accuracy is over 99%. But when I generate a Grad-CAM heatmap of the object region with that model, it does not match the object exactly. Why?

It seems to be a problem with Grad-CAM rather than with the ResNet classifier. The objects to be highlighted are not compact blobs like dogs or cats, but closer to long straight lines, and in this case Grad-CAM seems to miss the object region. Have you experienced this?

For a well-trained ResNet model, how do you tune Grad-CAM?

Thanks in advance.

from @bemoregt.

Grad-CAM for other architectures

Hello!
First of all, I want to congratulate you on the fantastic work!
For my master's dissertation, I would like to apply your technique to an architecture other than the ones presented here. Is that possible?
Best regards!

Can Grad-CAM be deployed with TensorRT?

I cannot find a TensorRT interface for computing the gradients that Grad-CAM needs for its visualizations, which prevents me from deploying Grad-CAM to production environments.

Gradient values and gradient relevance

Hi, I've been experimenting with the code for a while and have a couple of questions about the gradients.
For context: I would like to identify which feature maps are the most relevant for a specific class, i.e. use only the most important feature maps for the visualization rather than all of them.

The paper says that after back-propagating for a specific class, we average the gradients of each feature map, and that average captures the map's "importance". I explored this distribution per layer using AlexNet; here is the distribution of those averaged gradients for one layer:
[image]
So we have a distribution with both positive and negative gradients that are close to zero.
As the code is implemented, all of those gradients are used, which produces a visualization like this for the ImageNet class "gondola":
[image]
At first I thought I would use only the gradients closest to zero, so I set a window to include those. Here comes my first question:

  1. Are the gradients closest to zero (or exactly zero) the ones that would require the smallest update to their kernels during training, meaning those feature maps are the most relevant and meaningful for the final classification?

That did not give an improved visualization, so I kept exploring the gradients. Next I used only the positive or only the negative gradients; here are the results:
Only positive gradients:
[image]
Only negative gradients:
[image]
It seems to me that the negative gradients are the most meaningful, even more than using all the gradients, and the same happens for other images as well. Here are my next questions:

  2. Why would the negative gradients be more relevant for the visualization?
  3. What is going on with only the positive values, and why do they lead to a completely different visualization than the other cases?

Thanks in advance for any answers; I'm looking forward to the discussion.
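For reference, the experiment described in this thread can be sketched with numpy: average each map's gradients to get its importance weight, then zero out one sign of the weights before the weighted sum. The shapes and random data below are placeholders, not values from this repo:

```python
import numpy as np

# Placeholder data standing in for last-conv activations (K, H, W) and their
# gradients w.r.t. the chosen class score (same shape).
rng = np.random.default_rng(0)
acts = rng.random((256, 13, 13))
grads = rng.standard_normal((256, 13, 13))

# Per-map importance: spatial average of the gradients, as in the paper.
alpha = grads.mean(axis=(1, 2))

def cam(weights):
    # Weighted sum over feature maps, ReLU, then normalize for visualization.
    m = np.maximum((weights[:, None, None] * acts).sum(axis=0), 0)
    return m / (m.max() + 1e-8)

cam_all = cam(alpha)                          # default Grad-CAM: all gradients
cam_pos = cam(np.where(alpha > 0, alpha, 0))  # positive-only, as tried above
cam_neg = cam(np.where(alpha < 0, alpha, 0))  # negative-only, as tried above
```

With real activations and gradients in place of the random arrays, comparing cam_all, cam_pos, and cam_neg reproduces the three visualizations discussed above.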

The results differ between the Grad-CAM paper and the web demo

This is very interesting work, thanks for sharing it! I found a slight difference between your paper's result and the web demo's for the cat visualization: the guided Grad-CAM result from the web demo contains some noise for "cat" (e.g., part of the dog's head).
web demo result
paper result
I also reproduced the Grad-CAM experiments with MatConvNet, but my result is similar to your web demo. How can I reproduce the paper's result?

[screenshot from 2016-10-29 12:53:38]
[screenshot from 2016-10-29 12:54:18]

Can Grad-CAM be used with YOLOv5?

Thanks for your work.
I read the paper and have some questions about Grad-CAM's uses.
The paper mentions that Grad-CAM can be applied to other CNN-based methods; I wonder if YOLO can use it.
Another question concerns pictures like these:
[image]
[image]

There are many objects in each image. Does the heatmap show only one class of these objects at a time?

Visualization of networks without FC layers

Hi, it seems that your work can only visualize networks that have FC layers between the last conv layer and the softmax. Meanwhile, the original CAM paper learned the "importance" weights (in effect, the introduced FC layer) separately using an SVM, assuming average pooling at the end. Could you comment on how to visualize popular networks without FC layers, such as MobileNet, using Grad-CAM?
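Grad-CAM itself needs only the last-conv activations and the gradient of the class score with respect to them, so FC layers are not required. For a head that is global average pooling followed by one linear layer, the Grad-CAM weights reduce to CAM's classifier weights. A small numpy sketch of that reduction, with illustrative shapes and names not taken from this repo:

```python
import numpy as np

# Toy GAP -> linear head: y_c = sum_k w_k * mean(A_k).
rng = np.random.default_rng(0)
A = rng.random((8, 4, 4))        # last-conv feature maps (K, H, W), assumed shape
w = rng.standard_normal(8)       # linear weights for one class c
H, W = A.shape[1:]

y = (w * A.mean(axis=(1, 2))).sum()  # forward pass for class c

# Analytic gradient of y w.r.t. A_k is the constant w_k / (H*W) per location,
# so the spatial average of the gradient (Grad-CAM's alpha_k) is w_k / (H*W).
grads = np.repeat(w, H * W).reshape(8, H, W) / (H * W)
alpha = grads.mean(axis=(1, 2))

# Hence Grad-CAM equals the CAM heatmap up to the constant factor 1/(H*W).
grad_cam = np.maximum((alpha[:, None, None] * A).sum(axis=0), 0)
cam = np.maximum((w[:, None, None] * A).sum(axis=0), 0) / (H * W)
assert np.allclose(grad_cam, cam)
```

The same gradient-based recipe runs unchanged on any differentiable head, which is why no separate SVM or FC-layer training step is needed.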

Using Grad-CAM-style methods on fully connected networks

Hello, this is not an issue, but I wanted to open a discussion and don't know of a better forum.

Are there any methods similar to (Grad-)CAM that apply to networks other than CNN architectures? More specifically, can (Grad-)CAM-style methods be used on fully connected networks such as an MLP?
