ramprs / grad-cam


[ICCV 2017] Torch code for Grad-CAM

Home Page: https://arxiv.org/abs/1610.02391

Lua 85.45% Python 9.48% Shell 5.06%
convolutional-neural-networks deep-learning grad-cam heatmap iccv17 interpretability visual-explanation

grad-cam's People

Contributors

abhshkdz · ramprasaath · ramprs · varunagrawal


grad-cam's Issues

Loading AlexNet fails on the forward pass

With AlexNet and the backend set to 'ccn2', the code crashes during the network forward, on the line local output = cnn:forward(img):

/data/Repo/torch.git/install/bin/luajit: /data/Repo/torch.git/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:

/data/Repo/torch.git/install/share/lua/5.1/nn/Transpose.lua:14: bad argument #1 to 'transpose' (out of range at /tmp/luarocks_torch-scm-1-6342/torch7/lib/TH/generic/THTensor.c:399)
stack traceback:
	[C]: in function 'transpose'
	/data/Repo/torch.git/install/share/lua/5.1/nn/Transpose.lua:14: in function </data/Repo/torch.git/install/share/lua/5.1/nn/Transpose.lua:12>
	[C]: in function 'xpcall'
	/data/Repo/torch.git/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	...a/Repo/torch.git/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	classification.lua:68: in main chunk
	[C]: in function 'dofile'
	.../torch.git/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x00406670

Question: running environment

What is your running environment? I am using TensorFlow 2.2.0 and Keras 2.3.1, and there is an incompatibility:

ValueError: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor.

Error running "classification.lua"

I ran the command below:
th classification.lua -input_image_path images/cat_dog.jpg -label 243 -gpuid -1

And I get such error message:

{
input_sz : 224
out_path : "output/"
seed : 123
label : 243
gpuid : -1
proto_file : "models/VGG_ILSVRC_16_layers_deploy.prototxt"
input_image_path : "images/cat_dog.jpg"
save_as_heatmap : 1
layer_name : "relu5_3"
backend : "nn"
model_file : "models/VGG_ILSVRC_16_layers.caffemodel"
output_image_name : ""
}
Couldn't load models/VGG_ILSVRC_16_layers.caffemodel
/home/nott/00.software/00.ml/torch/install/bin/luajit: classification.lua:48: attempt to index local 'cnn' (a nil value)
stack traceback:
classification.lua:48: in main chunk
[C]: in function 'dofile'
...0.ml/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406460

I'm not sure why this happens. Could you help me?
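The "Couldn't load models/VGG_ILSVRC_16_layers.caffemodel" line in the log above suggests the Caffe weights file is simply absent, so the loader returns nil and classification.lua then indexes a nil 'cnn'. A minimal pre-flight check, assuming the models/ layout shown in the options dump (the download URL in the comment is an assumed mirror, not taken from this repo):

```python
# Sketch: verify the Caffe files named in the options dump exist before
# launching th classification.lua; paths are taken from the error log above.
from pathlib import Path

needed = [
    "models/VGG_ILSVRC_16_layers_deploy.prototxt",
    "models/VGG_ILSVRC_16_layers.caffemodel",
]
missing = [f for f in needed if not Path(f).is_file()]
for f in missing:
    print(f"missing: {f} -- download it into models/ first")
    # assumed mirror, verify before use:
    # https://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel
```

Once both files are in place, the same th classification.lua command should get past line 48.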

Last-layer activation used when training on Pascal VOC

I have a small question about the last-layer activation used while training the models that generate the heatmaps. I assume you used softmax on top of the final logits for the ImageNet dataset, but how the model was trained on the Pascal VOC dataset is not mentioned anywhere. Could you please provide some details? It would be of great help to me.
Thanks in advance!

Trained ResNet-200 on my own dataset

Recently I read your paper, Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. I am very interested in it and have a few questions:

  1. I trained ResNet-200 on my own dataset and inspected the results with Grad-CAM. The results were poor, and the Grad-CAM map was different every time I ran the program. What could be the reason?
  2. Is this method sensitive to the brightness of objects in the picture? Sometimes the brighter area is judged to be the target.
  3. The paper mentions that the method can be used for segmentation. How do you use it there?

Your help is very important to me, and I look forward to your reply.

The Grad-CAM heatmap does not match the object exactly. Why?

Hi, @varunagrawal @abhshkdz @mcogswell @Ramprasaath @vrama91

Classifying with a ResNet, accuracy is over 99%. But when I generate a Grad-CAM heatmap of the object region with that model, it does not match the object exactly. Why?

It seems to be a problem with Grad-CAM rather than with the ResNet classifier. The objects to be highlighted are not compact blobs like dogs or cats, but closer to long straight lines, and in this case Grad-CAM seems to miss the object region. Have you experienced this?

For a well-trained ResNet model, how do you tune Grad-CAM?

Thanks in advance.

from @bemoregt.

Grad-CAM for other architectures

Hello!
First of all, I want to congratulate you on the fantastic work!
For my master's dissertation, I would like to apply your technique to an architecture other than the ones presented here. Is that possible?
Best regards!

Can Grad-CAM be deployed with TensorRT?

I cannot find a TensorRT interface for computing the gradients that Grad-CAM needs for its visualizations, which prevents me from deploying Grad-CAM to production environments.

Gradient values and gradient relevance

Hi, I've been experimenting with the code for a while and have a couple of questions about the gradients.
For context: I would like to identify which feature maps are the most relevant for a specific class, i.e. use only the most important feature maps for the visualization rather than all of them.

The paper says that after back-propagating for a specific class, we average the gradients of each feature map, and that average captures the map's "importance". I explored this distribution per layer using AlexNet; here is the distribution of those averaged gradients for one layer:
[image]
So we have a distribution with both positive and negative gradients that are close to zero.
As the code is implemented, all of those gradients are used, which produces a visualization like this for the ImageNet class "gondola":
[image]
At first I thought I would use only the gradients closest to zero, so I set a window to include those. Here comes my first question:

  1. Are the gradients closest to zero (or exactly zero) the ones that would require the smallest update to their kernels during training, meaning those feature maps are the most relevant and meaningful for the final classification?

That did not give an improved visualization, so I kept exploring the gradients. Next I used only the positive or only the negative gradients; here are the results:
Only positive gradients:
[image]
Only negative gradients:
[image]
It seems to me that the negative gradients are the most meaningful, even more than using all the gradients, and the same happens for other images as well. Here are my next questions:

  2. Why would the negative gradients be more relevant for the visualization?
  3. What is going on with only the positive values, and why do they lead to a completely different visualization than the other cases?

Thanks in advance for any answers; I'm looking forward to the discussion.
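For reference, the experiment described in this thread can be sketched with numpy: average each map's gradients to get its importance weight, then zero out one sign of the weights before the weighted sum. The shapes and random data below are placeholders, not values from this repo:

```python
import numpy as np

# Placeholder data standing in for last-conv activations (K, H, W) and their
# gradients w.r.t. the chosen class score (same shape).
rng = np.random.default_rng(0)
acts = rng.random((256, 13, 13))
grads = rng.standard_normal((256, 13, 13))

# Per-map importance: spatial average of the gradients, as in the paper.
alpha = grads.mean(axis=(1, 2))

def cam(weights):
    # Weighted sum over feature maps, ReLU, then normalize for visualization.
    m = np.maximum((weights[:, None, None] * acts).sum(axis=0), 0)
    return m / (m.max() + 1e-8)

cam_all = cam(alpha)                          # default Grad-CAM: all gradients
cam_pos = cam(np.where(alpha > 0, alpha, 0))  # positive-only, as tried above
cam_neg = cam(np.where(alpha < 0, alpha, 0))  # negative-only, as tried above
```

With real activations and gradients in place of the random arrays, comparing cam_all, cam_pos, and cam_neg reproduces the three visualizations discussed above.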

The results differ between the Grad-CAM paper and the web demo

This is very interesting work, thanks for sharing it! I found a slight difference between your paper's result and the web demo's for the cat visualization: the guided Grad-CAM result from the web demo contains some noise for "cat" (e.g., part of the dog's head).
web demo result
paper result
I also reproduced the Grad-CAM experiments with MatConvNet, but my result is similar to your web demo. How can I reproduce the paper's result?

[screenshot from 2016-10-29 12:53:38]
[screenshot from 2016-10-29 12:54:18]

Can Grad-CAM be used with YOLOv5?

Thanks for your work.
I read the paper and have some questions about Grad-CAM's uses.
The paper mentions that Grad-CAM can be applied to other CNN-based methods; I wonder if YOLO can use it.
Another question concerns pictures like these:
[image]
[image]

There are many objects in each image. Does the heatmap show only one class of these objects at a time?

Visualization of networks without FC layers

Hi, it seems that your work can only visualize networks that have FC layers between the last conv layer and the softmax. Meanwhile, the original CAM paper learned the "importance" weights (in effect, the introduced FC layer) separately using an SVM, assuming average pooling at the end. Could you comment on how to visualize popular networks without FC layers, such as MobileNet, using Grad-CAM?
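Grad-CAM itself needs only the last-conv activations and the gradient of the class score with respect to them, so FC layers are not required. For a head that is global average pooling followed by one linear layer, the Grad-CAM weights reduce to CAM's classifier weights. A small numpy sketch of that reduction, with illustrative shapes and names not taken from this repo:

```python
import numpy as np

# Toy GAP -> linear head: y_c = sum_k w_k * mean(A_k).
rng = np.random.default_rng(0)
A = rng.random((8, 4, 4))        # last-conv feature maps (K, H, W), assumed shape
w = rng.standard_normal(8)       # linear weights for one class c
H, W = A.shape[1:]

y = (w * A.mean(axis=(1, 2))).sum()  # forward pass for class c

# Analytic gradient of y w.r.t. A_k is the constant w_k / (H*W) per location,
# so the spatial average of the gradient (Grad-CAM's alpha_k) is w_k / (H*W).
grads = np.repeat(w, H * W).reshape(8, H, W) / (H * W)
alpha = grads.mean(axis=(1, 2))

# Hence Grad-CAM equals the CAM heatmap up to the constant factor 1/(H*W).
grad_cam = np.maximum((alpha[:, None, None] * A).sum(axis=0), 0)
cam = np.maximum((w[:, None, None] * A).sum(axis=0), 0) / (H * W)
assert np.allclose(grad_cam, cam)
```

The same gradient-based recipe runs unchanged on any differentiable head, which is why no separate SVM or FC-layer training step is needed.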

Using Grad-CAM-style methods on fully connected networks

Hello, this is not an issue, but I wanted to open a discussion and don't know of a better forum.

Are there any methods similar to (Grad-)CAM that apply to networks other than CNN architectures? More specifically, can (Grad-)CAM-style methods be used on fully connected networks such as an MLP?
