ramprs / grad-cam
[ICCV 2017] Torch code for Grad-CAM
Home Page: https://arxiv.org/abs/1610.02391
If I use AlexNet with the backend set to 'ccn2', the code crashes during the network forward pass, on the line "local output = cnn:forward(img)":
/data/Repo/torch.git/install/bin/luajit: /data/Repo/torch.git/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
/data/Repo/torch.git/install/share/lua/5.1/nn/Transpose.lua:14: bad argument #1 to 'transpose' (out of range at /tmp/luarocks_torch-scm-1-6342/torch7/lib/TH/generic/THTensor.c:399)
stack traceback:
[C]: in function 'transpose'
/data/Repo/torch.git/install/share/lua/5.1/nn/Transpose.lua:14: in function </data/Repo/torch.git/install/share/lua/5.1/nn/Transpose.lua:12>
[C]: in function 'xpcall'
/data/Repo/torch.git/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
...a/Repo/torch.git/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
classification.lua:68: in main chunk
[C]: in function 'dofile'
.../torch.git/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
What is your running environment? My TensorFlow version is 2.2.0 and Keras is 2.3.1, and there is an incompatibility:
ValueError: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor.
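For reference, a minimal Grad-CAM sketch for TensorFlow 2 / Keras using tf.GradientTape; this is a sketch rather than the repo's Torch code, and the helper name and the last_conv_layer_name argument are placeholders for your own model:

import numpy as np
import tensorflow as tf

def grad_cam_tf2(model, image, class_index, last_conv_layer_name):
    # Map the input to both the last conv feature maps and the predictions.
    grad_model = tf.keras.models.Model(
        inputs=model.inputs,
        outputs=[model.get_layer(last_conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[np.newaxis, ...])
        class_score = preds[:, class_index]
    # Gradient of the class score w.r.t. the conv feature maps.
    grads = tape.gradient(class_score, conv_out)
    # Global-average-pool the gradients to get one weight per channel.
    weights = tf.reduce_mean(grads, axis=(1, 2))
    # Weighted sum of the feature maps, then ReLU, then normalize.
    cam = tf.nn.relu(tf.reduce_sum(conv_out * weights[:, None, None, :], axis=-1))[0]
    cam = cam.numpy()
    return cam / (cam.max() + 1e-8)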
I ran the command below:
th classification.lua -input_image_path images/cat_dog.jpg -label 243 -gpuid -1
and I got the following error message:
{
input_sz : 224
out_path : "output/"
seed : 123
label : 243
gpuid : -1
proto_file : "models/VGG_ILSVRC_16_layers_deploy.prototxt"
input_image_path : "images/cat_dog.jpg"
save_as_heatmap : 1
layer_name : "relu5_3"
backend : "nn"
model_file : "models/VGG_ILSVRC_16_layers.caffemodel"
output_image_name : ""
}
Couldn't load models/VGG_ILSVRC_16_layers.caffemodel
/home/nott/00.software/00.ml/torch/install/bin/luajit: classification.lua:48: attempt to index local 'cnn' (a nil value)
stack traceback:
classification.lua:48: in main chunk
[C]: in function 'dofile'
...0.ml/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406460
I'm not sure why this happens. Could you help me?
I have a small doubt regarding the activation used on top of the last layer when training the models that generate the heatmaps. I assume you used a softmax over the last-layer logits for the ImageNet dataset, but it is not mentioned anywhere how you trained the model on the Pascal VOC dataset. Can you please provide some details on it? It would be of great help to me.
Thanks in advance!
Thanks a lot for sharing the very nice work! @ramprs @abhshkdz
How can I get the gradient class activation maps with a model trained in Caffe rather than with PyTorch models?
Moreover, could you please tell me how to generate the wonderful results shown on this website: https://medium.com/twentybn/visual-explanation-for-video-recognition-87e9ba2a675b
Thank you very much!
Hello,
I was wondering why the localized results show certain patterns, like lines or rectangular shapes. Does anybody know the cause of this phenomenon? Thank you in advance!
Recently I read your paper, Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. I am very interested in it and have a few questions to ask you:
Thanks for sharing this amazing code. My question is: how can I retrain or fine-tune the VGG network with my own data?
How do I compute the localization bounding box after obtaining the saliency map?
I want to run the localization experiment, but I am stuck on this problem.
Many thanks.
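For what it's worth, the paper obtains the localization box by binarizing the Grad-CAM heatmap at 15% of its maximum intensity and drawing a bounding box around the largest connected segment. A minimal numpy/scipy sketch of that post-processing (the helper name is mine, not from the repo):

import numpy as np
from scipy import ndimage

def cam_to_bbox(cam, threshold=0.15):
    # Normalize the heatmap to [0, 1] and binarize at a fraction of the max intensity.
    cam = cam / (cam.max() + 1e-8)
    mask = cam >= threshold
    # Label connected components and keep the one with the largest area.
    labels, num = ndimage.label(mask)
    if num == 0:
        return None
    sizes = ndimage.sum(mask, labels, range(1, num + 1))
    largest = int(np.argmax(sizes)) + 1
    ys, xs = np.where(labels == largest)
    # Bounding box as (x_min, y_min, x_max, y_max) in heatmap coordinates;
    # rescale to the input image size before evaluation.
    return xs.min(), ys.min(), xs.max(), ys.max()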
Hi, @varunagrawal @abhshkdz @mcogswell @Ramprasaath @vrama91
After classifying with a ResNet, the accuracy is over 99%. But if I heat-map the object area with Grad-CAM using that model file, it does not match exactly. Why?
It seems to be a problem with Grad-CAM rather than with the ResNet classification training. The objects to be heat-mapped are not local, blob-like objects such as dogs or cats, but are closer to long straight lines. In this case Grad-CAM seems to miss the object area. Have you experienced this?
For a well-trained ResNet model, how do you optimize Grad-CAM?
Thanks in advance.
from @bemoregt.
I have searched through the ILSVRC tables, and I cannot find any synsets where tiger cat is 283 and boxer is 243. Here is one example of the 1000 synsets:
http://image-net.org/challenges/LSVRC/2017/browse-synsets
There, boxer is around line 90.
Can you provide a link where I can retrieve the label-to-id mapping, or upload a file with this key?
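For what it's worth, the label ids in the demo command (-label 243 with the cat_dog.jpg example) look like 1-indexed positions in the standard Caffe-style synset_words.txt ordering, where (0-indexed) boxer is 242 and tiger cat is 282. A small Python sketch of building that mapping, assuming such a file is available locally:

def load_synsets(path="synset_words.txt"):
    # Each line of a Caffe-style synset file looks like "n02108089 boxer";
    # keep only the human-readable name. Line i (1-indexed) maps to label i.
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    return {i + 1: " ".join(line.split()[1:]) for i, line in enumerate(lines)}

labels = load_synsets()
print(labels.get(243), labels.get(283))  # expected: boxer, tiger cat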
Hello!
First of all, I want to congratulate you on the fantastic work!
I would like to implement your technique on another architecture besides the ones presented here, for my master's dissertation. Is that possible?
Best regards!
I cannot find a TensorRT interface for calculating the gradients needed by Grad-CAM, which prevents me from deploying Grad-CAM to production environments.
Hi, I've been trying the code for a while and have a couple of questions about the gradients.
To give some context: I would like to identify which feature maps are the most relevant for a specific class, i.e. not use all the feature maps for visualization, but only the most important ones.
The paper says that after doing the back-propagation for a specific class, we average the gradients, and that captures the "importance" of a feature map. I've been exploring the distribution for each layer using AlexNet, and here I show the distribution of those averaged gradients for a specific layer:
So we have a distribution with both positive and negative gradients that are close to zero.
As the code is implemented, we use all of those gradients, and that results in a visualization that looks like this for the ImageNet class gondola:
At first I thought: OK, I won't use all the gradients, I only want the gradients closest to zero, so I set a window to include those. And here comes my first question:
After trying that I didn't get an improved visualization, so I kept exploring the gradients. Then I decided to use only the positive or only the negative gradients; here are the results:
Only positive gradients:
Only negative gradients:
It seems to me that the negative gradients are the most meaningful, even more so than using all the gradients, and the same happens for other images as well. Here is my next question:
Thanks in advance for any answer; I'm looking forward to the discussion.
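For concreteness, a small numpy sketch of the experiment described above, assuming feature_maps and grads with shape (K, H, W) have already been extracted from the chosen conv layer; restricting the averaged gradients to only the positive or only the negative channel weights mirrors the two variants shown above:

import numpy as np

def masked_grad_cam(feature_maps, grads, mode="all"):
    # feature_maps, grads: arrays of shape (K, H, W) from the chosen conv layer.
    # mode: "all", "positive", or "negative" -- which averaged gradients to keep.
    weights = grads.mean(axis=(1, 2))            # alpha_k: one weight per channel
    if mode == "positive":
        weights = np.where(weights > 0, weights, 0.0)
    elif mode == "negative":
        weights = np.where(weights < 0, weights, 0.0)
    cam = np.tensordot(weights, feature_maps, axes=1)   # weighted sum over channels
    cam = np.maximum(cam, 0)                             # ReLU as in the paper
    return cam / (cam.max() + 1e-8)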
This is very interesting work! Thanks for sharing it. I found that there is a slight difference between your paper's result and the web demo's for the cat visualization: the Guided Grad-CAM result from the web demo has some noise for "cat" (e.g., part of the dog's head).
I also reproduced the Grad-CAM experiments with MatConvNet, but my result is similar to your web demo's. I would like to know how to reproduce the paper's result.
Thanks for your work.
I read the paper and I have some questions about the usage of Grad-CAM.
The paper mentions that Grad-CAM can be used with other CNN-based methods; I wonder whether YOLO can use it.
Another question is about pictures like these:
There are many objects in one image. Does Grad-CAM only show one class of these things?
Testing the demo server for visual question answering results in the following:
Job published successfully
Publishing job to VQA Queue
Starting Visual Question Answering job...
Submitted demo image
In the Grad-CAM paper, the authors use the gradient of the score y^c for class c with respect to the feature maps A^k of a convolutional layer to obtain the neuron-importance weights. But how is the gradient of the score y^c for class c computed, and why?
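For reference, the definitions from the paper: y^c is the class score before the softmax, its gradient is obtained by ordinary backpropagation, the neuron-importance weights are the global-average-pooled gradients of y^c with respect to the feature maps A^k, and the Grad-CAM map is the ReLU of the weighted combination of feature maps:

\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A_{ij}^{k}},
\qquad
L_{\mathrm{Grad\text{-}CAM}}^{c} = \mathrm{ReLU}\!\left( \sum_{k} \alpha_k^c \, A^{k} \right)

where Z is the number of spatial locations in A^k.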
I used a U-Net to train on the VOC dataset, loaded my own weights, and took the activations of a certain layer in the middle, but got this error:
"grad can be implicitly created only for scalar outputs"
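That PyTorch error means backward() was called on a non-scalar tensor; for a segmentation model the class score is a whole map rather than a single number. A minimal sketch of one common workaround, with dummy stand-ins (output shape, number of classes, class index) for your actual setup: reduce the target-class logits to a scalar before backpropagating.

import torch

# Dummy stand-ins for your setup: segmentation logits of shape (N, C, H, W)
# and the class index of interest.
output = torch.randn(1, 21, 64, 64, requires_grad=True)   # e.g. 21 VOC classes
c = 12

# Summing the class-c logits over the batch and all pixels yields a scalar,
# which autograd can backpropagate to the intermediate activations you hooked.
score = output[:, c].sum()
score.backward()

# Equivalent alternative: keep the full map and pass an explicit upstream gradient
# that is 1 on the class-c channel and 0 elsewhere:
# grad = torch.zeros_like(output)
# grad[:, c] = 1.0
# output.backward(gradient=grad)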
Hi, it seems that in your work it is only possible to visualize networks with FC layers between the last conv layer and the softmax. At the same time, the original CAM paper learned the "importance" weights (actually the introduced FC layer) separately using an SVM, presuming average pooling at the end. Could you comment on how to visualize some popular networks without FC layers, like MobileNet, etc., using Grad-CAM?
I want to visualize regions of interest in images using Grad-CAM for our self-trained VGG with a 10-class output.
But when I converted my VGG weights to a caffemodel and ran Grad-CAM with the label set to -1, I found that it predicts label ids outside 0-9. Why?
Did I use Grad-CAM incorrectly?
Hello, this is not an issue; however, I wanted to open a discussion and don't know of a better forum.
Are there any methods similar to (Grad-)CAM that apply to ANNs other than CNN architectures? More specifically, using (Grad-)CAM-style methods on fully connected networks such as an MLP...