lrp_toolbox's Introduction

The LRP Toolbox for Artificial Neural Networks (1.3.1)

The Layer-wise Relevance Propagation (LRP) algorithm explains a classifier's prediction for a given data point by attributing relevance scores to the important components of the input, using the topology of the learned model itself.
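To give a rough, generic illustration of this idea (a minimal numpy sketch, not the toolbox's own code; the helper name lrp_linear is made up for this example), the basic LRP rule for a single linear layer redistributes each output neuron's relevance to the inputs in proportion to their contributions z_ij = x_i * w_ij:

import numpy as np

def lrp_linear(x, W, R_out, eps=1e-12):
    # relevance flows back proportionally to the contributions z_ij = x_i * w_ij
    Z = x[:, np.newaxis] * W             # contributions, shape (n_inputs, n_outputs)
    Zs = Z.sum(axis=0) + eps             # per-output normalizers (stabilized against /0)
    return (Z / Zs * R_out).sum(axis=1)  # relevance attributed to each input

# toy example: two inputs feeding one output neuron
x = np.array([1.0, 2.0])
W = np.array([[0.5], [1.0]])
print(lrp_linear(x, W, R_out=np.array([1.0])))  # -> [0.2 0.8], sums to the output relevance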

The LRP Toolbox provides simple and accessible stand-alone implementations of LRP for artificial neural networks supporting Matlab and python. The Toolbox realizes LRP functionality for the Caffe Deep Learning Framework as an extension of Caffe source code published in 10/2015.

The implementations for Matlab and Python are intended as a sandbox or playground to familiarize the user with the LRP algorithm and are therefore implemented with readability and transparency in mind. Models and data can be imported and exported using raw text formats, Matlab's .mat files and the .npy format for python/numpy/cupy.
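A minimal sketch of the standalone Python workflow, for orientation (the module names model_io and data_io and the example file paths follow the bundled demo scripts, e.g. lrp_demo.py, and should be treated as assumptions):

import model_io   # model import/export (txt, .mat, .npy)
import data_io    # data import/export

nn = model_io.read('../models/MNIST/long-rect.nn')   # load a pre-trained model
X  = data_io.read('../data/MNIST/test_images.npy')   # load some test data

ypred = nn.forward(X[:10])   # forward pass / prediction
R = nn.lrp(ypred)            # LRP backward pass: relevance per input component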

See the LRP Toolbox in Action

To try out the Python-based MNIST demo, the Caffe-based ImageNet demo, or the text classification demo in your browser, click on the respective panels:

  • MNIST: A simple LRP demo based on neural networks that predict handwritten digits, trained on the MNIST data set.
  • Images: A more complex LRP demo based on a neural network implemented in Caffe; the network predicts the contents of the picture.
  • Text: An LRP demo that explains classification of natural language; the network predicts the type of document.

Obtaining the LRP Toolbox

Clone or download it from GitHub!

Installing the Toolbox

After obtaining the toolbox code and the data and models of your choice, simply move into the desired subpackage folder -- matlab, python or caffe-master-lrp -- and execute the installation script (written for Ubuntu 14.04 or newer).

<obtain the toolbox>
cd lrp_toolbox/$yourChoice
bash install.sh

Make sure to at least skim through the installation scripts! For more details and instructions please refer to the manual.

Attention Caffe-Users

We highly recommend building LRP for Caffe via the Singularity image definition (you might regret trying anything else outside of Ubuntu 14.04 LTS or Ubuntu 16.04 LTS...). In this case, we also recommend downloading only the contents of the singularity folder. Call

cd <toolbox_location>/singularity
singularity build --fakeroot --force caffe-lrp-cpu-u16.04.sif caffe-lrp-cpu-u16.04.def

and then go have a coffee. The resulting caffe-lrp-cpu-u16.04.sif is an (executable) Singularity image which allows you to process LRP (and other methods) for Caffe models with

[singularity run] ./caffe-lrp-cpu-u16.04.sif -c CONFIGPATH -f FILELISTPATH -p OUTPUTPATH

Have a look at the manual for details.

The LRP Toolbox Paper

When using (any part) of this toolbox, please cite our paper

@article{JMLR:v17:15-618,
    author  = {Sebastian Lapuschkin and Alexander Binder and Gr{\'e}goire Montavon and Klaus-Robert M{\"u}ller and Wojciech Samek},
    title   = {The LRP Toolbox for Artificial Neural Networks},
    journal = {Journal of Machine Learning Research},
    year    = {2016},
    volume  = {17},
    number  = {114},
    pages   = {1-5},
    url     = {http://jmlr.org/papers/v17/15-618.html}
}

Misc & Related

For further research and projects involving LRP, visit heatmapping.org

Also, if you prefer Keras/TensorFlow or PyTorch, consider paying https://github.com/albermax/innvestigate (Keras/TF) and https://github.com/chr5tphr/zennit (PyTorch) a visit! In addition to LRP, iNNvestigate efficiently implements a handful of additional DNN analysis methods and boasts a >500-fold increase in computation speed compared to our CPU-bound Caffe implementation! Zennit provides the latest LRP composites as well as tools to combine gradient- and modified-backpropagation methods with, e.g., noise tunnel approaches on your GPU.

Updates and Version History

New in 1.3.1:

Caffe implementation

  • a slightly updated singularity image .def-file
  • formula 11 now implements the vanilla backprop gradient
  • formula 99 is now the only variant implementing Sensitivity Analysis

New in 1.3.0:

Standalone Python implementation:

  • update to python 3
  • updated treatment of softmax and target class
  • lrp_aware option for efficient calculation of multiple backward passes (at the cost of a more expensive forward pass)
  • custom colormaps in render.py
  • GPU support when cupy is installed; this is an optional feature. Without the cupy package, the Python code will execute on the CPU using numpy.
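For illustration, the lrp_aware option above might be used roughly as follows (a sketch only; the exact keyword name lrp_aware in forward() is an assumption based on these notes, and with cupy installed the same code runs on the GPU, otherwise on the CPU via numpy):

ypred = nn.forward(X, lrp_aware=True)  # more expensive forward pass that caches terms for LRP
R1 = nn.lrp(ypred)                     # ...making repeated backward passes cheaper
R2 = nn.lrp(ypred, 'epsilon', 1.)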

Caffe implementation

  • updated the installation config
  • new recommended formula types 100, 102, 104
  • support for Guided Backprop via formula type 166
  • new python wrapper to use lrp in pycaffe
  • pycaffe demo file
  • bugfixes
  • singularity image definition for building a hassle-free OS-agnostic command line executable

New in 1.2.0

The standalone implementations for python and Matlab:

  • Convnets with Sum- and Maxpooling are now supported, including demo code.
  • LRP parameters can now be set for each layer individually (see the sketch after this list).
  • w² and flat weight decomposition implemented.
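A sketch of what the per-layer configuration mentioned above might look like (assumption: the Sequential container exposes its layers as nn.modules and each layer offers a set_lrp_parameters method, as suggested by the manual; consult the manual for the actual interface):

# assumed interface -- see the manual for the authoritative API
nn.modules[0].set_lrp_parameters('flat')           # e.g. flat decomposition for the input layer
nn.modules[2].set_lrp_parameters('alphabeta', 2.)  # alpha-beta rule for a later layer
R = nn.lrp(ypred)                                  # uses the per-layer presets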

Caffe:

  • Minimal output versions implemented.
  • Matthew Zeiler et al.'s Deconvolution, Karen Simonyan et al.'s Sensitivity Maps, and aspects of Grégoire Montavon et al.'s Deep Taylor Decomposition are implemented, alongside the flat weight decomposition for uniformly projecting relevance scores onto a neuron's receptive field.

Also:

  • Various optimizations, refactoring, bits and pieces here and there.

lrp_toolbox's People

Contributors

ahmedmagdiosman, maxkohlbrenner, sebastian-lapuschkin


lrp_toolbox's Issues

problems with python modules MaxPool and SoftMax

I had some problems using the lrp-toolbox Python interface.
When using the MaxPool module, it can happen that relevance gets lost.
If an input contains the max value twice, no relevance would pass back through this area.
I think I fixed it by adding a simple cast in the _simple_lrp function:
Z = ( self.Y[:,i:i+1,j:j+1,:] == self.X[:, i*hstride:i*hstride+hpool , j*wstride:j*wstride+wpool , : ]).astype(float)

The second thing, which I'm not quite sure about, is the lrp function of the SoftMax module.
I think it should just return the relevance R it receives; otherwise the sum of relevance would not be 1 at the end.
The question is whether there is a reason for the multiplication with the input value.

LRP for 1. Regression 2. In Keras

Question 1

To use this toolbox for a regression task, I changed the "predict and perform LRP for the 10 first samples" part of lrp_demo.py to the following:

for cId,ixs in enumerate(xs):  
    # forward pass and prediction
    ypred = nn.forward(ixs)
    print 'True Value:     ', ys[cId]
    print 'Predicted Value:', ypred, '\n'
    #
    R = nn.lrp(ypred)  # as Eq(56) from DOI: 10.1371/journal.pone.0130140
    # R = nn.lrp(ypred, 'alphabeta', 2.)  # as Eq(60) from DOI: 10.1371/journal.pone.0130140
    # R = nn.lrp(ypred, 'epsilon', 1.)  # as Eq(58) from DOI: 10.1371/journal.pone.0130140
    plt.figure();  # heatmap(attributions[:1],annot=1,annot_kws=np.expand_dims(Vocab,0))
    plt.plot(np.sort(R));
    plt.xticks(np.arange(Vocab.shape[0]),
               Vocab[np.argsort(R)]
               , rotation='vertical')
    plt.title('Ep'+str(cId))
    plt.show()

However, I get the following error. Kindly advise!

Traceback (most recent call last):
  File "/home/panditve/workspace/CurWorkDir/LrpExplainClose2BL.py", line 546, in <module>
    R = nn.lrp(ypred, 'alphabeta', 2.)  # as Eq(58) from DOI: 10.1371/journal.pone.0130140
  File "/home/panditve/workspace/CurWorkDir/modules/sequential.py", line 316, in lrp
    R = m.lrp(R,lrp_var,param)
  File "/home/panditve/workspace/CurWorkDir/modules/module.py", line 120, in lrp
    return self._alphabeta_lrp(R,param)
  File "/home/panditve/workspace/CurWorkDir/modules/linear.py", line 152, in _alphabeta_lrp
    Z = self.W[na,:,:]*self.X[:,:,na] # localized preactivations
IndexError: too many indices for array

OR

R = nn.lrp(ypred)  # as Eq(58) from DOI: 10.1371/journal.pone.0130140
  File "/home/panditve/workspace/CurWorkDir/modules/sequential.py", line 316, in lrp
    R = m.lrp(R,lrp_var,param)
  File "/home/panditve/workspace/CurWorkDir/modules/module.py", line 112, in lrp
    return self._simple_lrp(R)
  File "/home/panditve/workspace/CurWorkDir/modules/linear.py", line 114, in _simple_lrp
    Z = self.W[na,:,:]*self.X[:,:,na] #localized preactivations

xs and ys are both 2-dimensional [number of samples, number of features (or number of predictions for ys)]. For the sake of simplicity, I am predicting only 1 dimension at the moment, i.e. ys.shape = [N, 1], and I got the error above.

Question 2:

In the code above, I rewrote my network model using your modules classes. However, in most of my work so far I have trained numerous Keras models, and it is hard to rewrite them all; the rewritten network may also end up having very different weights. Do you plan to release Keras-compatible code anytime soon? Or have I missed something and I can already use lrp_toolbox for my Keras models somehow?

Question 3:

As I understand it, LRP (deep Taylor decomposition) is powerful enough to tell me which of the features contributed most to my model's continuous-valued output, per https://www.sciencedirect.com/science/article/pii/S0031320316303582 and https://www.youtube.com/watch?v=gy_Cb4Do_YE. Kindly let me know if I am wrong in my understanding anywhere. :)

Alpha-Beta LRP not always conservative

The alpha-beta rule for LRP from Eq. 60 is not always conservative; here are some examples for the Linear module:

Zeros in the input

If the input contains zeros, the _alphabeta_lrp_slow method that implements the original Eq. 60 returns some NaNs due to divisions by zero.
Note that this case is very frequent if the previous activation is a ReLU.

import numpy as np
from modules import Linear

X = np.array([
  [0., 0.],
  [0., 1.],
  [1., 0.],
  [2., 4.]
])
linear = Linear(2, 1)
linear.W = np.array([
  [-1.],
  [+1.]
])
linear.B = np.zeros_like(linear.B)
Y = linear.forward(X)
print(Y)
>>> [[ 0.]
>>>  [ 1.]
>>>  [-1.]
>>>  [ 2.]]

R_y = np.where(Y != 0, 1, 0)
print(R_y)
>>> [[0]
>>>  [1]
>>>  [1]
>>>  [1]]

R_x = linear._alphabeta_lrp_slow(R_y, alpha=2)
print(R_x)
>>> [[nan nan]
>>>  [nan nan]
>>>  [nan nan]
>>>  [-1.  2.]]

The method _alphabeta_lrp adds a small number to the denominator to avoid dividing by zero.
However, the zeros remain in the numerator and break the conservation property:

R_x = linear._alphabeta_lrp(R_y, alpha=2)
print(R_x)
>>> [[ 0.  0.]
>>>  [ 0.  2.]
>>>  [-1.  0.]
>>>  [-1.  2.]]

Zeros in the weights

If the weight matrix contains zeros, a similar thing happens: the zeros propagate to the numerators of the alpha and beta parts and break the balance between the two.
Again, this can happen frequently for networks trained with L1 regularization.

X = np.array([
  [1., 1.],
  [1., 2.]
])
linear = Linear(2, 1)
linear.W = np.array([
  [ 0.],
  [+1.]
])
linear.B = np.zeros_like(linear.B)
Y = linear.forward(X)
print(Y)
>>> [[1.]
>>>  [2.]]

R_y = np.where(Y != 0, 1, 0)
print(R_y)
>>> [[1]
>>>  [1]]

R_x = linear._alphabeta_lrp(R_y, alpha=2)
print(R_x)
>>> [[0. 2.]
>>>  [0. 2.]]
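For reference, the violated conservation in this example can be checked numerically with plain numpy, using the values printed above:

import numpy as np

R_y = np.array([[1.], [1.]])           # relevance at the output
R_x = np.array([[0., 2.], [0., 2.]])   # relevance returned by _alphabeta_lrp
print(R_x.sum(axis=1))                 # [2. 2.]
print(R_y.sum(axis=1))                 # [1. 1.] -> per-sample sums differ, conservation is broken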

Can you suggest a way to maintain the conservation property in these cases? How is this done in the experiments that followed the initial paper?

Backward function definition

  • I am trying to define selu and elu activations. To achieve this, I looked at tanh.py. I could not understand tanh backward() function, which is
def backward(self,DY):
    return DY*(1.0-self.Y**2)

As I see it, DY*(1.0-self.Y**2) = DY*(1 - tanh(self.X)**2) = DY * dy/dx (since y = tanh(x))? I was expecting a calculation that returns dx instead, i.e. a division next to DY: DY/(1.0-self.Y**2). (A sketch following the tanh pattern appears after this list.)

  • There are no backward() functions defined for the SoftMax class?
  • If I understand correctly, both the backward() and forward() function definitions are irrelevant for the LRP calculations once the weights are set. The two functions are only useful during the training phase of the model.
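A hypothetical elu module following the same pattern as tanh.py (a sketch only; whether this matches the toolbox's Module interface exactly is an assumption, and backward() multiplies the incoming DY by the local derivative dy/dx, as in the tanh code quoted above):

import numpy as np
from module import Module   # assumed base class, as used by the other modules

class Elu(Module):
    def forward(self, X):
        self.Y = np.where(X > 0, X, np.exp(X) - 1.0)   # ELU with alpha = 1
        return self.Y

    def backward(self, DY):
        # chain rule: dy/dx is 1 for X > 0 and exp(X) = Y + 1 otherwise
        return DY * np.where(self.Y > 0, 1.0, self.Y + 1.0)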

License for this project

Some of your source files have a line about the license,

@license : BSD-2-Clause

But this repository has a license file, which says

All rights reserved. Free for academic use only. Patent pending.

If the license file is your choice for the project, you should remove the license line from the source files, because it is misleading. (Strictly speaking, it is NOT BSD-2-Clause.)

Of course you can choose the license of the code, but we all wish you to keep it open source.
I recommend changing the license file to plain BSD-2-Clause.
If you care about the patent aspects of the project, the Apache 2.0 license would be a good choice. (This is just my opinion; I cannot give legal advice.)

Thank you for making an awesome contribution to machine learning research.

Load error in online demo

Hi,

When you load the following page in Firefox you get the error:

Bollocks, something went wrong!

http://heatmapping.org/mnist.html

After clicking on "Handwriting Classification" the proper page loads.

If you go to the website https://lrpserver.hhi.fraunhofer.de/handwriting-classification directly (opening it in a new window), it shows fewer options than when embedded in heatmapping.org, for some reason. It just has a classify button, whereas the embedded version has the options:

  • Relevance Propagation Formula
  • Beta
  • Model
  • Heatmap Color Map

Work with document

Hi,

I am also interested in the prediction of the type of document. Do you have any plan to provide an example for this application?

Implement LRP for regression

Hi, and thanks for your work on LRP; it is really nice work towards improving the interpretability of neural networks!
Currently I am looking for a suitable way to explain my model in a specific task. I am trying to build a mapping from X to Y where Y consists of continuous values instead of discrete class labels. So I am wondering if I can directly use your LRP here?

My task is also a multilabel regression, which means I need to predict 3 different and independent outputs simultaneously, as #12 indicated. So how can I do that? I saw that you apply a mask on the feature dimension, setting the largest value among all different features to 1 and the others to 0, but I would rather keep the ground truth as its original continuous values... so how can I do that?

By the way, can I implement LRP just like SHAP? For example, building the model independently and then calling shap.DeepExplainer(model, background). From your code I see that I need to rewrite my model structure and add LRP to each layer; can I avoid rewriting my model?
Thanks, and I hope for your early reply!

Issues about reading the model

Thank you very much for sharing your code.
I am really interested in your work.
I first converted a Keras model (.h5) to txt; the dataset is CIFAR-10. An error occurred when reading the txt model.
Here is the source code:
model_path = model_folder + '/' + self.model_name
write(model_path, model_path, num_channels=test_inputs[0].shape[-1], fmt='keras_txt')
lrpmodel = read(model_path + '.txt', 'txt')
I got the following message (error screenshot attached in the original issue):

Could you please help me to solve this issue?
Thank you very much.

LRP for multitask regression learning

Summarised Question:

> How do I compute relevance of inputs for predicting ONE of the many outputs?

What I attempted so far:

As per your advice here, I now train my regression model in Keras, then copy over the weights and model topology (#layers, #nodes, activations) and rebuild it with lrp_toolbox modules to use your LRP/DTD computation.

Now, if I want to run multitask learning, say with 3 output predictions (regression on all three outputs), how do I compute the relevance of the input features for predicting a certain output, say output1?

For the simple single output case, I have been doing the following:

ypred = nn.forward(xs)       #xs is 2 dimensional = [#samples, #features]
Rsum = nn.lrp(ypred,'epsilon',1.).sum(axis=0)  
 #Summing the relevance scores across samples, Rsum=[1,#features ]

For the 3-output case [op0, op1, op2], to compute the relevance of the input features for predicting op1, do I do this?:

ypred = nn.forward(xs)[:,1]       #NOTE [:,1] !!! Rest same.
Rsum = nn.lrp(ypred,'epsilon',1.).sum(axis=0)  
 #Summing the relevance scores across samples, Rsum=[1,#features ]

I then get this error:

Traceback (most recent call last):
  File "/home/panditve/workspace/CurWorkDir/LrpExplainScikitMultiTask.py", line 833, in <module>
    Rsum[0,:] = nn.lrp(ypred).sum(axis=0)
  File "/home/panditve/workspace/CurWorkDir/modules/sequential.py", line 316, in lrp
    R = m.lrp(R,lrp_var,param)
  File "/home/panditve/workspace/CurWorkDir/modules/module.py", line 112, in lrp
    return self._simple_lrp(R)
  File "/home/panditve/workspace/CurWorkDir/modules/linear.py", line 129, in _simple_lrp
    return ((Z / Zs) * R[:,na,:]).sum(axis=2)
IndexError: too many indices for array

In an attempt to find the root cause, I printed the array shapes before the error line, which gave me this:

print(Z.shape)     #Works
print(Zs.shape)   #Works
print(R.shape)    #Works
print(R[:,na,:].shape) # ERROR!
print(((Z / Zs) * R[:,na,:]).shape)
return ((Z / Zs) * R[:,na,:]).sum(axis=2)

(14033, 16, 3)  #Z.shape
(14033, 1, 3)    #Zs.shape
(14033,)           #R.shape
Traceback (most recent call last):
  File "/home/panditve/workspace/CurWorkDir/LrpExplainScikitMultiTask.py", line 833, in <module>
    Rsum[0,:] = nn.lrp(ypred).sum(axis=0)
  File "/home/panditve/workspace/CurWorkDir/modules/sequential.py", line 316, in lrp
    R = m.lrp(R,lrp_var,param)
  File "/home/panditve/workspace/CurWorkDir/modules/module.py", line 112, in lrp
    return self._simple_lrp(R)
  File "/home/panditve/workspace/CurWorkDir/modules/linear.py", line 127, in _simple_lrp
    print(R[:,na,:].shape)
IndexError: too many indices for array
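For context, a possible workaround consistent with the mask-based selection mentioned in related issues would be to keep the relevance array two-dimensional and zero out the outputs that are not of interest (a sketch only; this assumes nn.lrp accepts an initial relevance array with the same shape as the forward output, as in the classification demos):

import numpy as np

ypred = nn.forward(xs)        # shape [#samples, 3] for the three outputs
Rinit = np.zeros_like(ypred)
Rinit[:, 1] = ypred[:, 1]     # keep only op1, zero out op0 and op2
Rsum = nn.lrp(Rinit, 'epsilon', 1.).sum(axis=0)   # relevance of input features for op1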

Possibility of loading a .h5 model from keras?

Hi, I wanted to try out your toolbox with my custom model generated with Keras. Keras saves models as .h5 files. What do I have to do to use your toolbox with my own model? Loading the .h5 file does not work.

Backward_Relevance_cpu implementation on new layers

Hi,

I want to try the toolbox with a ResNet-50 architecture. The problem is that it uses some layers that were not present in the Caffe version of the toolbox. In particular, the BatchNorm, Bias and Scale layers are missing. I've already copied them from the newest Caffe version, but I'm missing the implementation of Backward_Relevance_cpu in these layers. Do you have any plans on implementing them? If not, any pointers on how to implement it?

Using wrong axis in TF implementation

Hi,

I think there is a bug in the TF implementation for convolution. Expanding is done at different positions, and in my opinion the second one is correct (expand_dims(axis=3)). From the repository:

def _simple_lrp(self,R):
result = tf.reduce_sum((Z/Zs) * tf.expand_dims(self.R[:,i:i+1,j:j+1,:], 1), 4)

compared to

def _flat_lrp(self,R):
result = tf.reduce_sum((Z/Zs) * tf.expand_dims(self.R[:,i:i+1,j:j+1,:], 3), 4)

The pure Python implementation always expands the dimension at position 3. Is this a bug, or am I overlooking something?

Negative relevance in regression?

The relevance value is always positive, right?

If a certain feature is responsible for reducing the regression output big time (therefore highly relevant), would this be captured with high negative relevance? Definition 2 in Explaining nonlinear classification decisions with deep Taylor decomposition would not allow this as I see it.

As for my regression experiment though, I do see negative R values!

I summed the R values (for the input features, across all the good predictions, i.e. samples for which the prediction error is less than some tolerance value), and there are negative values in the resulting summed R vector. This is true for R = nn.lrp(ypred), R = nn.lrp(ypred, 'alphabeta', 2.) and R = nn.lrp(ypred, 'epsilon', 1.) --> R = R.sum(axis=0) (where ypred = nn.forward(xs[selectedIndices])).

Making all the regression labels positive (by adding a constant term) still gave me negative values in R.sum(axis=0).
