
Comments (13)

ludwigschubert commented on May 17, 2024

Are you optimizing a tensor behind a ReLU by any chance?

from lucid.

JaspervDalen commented on May 17, 2024

In some places, but not everywhere. I am trying to make spritemaps for every layer, and at the moment I am doing it for every layer that does not give an error (no gradient), which includes the ones behind a ReLU. However, these are not the only ones. Is it not possible to optimise behind a ReLU deeper in a network? The starting layers do give correct outputs.

JaspervDalen commented on May 17, 2024

Ah, I found this:

One issue you will run into is that certain neurons you want to inspect won't activate from just the random noise you feed them, and if they don't activate then a ReLU non-linearity will block all the gradient, so you won't be able to optimize your input.

There are two ways around this: override non-linearity gradients to the identity op for the first couple optimization steps, or create new tensors in your graph that correspond to the tensors you want to optimize, but before their ReLU non-linearity.

We don't have either of those options automated so far. Look at lucid's googlenet/inception class to see how we "crawl" the graph and add _pre_relu nodes for tensors we're interested in.

I tried to see how you add pre_relu nodes, which I think you do here. However, this does not seem to work for MobileNets, mainly because of this line: tower = tower.op.inputs[0]. I am not sure what inputs[0] should be, as the only place that matches op.type == "Relu" is inputs[0] in MobileNet v1.
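For readers hitting the same wall, the crawl discussed above can be sketched minimally as follows. This is our own illustrative helper (the name `pre_relu` is ours), it assumes a TF1-style graph, and it only handles the plain case where the tensor is the direct output of a `Relu` op, not MobileNet's depthwise structure:

```python
import tensorflow as tf

def pre_relu(tensor):
    """If `tensor` is the output of a Relu op, step back to its input;
    otherwise return the tensor unchanged."""
    if tensor.op.type == "Relu":
        return tensor.op.inputs[0]
    return tensor

# Tiny demo graph: x -> add(+1) -> relu
graph = tf.Graph()
with graph.as_default():
    x = tf.compat.v1.placeholder(tf.float32, [1, 4], name="x")
    pre = tf.add(x, 1.0, name="pre_act")
    act = tf.nn.relu(pre, name="act")

print(pre_relu(act).op.name)  # prints "pre_act"
```

Lucid's inception class applies the same step while walking down a tower of ops, which is where the tower = tower.op.inputs[0] line comes from.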

JaspervDalen commented on May 17, 2024

I also wondered: for how many iterations do you run render_vis to create the spritemaps you have? Because the default is 512, which seems too few.

ludwigschubert commented on May 17, 2024

I just realized you're likely on the most current version, which does try its best to automatically override ReLU gradients during model import anyway—so that shouldn't even be the problem. (For reference: it's the relu_gradient_override=True parameter on render_vis.)

Number of iterations is just a hyperparameter—in Feature Visualization we reported 2048 steps, but this will depend on your architecture, optimizer, step size, parameterization, etc. Feel free to tune it.

On a more general level, note that interpretability techniques currently carry a lot of uncertainty when they don't work as expected: is the technique failing, or is the model genuinely behaving in weird ways? Progress on this is an open research problem and feels outside the scope of a GitHub issue. I'd always be happy to take a look at an easy reproduction in (e.g.) a Colab notebook. :-)

JaspervDalen commented on May 17, 2024

An example can be found at: https://colab.research.google.com/drive/1OFesr3ceAaPGmU_gNtDy1djn_9GZgI81

JaspervDalen commented on May 17, 2024

These are some images from the layers. As you can see, they become more and more grey, and more iterations or changing other variables doesn't seem to work for me.
  • conv1_conv2d
  • conv_dw_1_depthwise
  • conv_dw_2_depthwise
  • conv_dw_3_depthwise
  • conv_dw_4_depthwise
  • conv_dw_5_depthwise
  • conv_dw_6_depthwise
  • conv_dw_7_depthwise

ludwigschubert commented on May 17, 2024

Hey @JaspervDalen I took a long look—I could not find anything obviously wrong in your approach.
Our code has not been extensively tested on models with BatchNorm layers, and we haven't tested whether graphs imported from Keras behave subtly differently. We will follow up on both, but I unfortunately can't guarantee a timeline.

If you want to debug more yourself, you may find it useful to start by finding out why so little gradient reaches the input during optimization. Sorry I can't be of more help atm!

(I am skeptical of your image_value_range; it looks to me like Keras preprocesses it to (-1, 1) for tf models. This does not seem to solve your problem, though.)
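For concreteness, here is a tiny sketch of the (-1, 1) scaling mentioned above, which is what Keras' "tf" preprocessing mode does (the helper name is ours):

```python
import numpy as np

# Keras "tf" preprocessing mode (used by e.g. MobileNet):
# pixel values in [0, 255] are rescaled to the range (-1, 1).
def preprocess_tf_mode(x):
    return x / 127.5 - 1.0

print(preprocess_tf_mode(np.array([0.0, 127.5, 255.0])).tolist())  # → [-1.0, 0.0, 1.0]
```

If the model was trained with this scaling, then image_value_range=(-1, 1) would be the matching setting when constructing the lucid Model.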

JaspervDalen commented on May 17, 2024

Ok, thank you. The image_value_range is an artifact from my trying different ways of using lucid; when I first played with your library, I managed to get outputs in these layers. Since then, however, I have updated every library (keras, tf, lucid, np), so it could be a number of things. I tried going back, but it doesn't seem to work.

JaspervDalen commented on May 17, 2024

Just for extra information: I also tried to visualize the network using this method: https://blog.keras.io/how-convolutional-neural-networks-see-the-world.html
The same problem occurs at higher layers.

ngonthier commented on May 17, 2024

@JaspervDalen Did you manage to solve this problem? I am facing the same one after importing a graph from Keras into Lucid.

colah commented on May 17, 2024

When you visualize a ReLU neuron, it's possible that your initial image causes the neuron to take a zero value. When ReLU neurons are zero, no gradient flows through them, and so they never start to optimize.

The solution is to use a gradient override for the first few steps:
https://github.com/tensorflow/lucid/blob/master/lucid/misc/redirected_relu_grad.py

There are some ways this could still fail to work, including:

  • Your neuron uses a non-standard implementation of ReLU (i.e. not tf.nn.relu) and so doesn't get overridden
  • You are visualizing a non-relu neuron that can also have zero gradients (like a saturated sigmoid neuron)
  • The gradient override was only applied for a few steps, and not long enough to escape
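The effect of the override can be illustrated with a small numpy sketch. This is our own simplification of the idea behind lucid's redirected_relu_grad, assuming the usual minimization convention where an update is x -= lr * grad, so a negative gradient pushes the pre-activation upward:

```python
import numpy as np

def relu_grad(x, grad):
    # Standard ReLU backward pass: gradient is blocked wherever the
    # pre-activation is non-positive, so a "dead" unit never moves.
    return np.where(x > 0, grad, 0.0)

def redirected_relu_grad(x, grad):
    # Redirected variant: where the unit is inactive (x <= 0), still let the
    # gradient through if it would push the pre-activation upward, so the
    # unit can escape the dead zone during the first optimization steps.
    return np.where((x > 0) | ((x <= 0) & (grad < 0)), grad, 0.0)

x = np.array([-1.0, 2.0, -3.0])   # pre-activations; two units are "dead"
g = np.array([0.5, 0.5, -0.5])    # incoming gradient

print(relu_grad(x, g).tolist())             # → [0.0, 0.5, 0.0]
print(redirected_relu_grad(x, g).tolist())  # → [0.0, 0.5, -0.5]
```

Note the third unit: it is dead (x = -3), but the redirected gradient lets the escape-direction gradient through, while the standard ReLU gradient zeroes it out.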

ngonthier commented on May 17, 2024

Thanks for your answer. In my case the problem doesn't seem to be related to gradient not flowing, but perhaps to the fact that I convert a Keras model to a TensorFlow graph and then to a lucid Model in order to run the render function.

The problem is more that the filter doesn't respond as I expected. I am trying to get the same results with Inception_v1 as in the original Feature Visualization paper.
