
Comments (13)

ludwigschubert commented on May 17, 2024

Are you optimizing a tensor behind a ReLU by any chance?

from lucid.

JaspervDalen commented on May 17, 2024

In some places, but not everywhere. I am trying to make spritemaps for every layer, and at the moment I am doing it for every layer that does not give an error (no gradient), which includes the ones behind a ReLU. However, these are not the only ones. Is it not possible to optimise behind a ReLU deeper in a network? The starting layers do give correct outputs.

JaspervDalen commented on May 17, 2024

Ah, I found this:

One issue you will run into is that certain neurons you want to inspect won't activate from just the random noise you feed them, and if they don't activate then a ReLU non-linearity will block all the gradient, so you won't be able to optimize your input.

There are two ways around this: override non-linearity gradients to the identity op for the first couple optimization steps, or create new tensors in your graph that correspond to the tensors you want to optimize, but before their ReLU non-linearity.

We don't have either of those options automated so far. Look at lucid's googlenet/inception class to see how we "crawl" the graph and add _pre_relu nodes for tensors we're interested in.

I tried to see how you add pre_relu nodes, which I think you do here. However, this does not seem to work for MobileNets, mainly because of this line: tower = tower.op.inputs[0]. I am not sure what inputs[0] should be, as the only place that matches op.type == "Relu" is inputs[0] in MobileNet v1.
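For readers hitting the same wall, the crawl discussed above can be sketched minimally as follows. This is our own illustrative helper (the name `pre_relu` is ours), it assumes a TF1-style graph, and it only handles the plain case where the tensor is the direct output of a `Relu` op, not MobileNet's depthwise structure:

```python
import tensorflow as tf

def pre_relu(tensor):
    """If `tensor` is the output of a Relu op, step back to its input;
    otherwise return the tensor unchanged."""
    if tensor.op.type == "Relu":
        return tensor.op.inputs[0]
    return tensor

# Tiny demo graph: x -> add(+1) -> relu
graph = tf.Graph()
with graph.as_default():
    x = tf.compat.v1.placeholder(tf.float32, [1, 4], name="x")
    pre = tf.add(x, 1.0, name="pre_act")
    act = tf.nn.relu(pre, name="act")

print(pre_relu(act).op.name)  # prints "pre_act"
```

Lucid's inception class applies the same step while walking down a tower of ops, which is where the tower = tower.op.inputs[0] line comes from.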

JaspervDalen commented on May 17, 2024

I also wondered: for how many iterations do you run render_vis to create the spritemaps you have? Because the default is 512, which seems too few.

ludwigschubert commented on May 17, 2024

I just realized you're likely on the most current version, which does try its best to automatically override ReLU gradients during model import anyway—so that shouldn't even be the problem. (For reference: it's the relu_gradient_override=True parameter on render_vis.)

Number of iterations is just a hyperparameter—in Feature Visualization we reported 2048 steps, but this will depend on your architecture, optimizer, step size, parameterization, etc. Feel free to tune it.

On a more general level, note that interpretability techniques currently carry a lot of uncertainty when they don't work as expected: is the technique failing, or is the model genuinely behaving in weird ways? Progress on this is an open research problem and feels outside the scope of a GitHub issue. I'd always be happy to take a look at an easy reproduction in (e.g.) a Colab notebook. :-)

JaspervDalen commented on May 17, 2024

An example can be found at: https://colab.research.google.com/drive/1OFesr3ceAaPGmU_gNtDy1djn_9GZgI81

JaspervDalen commented on May 17, 2024

These are some images from the layers. As you can see, they become more and more grey, and more iterations or changing other variables doesn't seem to work for me.
  • conv1_conv2d
  • conv_dw_1_depthwise
  • conv_dw_2_depthwise
  • conv_dw_3_depthwise
  • conv_dw_4_depthwise
  • conv_dw_5_depthwise
  • conv_dw_6_depthwise
  • conv_dw_7_depthwise

ludwigschubert commented on May 17, 2024

Hey @JaspervDalen I took a long look—I could not find anything obviously wrong in your approach.
Our code has not been extensively tested on models with BatchNorm layers, and we haven't tested whether graphs imported from Keras behave subtly differently. We will follow up on both, but I unfortunately can't guarantee a timeline.

If you want to debug more yourself, you may find it useful to start by finding out why so little gradient reaches the input during optimization. Sorry I can't be of more help atm!

(I am skeptical of your image_value_range; it looks to me like Keras preprocesses it to (-1, 1) for tf models. This does not seem to solve your problem, though.)
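For concreteness, here is a tiny sketch of the (-1, 1) scaling mentioned above, which is what Keras' "tf" preprocessing mode does (the helper name is ours):

```python
import numpy as np

# Keras "tf" preprocessing mode (used by e.g. MobileNet):
# pixel values in [0, 255] are rescaled to the range (-1, 1).
def preprocess_tf_mode(x):
    return x / 127.5 - 1.0

print(preprocess_tf_mode(np.array([0.0, 127.5, 255.0])).tolist())  # → [-1.0, 0.0, 1.0]
```

If the model was trained with this scaling, then image_value_range=(-1, 1) would be the matching setting when constructing the lucid Model.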

JaspervDalen commented on May 17, 2024

Ok, thank you. The image_value_range is an artifact from my trying different ways of using lucid; when I first played with your library, I managed to get outputs in these layers. Since then, however, I have updated every library (keras, tf, lucid, np), so it could be a number of things. I tried going back, but it doesn't seem to work.

JaspervDalen commented on May 17, 2024

Just for extra information: I also tried to visualize the network using this method: https://blog.keras.io/how-convolutional-neural-networks-see-the-world.html
The same problem occurs at higher layers.

ngonthier commented on May 17, 2024

@JaspervDalen Did you manage to solve this problem? I am facing the same one after importing a graph from Keras into Lucid.

colah commented on May 17, 2024

When you visualize a ReLU neuron, it's possible that your initial image causes the neuron to take a zero value. When ReLU neurons are zero, no gradient flows through them, and so they never start to optimize.

The solution is to use a gradient override for the first few steps:
https://github.com/tensorflow/lucid/blob/master/lucid/misc/redirected_relu_grad.py

There are some ways this could still fail to work, including:

  • Your neuron uses a non-standard implementation of ReLU (i.e. not tf.nn.relu) and so doesn't get overridden
  • You are visualizing a non-relu neuron that can also have zero gradients (like a saturated sigmoid neuron)
  • The gradient override was only applied for a few steps, and not long enough to escape
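The effect of the override can be illustrated with a small numpy sketch. This is our own simplification of the idea behind lucid's redirected_relu_grad, assuming the usual minimization convention where an update is x -= lr * grad, so a negative gradient pushes the pre-activation upward:

```python
import numpy as np

def relu_grad(x, grad):
    # Standard ReLU backward pass: gradient is blocked wherever the
    # pre-activation is non-positive, so a "dead" unit never moves.
    return np.where(x > 0, grad, 0.0)

def redirected_relu_grad(x, grad):
    # Redirected variant: where the unit is inactive (x <= 0), still let the
    # gradient through if it would push the pre-activation upward, so the
    # unit can escape the dead zone during the first optimization steps.
    return np.where((x > 0) | ((x <= 0) & (grad < 0)), grad, 0.0)

x = np.array([-1.0, 2.0, -3.0])   # pre-activations; two units are "dead"
g = np.array([0.5, 0.5, -0.5])    # incoming gradient

print(relu_grad(x, g).tolist())             # → [0.0, 0.5, 0.0]
print(redirected_relu_grad(x, g).tolist())  # → [0.0, 0.5, -0.5]
```

Note the third unit: it is dead (x = -3), but the redirected gradient lets the escape-direction gradient through, while the standard ReLU gradient zeroes it out.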

ngonthier commented on May 17, 2024

Thanks for your answer. In my case the problem doesn't seem to be related to gradient not flowing, but perhaps to the fact that I convert a Keras model to a TensorFlow graph and then to a lucid Model in order to run the render function.

The problem is more that the filter doesn't respond as I expected. I am trying to get the same results with Inception_v1 as in the original Feature Visualization paper.
