
Comments (14)

elggem commented on July 3, 2024

I can confirm that with a linear layer and scaled data, the global loss decreases over time as expected! When I visualize the filters I still get essentially random noise (with rmse I get nice detectors) and the loss gets stuck at around 5.0, but that must be a problem with either the visualization or the parameters. I am aiming for filters like the ones in the paper you cite in the Readme (Vincent et al. 2010), page 3390.

For now I consider this issue as closed, thank you for the pointer in the right direction, it is much appreciated! 👍

Nilabhra commented on July 3, 2024

Hello,

Can you try the following and report your observations here?

  • A softmax activation almost never works in a neural network except at the output layer; see if 'relu', 'tanh' or 'sigmoid' resolves the issue.
  • Add another layer to the model (preferably with 50 or more units).
  • Try a smaller learning rate; 0.0001 should do just fine.
  • For a cross-entropy loss, the noise that is added is generally masking or salt-and-pepper noise. You can try noise='mask' and see if that resolves the issue.

Please let us know if any of the above worked; if not, I am still confident we can resolve the issue. A configuration along these lines is sketched below.
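For example (just a sketch; the parameter names follow the StackedAutoEncoder constructor used in this thread, and the values are purely illustrative):

model = StackedAutoEncoder(
    dims=[100, 50],
    activations=['relu', 'relu'],  # or 'tanh' / 'sigmoid'; avoid softmax in hidden layers
    noise='mask',                  # masking noise instead of gaussian
    epoch=[1000, 500],
    loss='cross-entropy',
    lr=0.0001,                     # smaller learning rate
    batch_size=150,
    print_step=100
)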

rajarsheem commented on July 3, 2024

@elggem: @Nilabhra has pushed a commit. Hope it solves your issue. Get back to us if you need any further help.

elggem commented on July 3, 2024

Hi, thank you for the quick reply! 😃

I did try different activations before; the loss values differ, but unfortunately the loss never converges towards zero. I hadn't tried 'tanh' before, but I tried it now and the behaviour is the same.

I don't see how adding another layer would alleviate the problem, since learning is done layer by layer; and if I do add a new layer, the same problem is present in that one, too.

Here's the code I just tried, together with the output:

model = StackedAutoEncoder(
    dims=[100, 50],
    activations=['sigmoid', 'sigmoid'], 
    noise='mask-0.4', 
    epoch=[1000, 500],
    loss='cross-entropy',
    lr=0.0001,
    batch_size=150,
    print_step=100
)
Layer 1
epoch 99: global loss = 1721.69958496
epoch 199: global loss = 1757.13647461
epoch 299: global loss = 1790.83044434
epoch 399: global loss = 1825.85583496
epoch 499: global loss = 1856.59729004
epoch 599: global loss = 1880.59521484
epoch 699: global loss = 1897.80639648
epoch 799: global loss = 1909.16601562
epoch 899: global loss = 1917.97753906
epoch 999: global loss = 1919.03967285
Layer 2
epoch 99: global loss = 1077.55822754
epoch 199: global loss = 1094.78894043
epoch 299: global loss = 1113.25756836
epoch 399: global loss = 1132.03845215
epoch 499: global loss = 1149.70874023

It does work with rmse, that's the weird part! Thanks again! :)

Nilabhra commented on July 3, 2024

This is weird indeed. Can you confirm if all of your data points are in the range [0, 1]?
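(For instance, a quick sanity check on the training array, assuming the MNIST data loaded as in the snippet below:)

# Confirm dtype and value range of the training images.
print(mnist.train.images.dtype)
print(mnist.train.images.min(), mnist.train.images.max())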

elggem commented on July 3, 2024

Yes, I'm using the MNIST import from TensorFlow; the data is float32 in the range 0.0 to 1.0. Here's the full code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from model import StackedAutoEncoder
import tensorflow as tf

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('data/mnist', one_hot=True)

import numpy as np
import model.utils as utils
from os.path import join as pjoin

model = StackedAutoEncoder(
    dims=[100],
    activations=['sigmoid'], 
    noise='mask-0.4', 
    epoch=[1000],
    loss='cross-entropy',
    lr=0.0001,
    batch_size=150,
    print_step=100
)

a = model.fit_transform(mnist.train.images)
print(a)

elggem commented on July 3, 2024

Oh and:

In [2]: mnist.train.images.shape
Out[2]: (55000, 784)

Nilabhra commented on July 3, 2024

I'll try to replicate this and come up with a fix as soon as I can. In the meantime, can you confirm if the problem persists without noise?


elggem commented on July 3, 2024

Yes, I just tried without noise and the problem still persists (in fact, the loss starts climbing even more quickly...).

Thank you so much for your support; any pointers or advice would be really helpful. 👍

I also tried this as the loss, but it yielded NaN results:

loss = tf.reduce_mean(-tf.reduce_sum(x_ * tf.log(decoded)))
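For reference, a clipped variant like the following should avoid the NaNs (they usually come from log(0)); this is just a sketch of the full binary cross-entropy, assuming decoded is sigmoid-activated, with TF 1.x-style argument names:

# Clip the reconstruction before the log so log(0) can't produce NaN.
eps = 1e-10
clipped = tf.clip_by_value(decoded, eps, 1.0 - eps)
loss = tf.reduce_mean(-tf.reduce_sum(
    x_ * tf.log(clipped) + (1.0 - x_) * tf.log(1.0 - clipped), axis=1))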

Nilabhra commented on July 3, 2024

I can personally confirm that an autoencoder works for MNIST under rmse loss. I never checked MNIST with cross entropy. I'll check if there is a bug in the code. If you want, you can check the code yourself for possible bugs.


elggem commented on July 3, 2024

Absolutely! I'm going to keep investigating and keep you posted. Thanks for your help, and don't worry if you can't spend any more time on this. :)

Nilabhra commented on July 3, 2024

I think I spotted the problem. The cross-entropy loss is computed with softmax_cross_entropy_loss(). Possible workarounds are:

  • Scale each row of your data so that it sums to 1, and use a linear layer.
  • Write a correct version of the cross-entropy function (see the sketch below).

I will fix this later tomorrow.
Thanks for bringing this to our notice.
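For the second workaround, a sigmoid cross-entropy reconstruction loss could look roughly like this (just a sketch; it assumes decoded holds the pre-sigmoid decoder outputs and x_ the clean input, with keyword names as in the TF 1.x API):

# Per-pixel sigmoid cross-entropy between the clean input and the decoder's
# pre-activation outputs, summed per example and averaged over the batch.
loss = tf.reduce_mean(tf.reduce_sum(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=x_, logits=decoded),
    axis=1))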

elggem commented on July 3, 2024

Hi @Nilabhra!

One more follow-up on the issue, since I couldn't get cross-entropy to work even with your fix: I noticed that line 111 of stacked_autoencoder.py reads:

decoded = tf.matmul(encoded, decode['weights']) + decode['biases']

That means the decoding operation is not influenced by the activation function. In the publication you cite in your repository I found the following passage: "[...] an affine+sigmoid encoder and either affine decoder with squared error loss or affine+sigmoid decoder with cross-entropy loss." I therefore changed that line to apply the same activation function as the encoder, and with either of the loss functions we tried above the losses no longer blow up to NaN. It seems I'm still not getting sensible weights even though the loss decreases, but it might be a step in the right direction. You don't need to take any action, I'm just keeping you posted.
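A minimal sketch of that change for the sigmoid case (variable names as in the snippet above):

# Apply the encoder's activation (here sigmoid) to the decoding step as well,
# instead of leaving the decoder purely affine.
decoded = tf.sigmoid(tf.matmul(encoded, decode['weights']) + decode['biases'])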

I've extended and reused parts of your code in my implementation of a deep learning architecture here, I hope that's OK! If you have any suggestions, they are more than welcome. Thank you for your support on this issue, and great work! 👍

Nilabhra commented on July 3, 2024

You are correct; we deviated from the publication by not introducing a non-linearity in the decoding step and also by tying the encoding and decoding weight matrices. Our decision was based on observations from the experiments we did.

For cross entropy loss you can try the following:

  1. Tied weights with sigmoid activation for both the encoding and the decoding phase (see the sketch below).
  2. Separate weights with sigmoid activation for both the encoding and the decoding phase.

I suggest not using softmax, in case you are doing so.
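For the tied-weights option, a rough sketch (encode['weights'], encode['biases'] and x are hypothetical names mirroring the decode dictionary quoted above; the decoder reuses the transposed encoder matrix):

# Tied weights: the decoder uses the transpose of the encoder's weight matrix,
# with a sigmoid non-linearity in both phases.
encoded = tf.sigmoid(tf.matmul(x, encode['weights']) + encode['biases'])
decoded = tf.sigmoid(tf.matmul(encoded, tf.transpose(encode['weights'])) + decode['biases'])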
