config for Adam (vae-torch issue, closed)

y0ast commented on July 21, 2024
config for Adam

from vae-torch.

Comments (8)

AjayTalati commented on July 21, 2024

Update - the adam optimizer was fixed yesterday - it works now, with default parameters on the Rosenbrock test problem.

I'm testing it now to see if it gives an improvement.

y0ast commented on July 21, 2024

Great! Curious to see your result.

AjayTalati commented on July 21, 2024

Hi Joost,

Unfortunately I haven't managed to get any convergence yet with Adam, over a range of different config parameters.

Looking at figure 4 (a) of the Adam paper, it seems that

beta_1 = 0.1
beta_2 = 0.0001
alpha = 0.002
epsilon = 1e-8
lambda = 1 - 1e-8

with the model they state,

dim_hidden = 50
hidden_units_encoder = 500
hidden_units_decoder = 500

should get convergence after 10 epochs, but I can't reproduce their results.

y0ast commented on July 21, 2024

Clearly the result is much better after 100 epochs (figure 4 b), so those figures do not indicate convergence. They show the value of the negLL after a set number of epochs for different learning rates (x-axis) and illustrate the necessity of the bias-correction factor.

I am not sure exactly how many epochs are necessary with Adam; I never tested that.

AjayTalati commented on July 21, 2024

Hi, yes, sorry I was sloppy with my language, apologies.

Basically I've tried all the grid points they mention in their paper, i.e. the bias correction terms beta_1 and beta_2 and learning rate - but I still get the negLL after 10 epochs to be about

-4e+155

i.e. basically -infinity. Maybe you want to give it a try? If you pull the latest optim module

luarocks install optim

I think it's then just a task of

i) changing your while loop to a for loop, running for 10 epochs, and writing the last negLL to a table

ii) Constructing a small grid and iterating your above code over the grid points.
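The two steps above can be sketched roughly as follows. This is written in Python rather than Lua purely for illustration. `train_for_epochs` is a hypothetical stand-in for the real training loop, the grid values are placeholders rather than the paper's grid, and the `1e100` divergence threshold is an arbitrary assumption chosen to catch values like the -4e+155 above.

```python
import itertools
import math

def train_for_epochs(config, epochs):
    """Hypothetical stand-in for the real training loop.

    Replace this with code that trains the model for `epochs` epochs
    using `config` and returns the last negLL. The formula below is a
    toy so that the sketch runs; it is not a real result.
    """
    return (math.log10(config["alpha"]) + 3.0) ** 2 + config["beta_1"] + config["beta_2"]

def is_diverged(value, limit=1e100):
    # Treat NaN, +/-inf and absurdly large magnitudes as divergence.
    return (not math.isfinite(value)) or abs(value) > limit

# Placeholder grid: a few values per hyper-parameter (assumed, not from the paper).
grid = {
    "alpha": [0.0002, 0.002, 0.02],
    "beta_1": [0.0, 0.1],
    "beta_2": [0.0001, 0.001],
}

# Step i): fixed number of epochs, last negLL written to a table.
# Step ii): the same run repeated over every grid point.
results = []
for values in itertools.product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    final_negll = train_for_epochs(config, epochs=10)
    results.append({
        "config": config,
        "negLL": final_negll,
        "diverged": is_diverged(final_negll),
    })

for row in results:
    print(row["config"], row["negLL"], "DIVERGED" if row["diverged"] else "")
```

Flagging diverged runs explicitly, instead of just taking the smallest negLL, avoids picking a run that has blown up towards -infinity as the "best" grid point.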

If you really want to be fancy, there's a Bayesian optimization package on git called spearmint, which uses an iterative Gaussian process scheme for continuous hyper-parameter optimization. It's coded in Python. I'm trying it now.

I'm guessing if we both can't get it to work, it might be a problem with the adam.lua code?

AjayTalati commented on July 21, 2024

I'm not sure the adam.lua code in the optim package works.

Maybe it's best to try to use Dirk Kingma's adam implementation here,

https://github.com/dpkingma/nips14-ssl/blob/master/adam.py

with your Theano implementation?

y0ast commented on July 21, 2024

I have tried adam before (in Theano) and it works nicely, so this does indeed sound like a bug, either in my code or in the adam implementation.

I will take a look at it tomorrow.

y0ast commented on July 21, 2024

Fixed by setting negative learning rate (gradient ascent)
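A minimal sketch of why the sign matters, assuming the quantity being optimized is one that should be maximized while the optimizer is written as a minimizer. The `adam_step` below is a plain scalar Adam with the commonly published defaults (`beta1=0.9`, `beta2=0.999`); it is not the code from the optim package, and `lower_bound` is a toy stand-in for the real objective.

```python
import math

def adam_step(x, grad, state, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar, written as a minimizer: x <- x - lr * step."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** state["t"])  # bias-corrected second moment
    return x - lr * m_hat / (math.sqrt(v_hat) + eps)

def lower_bound(x):
    # Toy objective to be MAXIMIZED; its peak is at x = 3.
    return -(x - 3.0) ** 2

def lower_bound_grad(x):
    return -2.0 * (x - 3.0)

def run(lr, steps=3000, x0=0.0):
    x = x0
    state = {"t": 0, "m": 0.0, "v": 0.0}
    for _ in range(steps):
        x = adam_step(x, lower_bound_grad(x), state, lr)
    return x, lower_bound(x)

# Positive learning rate: the minimizer walks downhill, away from the peak.
x_desc, f_desc = run(lr=0.01)
# Negative learning rate: the same update becomes gradient ascent.
x_asc, f_asc = run(lr=-0.01)

print("descent:", x_desc, f_desc)
print("ascent: ", x_asc, f_asc)
```

Negating the objective and its gradient while keeping a positive learning rate would have the same effect; flipping the sign of the learning rate is just the smaller change.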
