Comments (8)
Update: the adam optimizer was fixed yesterday. It now works with default parameters on the Rosenbrock test problem.
I'm testing it now to see if it gives an improvement.
from vae-torch.
Great! Curious to see your result.
Hi Joost,
Unfortunately I haven't managed to get any convergence yet with adam, over a range of different config parameters.
Looking at Figure 4(a) of the Adam paper, it seems that
beta_1 = 0.1
beta_2 = 0.0001
alpha = 0.002
epsilon = 1e-8
lambda = 1 - 1e-8
with the model they state,
dim_hidden = 50
hidden_units_encoder = 500
hidden_units_decoder = 500
should get convergence after 10 epochs, but I can't reproduce their results.
Clearly the result is much better after 100 epochs (Figure 4(b)), so those figures do not indicate convergence. They show the value of the negLL after a set number of epochs for different learning rates (x-axis) and illustrate the necessity of the bias-correction factor.
I am not sure exactly how many epochs are necessary with Adam, I never tested that.
Hi, yes, sorry, I was sloppy with my language, apologies.
Basically I've tried all the grid points they mention in their paper, i.e. the bias correction terms beta_1 and beta_2 and the learning rate, but I still get a negLL after 10 epochs of about
-4e+155
i.e. basically -infinity. Maybe you want to give it a try? If you pull the latest optim module,
luarocks install optim
I think it's then just a task of
i) changing your while loop to a for loop, running for 10 epochs, and writing the last negLL to a table
ii) Constructing a small grid and iterating your above code over the grid points.
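The two steps above could be sketched roughly as follows, in Python rather than Torch. This is only an illustration: `grid_search`, `dummy_train` and the grid values are hypothetical, and `dummy_train` is a placeholder that returns a made-up score where the real code would train for the given number of epochs and return the last negLL.

```python
import itertools


def grid_search(train_fn, grid, n_epochs=10):
    """Run train_fn once per grid point and collect (config, final negLL)."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        # train_fn is assumed to run a fixed number of epochs (a for loop,
        # not a while loop) and return the negLL of the last epoch.
        results.append((config, train_fn(config, n_epochs)))
    return results


def dummy_train(config, n_epochs):
    # Placeholder only: NOT a real VAE training run. It returns an
    # artificial score so that the sketch is runnable end to end.
    return (config["alpha"] - 0.002) ** 2 + (config["beta_1"] - 0.1) ** 2


# Hypothetical grid; replace with the grid points from the paper.
grid = {
    "alpha": [0.0005, 0.002, 0.01],
    "beta_1": [0.01, 0.1],
    "beta_2": [0.0001, 0.001],
}

results = grid_search(dummy_train, grid)
best_config, best_score = min(results, key=lambda item: item[1])
print(len(results), best_config, best_score)
```

Swapping `dummy_train` for a function that wraps the actual training loop would give one final negLL per grid point in `results`.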
If you really want to be fancy, there's a Bayesian optimization package on git called spearmint, which uses an iterative Gaussian process scheme for continuous hyper-parameter optimization. It's coded in Python. I'm trying it now.
I'm guessing if we both can't get it to work, it might be a problem with the adam.lua code?
I'm not sure the adam.lua code in the optim package works.
Maybe it's best to try to use Dirk Kingma's adam implementation here,
https://github.com/dpkingma/nips14-ssl/blob/master/adam.py
with your theano implementation?
I have tried adam before (in Theano) and it works nicely, so this does indeed sound like a bug, either in my code or in the adam implementation.
I will take a look at it tomorrow.
Fixed by setting a negative learning rate (gradient ascent).
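A minimal sketch of why the sign matters, assuming the optimizer applies a descent-style update `x - lr * step`. This is plain Python using the commonly published Adam update with its usual defaults, not the adam.lua code from optim, and the objective f(x) = -(x - 3)^2 is a hypothetical stand-in for a quantity that should be maximized.

```python
import math


def adam_step(x, grad, state, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update written as a descent step: x - lr * m_hat / sqrt(v_hat)."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** state["t"])  # bias-corrected second moment
    return x - lr * m_hat / (math.sqrt(v_hat) + eps)


def run(lr, steps=2000, x0=0.0):
    state = {"t": 0, "m": 0.0, "v": 0.0}
    x = x0
    for _ in range(steps):
        # Gradient of the toy objective f(x) = -(x - 3)^2, maximized at x = 3.
        grad = -2.0 * (x - 3.0)
        x = adam_step(x, grad, state, lr)
    return x


x_ascent = run(lr=-0.01)   # negative rate: the update climbs towards x = 3
x_descent = run(lr=0.01)   # positive rate: the update moves away from x = 3
print(x_ascent, x_descent)
```

With a descent-style update, feeding in the gradient of something you want to maximize needs either a negative learning rate, as here, or a negated objective and gradient with a positive rate.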