Comments (2)
So the idea of algorithm 2 is to perform variational inference on the parameters of the decoder too.
To implement this, it is necessary to create a new module that reparametrizes theta and then sets the parameters of the decoding layer, or perhaps to add ζ as a parameter to each module in the decoding layer. This would not be a trivial change to the code, and I doubt it will improve the negLL by much. Feel free to submit a PR and I will help you out where necessary.
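As a rough illustration of the reparametrization idea (not code from this repo), here is a minimal pure-Python sketch, assuming a diagonal Gaussian posterior over the weights of a single linear decoding layer and a standard normal prior. The names `sample_weights`, `linear` and `kl_to_standard_normal` are made up for the example; the noise drawn inside `sample_weights` plays the role of ζ.

```python
import math
import random


def sample_weights(mu, log_sigma, rng):
    """Draw theta = mu + sigma * zeta elementwise, with zeta ~ N(0, 1)."""
    theta = []
    for mu_row, ls_row in zip(mu, log_sigma):
        theta.append([m + math.exp(ls) * rng.gauss(0.0, 1.0)
                      for m, ls in zip(mu_row, ls_row)])
    return theta


def linear(theta, x):
    """Apply the sampled weight matrix to an input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in theta]


def kl_to_standard_normal(mu, log_sigma):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over all weights.

    Assumes a standard normal prior on the weights.
    """
    total = 0.0
    for mu_row, ls_row in zip(mu, log_sigma):
        for m, ls in zip(mu_row, ls_row):
            total += 0.5 * (math.exp(2.0 * ls) + m * m - 1.0 - 2.0 * ls)
    return total


# Hypothetical variational parameters for a 2x2 decoding layer.
rng = random.Random(0)
mu = [[0.1, -0.2], [0.3, 0.0]]
log_sigma = [[-2.0, -2.0], [-2.0, -2.0]]

theta = sample_weights(mu, log_sigma, rng)   # fresh weights per forward pass
out = linear(theta, [1.0, 2.0])
penalty = kl_to_standard_normal(mu, log_sigma)
```

In a real implementation the gradients would flow to `mu` and `log_sigma` rather than to the sampled weights, and the KL term would be added to the lower bound alongside the usual KL on the latent variables.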
from vae-torch.
Hi Joost,
Thanks a lot for the helpful reply!
I am working on this now; there's some nicely documented Theano code which does what you describe. It's not so bad, as I have some background in stochastics/Monte Carlo. I'll submit a PR in the next few days.
Pylearn2 hasn't implemented the Adam optimizer yet, so that Theano code only gets a negLL of about 94. So it seems that stochastic sampling (full VB) gets an improvement of 7, and Adam gets a further improvement of about 7.
There's a new paper, DRAW, which uses a recurrent encoder network to selectively read patches of the input, and a recurrent decoder network to selectively write/deposit probability mass to regions of a canvas matrix. The canvas matrix can then be used to generate/reconstruct output. They state they get a negLL of 81, which is the world's best at the moment. Without selective read/write, the DRAW model is basically the same as the Welling/Kingma model, except with LSTM encoder/decoder networks, and it gets 87 negLL.
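To make the canvas idea concrete, here is a minimal pure-Python sketch of the version without selective read/write, as I understand it: each step adds a linear function of the decoder state to the canvas, and the output probabilities are the sigmoid of the final canvas. The decoder states are plain placeholder vectors rather than LSTM outputs, the canvas starts at zero for simplicity, and all function names are made up for the example.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def read_no_attention(x, canvas):
    """Return the input concatenated with the error image x - sigmoid(canvas)."""
    x_hat = [xi - sigmoid(ci) for xi, ci in zip(x, canvas)]
    return x + x_hat


def write_no_attention(W, h_dec):
    """Linear map from a decoder state to a canvas-sized update."""
    return [sum(w * h for w, h in zip(row, h_dec)) for row in W]


def run_canvas(W, decoder_states, canvas_size):
    """Accumulate additive writes over all steps, then squash to probabilities."""
    canvas = [0.0] * canvas_size  # assumption: zero-initialised canvas
    for h_dec in decoder_states:
        update = write_no_attention(W, h_dec)
        canvas = [c + u for c, u in zip(canvas, update)]
    return canvas, [sigmoid(c) for c in canvas]


# Placeholder write matrix (3 canvas cells, 2 decoder units) and two steps.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = [[1.0, 2.0], [3.0, 4.0]]
canvas, probs = run_canvas(W, states, canvas_size=3)
```

The selective version would replace `read_no_attention` and `write_no_attention` with attention-based reads and writes over patches; the accumulation loop stays the same.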
With your experience from your Variational Recurrent Auto-Encoders work, you have much of the background needed to produce the DRAW result. So perhaps we could work on this together, if you are interested?
Coding a variational recurrent auto-encoder which uses importance sampling to perform VI on the parameters of both the encoder/read mechanism and the decoder/write mechanism seems like the first step in building this?
Feel free to email me if you like.
All the best,
Aj