Nevermind, I got it:
checking section 3.2 of the paper, paragraph: Residual Normal Distributions.
In the code, self.mu and self.sigma are the parameters of the posterior distribution.
prior.mu and prior.sigma are the parameters of the prior distribution.
p(z_i|z_{l<i}) is the prior, defined as N(μ_p, σ_p) where both params are conditioned on all z_{l<i}
q(z_i|z_{l<i}, x) is the distribution from the encoder (self), defined as:
q = N(μ_p + Δμ_q, σ_p * Δσ_q), where Δμ_q, Δσ_q are the relative shift and scale given by the hierarchical nature of the distribution.
So basically, self.mu and self.sigma are the parameters of the posterior:
self.mu = μ_p + Δμ_q
self.sigma = σ_p * Δσ_q
The KL divergence KL(a || b) between two normal distributions a = N(μ_1, σ_1), b = N(μ_2, σ_2) is given by:
KL(a || b) = 0.5 [ (μ_2 - μ_1)**2 / σ_2**2 ] + 0.5 (σ_1**2 / σ_2**2) - 0.5 [ln(σ_1**2 / σ_2**2)] - 0.5
proof: https://statproofbook.github.io/P/norm-kl.html
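As a sanity check on the closed form above, here is a minimal sketch that compares it against a direct numerical integration of a(x) * ln(a(x) / b(x)). The function names (kl_closed_form, log_normal_pdf, kl_numeric) and the test values are my own choices for illustration, not anything from the NVAE code.

```python
import math

def kl_closed_form(mu1, sigma1, mu2, sigma2):
    """Closed-form KL(N(mu1, sigma1) || N(mu2, sigma2)) for scalar Gaussians."""
    return (0.5 * (mu2 - mu1) ** 2 / sigma2 ** 2
            + 0.5 * sigma1 ** 2 / sigma2 ** 2
            - 0.5 * math.log(sigma1 ** 2 / sigma2 ** 2)
            - 0.5)

def log_normal_pdf(x, mu, sigma):
    """Log-density of N(mu, sigma) at x (sigma is the standard deviation)."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(sigma)
            - 0.5 * ((x - mu) / sigma) ** 2)

def kl_numeric(mu1, sigma1, mu2, sigma2, steps=20000, width=12.0):
    """Midpoint-rule estimate of the integral of a(x) * (ln a(x) - ln b(x))."""
    lo = mu1 - width * sigma1           # integrate over +/- width std devs of a
    h = 2.0 * width * sigma1 / steps    # step size
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        log_a = log_normal_pdf(x, mu1, sigma1)
        log_b = log_normal_pdf(x, mu2, sigma2)
        total += math.exp(log_a) * (log_a - log_b) * h
    return total

# Arbitrary illustrative values (assumptions, not taken from the paper).
closed = kl_closed_form(0.5, 0.8, -0.2, 1.3)
numeric = kl_numeric(0.5, 0.8, -0.2, 1.3)
print(closed, numeric)
```

The two printed numbers should agree to several decimal places, and the closed form should vanish when both distributions are identical.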
In our case: μ_1 = self.mu; μ_2 = prior.mu; σ_1 = self.sigma; σ_2 = prior.sigma
So the first three terms in the formula above become (the third one enters with a minus sign):
1. 0.5 [ (μ_p - (μ_p + Δμ_q))**2 / σ_p**2 ] = 0.5 [ Δμ_q**2 / σ_p**2 ]
2. 0.5 ((σ_p * Δσ_q)**2 / σ_p**2) = 0.5 [Δσ_q**2]
3. 0.5 [ln((σ_p * Δσ_q)**2 / σ_p**2)] = 0.5 ln(Δσ_q**2)
Putting the terms together, the KL is 0.5 [ Δμ_q**2 / σ_p**2 + Δσ_q**2 - ln(Δσ_q**2) - 1 ], which is the one written in Equation 2 of the paper, where (in the code):
Δμ_q = self.mu - prior.mu
Δσ_q = self.sigma / prior.sigma
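To double-check the algebra, here is a small sketch in plain Python (scalars only, no torch) showing that the residual form gives the same number as the general two-Gaussian KL. The function names and the numeric values are hypothetical and only for illustration; this is not the repo's actual kl method.

```python
import math

def kl_general(mu1, sigma1, mu2, sigma2):
    """General KL(N(mu1, sigma1) || N(mu2, sigma2)) for scalar Gaussians."""
    return (0.5 * (mu2 - mu1) ** 2 / sigma2 ** 2
            + 0.5 * sigma1 ** 2 / sigma2 ** 2
            - 0.5 * math.log(sigma1 ** 2 / sigma2 ** 2)
            - 0.5)

def kl_residual(delta_mu, delta_sigma, sigma_p):
    """Residual form: 0.5 * (dmu^2 / sigma_p^2 + dsigma^2 - ln(dsigma^2) - 1)."""
    return 0.5 * (delta_mu ** 2 / sigma_p ** 2
                  + delta_sigma ** 2
                  - math.log(delta_sigma ** 2)
                  - 1.0)

# Hypothetical prior parameters and relative shift / scale.
mu_p, sigma_p = 0.3, 1.7
shift, scale = -0.4, 0.6

# Posterior parameters, playing the role of self.mu and self.sigma.
mu_q = mu_p + shift
sigma_q = sigma_p * scale

# Recover the residuals the same way as in the two lines above.
delta_mu = mu_q - mu_p
delta_sigma = sigma_q / sigma_p

kl_a = kl_general(mu_q, sigma_q, mu_p, sigma_p)
kl_b = kl_residual(delta_mu, delta_sigma, sigma_p)
print(kl_a, kl_b)  # the two values should match up to floating-point error
```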
from nvae.