probabilists / lampe
Likelihood-free AMortized Posterior Estimation with PyTorch
Home Page: https://lampe.readthedocs.io
License: MIT License
Hi,
Thanks for the wonderful toolkit for simulation-based inference. I am learning it and found it very helpful with my work.
I took a look at the examples in the tutorials and at some of the functions, and found that all of them deal with static data in vector form. I wonder whether the toolkit can handle time-series data, for instance data of shape m by n by k, where m is the number of datasets, n the number of time steps, and k the number of features.
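For illustration, here is a toy sketch of the data layout I have in mind (not existing LAMPE code; the flattening at the end is only a guess at a possible workaround):
import torch

m, n, k = 1024, 100, 4     # number of datasets, time steps, features per step
x = torch.randn(m, n, k)   # batch of simulated time series
theta = torch.randn(m, 3)  # corresponding parameters, one vector per dataset

# Is flattening the time axis the intended approach, or should an embedding network be used?
x_flat = x.reshape(m, n * k)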
Best,
Jice
When computing the log probability with FMPE's log_prob method, the resulting log-probability values depend on the other elements of the input batch. The change I saw was on the order of the third or fourth decimal place.
In any case, thanks already a lot for your work on LAMPE
Following the example, the two ways to compute log probabilities for a given configuration theta and a batch of corresponding simulated results x produce different results:
from itertools import islice
import torch
import torch.nn as nn
import torch.optim as optim
import zuko
from lampe.data import JointLoader
from lampe.inference import FMPE, FMPELoss
from lampe.utils import GDStep
from tqdm import tqdm
LABELS = [r"$\theta_1$", r"$\theta_2$", r"$\theta_3$"]
LOWER = -torch.ones(3)
UPPER = torch.ones(3)
prior = zuko.distributions.BoxUniform(LOWER, UPPER)
def simulator(theta: torch.Tensor) -> torch.Tensor:
    x = torch.stack(
        [
            theta[..., 0] + theta[..., 1] * theta[..., 2],
            theta[..., 0] * theta[..., 1] + theta[..., 2],
        ],
        dim=-1,
    )
    return x + 0.05 * torch.randn_like(x)
theta = prior.sample()
x = simulator(theta)
loader = JointLoader(prior, simulator, batch_size=256, vectorized=True)
estimator = FMPE(3, 2, hidden_features=[64] * 5, activation=nn.ELU)
loss = FMPELoss(estimator)
optimizer = optim.AdamW(estimator.parameters(), lr=1e-3)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, 128)
step = GDStep(optimizer, clip=1.0) # gradient descent step with gradient clipping
estimator.train()
with tqdm(range(128), unit="epoch") as tq:
    for epoch in tq:
        losses = torch.stack(
            [
                step(loss(theta, x))
                for theta, x in islice(loader, 256)  # 256 batches per epoch
            ]
        )
        tq.set_postfix(loss=losses.mean().item())
        scheduler.step()
theta_star = prior.sample()
X = torch.stack([simulator(theta_star) for _ in range(10)])
estimator.eval()
with torch.no_grad():
    # e.g. [3.1956, 1.8184, 2.4533, 1.6461, 3.0488, 2.5868, 2.7055, 2.7679, 3.3405, 1.5554]
    log_p_one_batch = estimator.flow(X).log_prob(theta_star.repeat(len(X), 1))
    # e.g. [3.1978, 1.8175, 2.4526, 1.6468, 3.0495, 2.5894, 2.7065, 2.7712, 3.3385, 1.5558]
    log_p_individual = [estimator.flow(x).log_prob(theta_star) for x in X]
I would expect the individual log-probability values for one theta and x pair not to be affected by the other entries in the X batch. This is corroborated by the official implementation, which does not show that behaviour when evaluating log_prob_batch with different subsets of the batch. In the above example, I would expect both calls to result in e.g. [3.1978, 1.8175, 2.4526, 1.6468, 3.0495, 2.5894, 2.7065, 2.7712, 3.3385, 1.5558].
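A minimal check of what I mean, reusing estimator, X and theta_star from the script above (the tolerance is arbitrary):
with torch.no_grad():
    full_batch = estimator.flow(X).log_prob(theta_star.repeat(len(X), 1))
    one_by_one = torch.stack([estimator.flow(x).log_prob(theta_star) for x in X])

    # I would expect this to print True, but it fails by roughly 1e-3 in my runs.
    print(torch.allclose(full_batch, one_by_one, atol=1e-5))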
I have no clear intuition for why that would be the case. I suspected a stochastic influence and thought that the FreeFormJacobianTransform exact mode might help, but the difference appears to be deterministic, and setting exact=True did not change it.
I noticed that the LAMPE implementation uses a trigonometric embedding of the time dimension for the vector field computation, whereas the official implementation by the authors does not, but it is also not obvious to me that this would explain the difference.
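For context, a generic sinusoidal time embedding of the kind I am referring to (just an illustration of the idea; the function name and frequencies are mine, not LAMPE's actual code):
import torch

def time_embedding(t: torch.Tensor, features: int = 64) -> torch.Tensor:
    # Map a scalar time t in [0, 1] to sine/cosine features at geometrically spaced frequencies.
    freqs = torch.pi * 2.0 ** torch.arange(features // 2)
    angles = t[..., None] * freqs
    return torch.cat((angles.sin(), angles.cos()), dim=-1)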
Given the recent popularity of score-based generative modeling, it would be great to provide a score-based inference algorithm within LAMPE.
An NSE (neural score estimation) class, an NSELoss module, and a way to sample from the trained score estimator should be provided. Access to the log-density (through the probability flow ODE) is not mandatory, but would be convenient.
Deep Unsupervised Learning using Nonequilibrium Thermodynamics (Sohl-Dickstein et al., 2015)
https://arxiv.org/abs/1503.03585
Generative Modeling by Estimating Gradients of the Data Distribution (Song et al., 2019)
https://arxiv.org/abs/1907.05600
Denoising Diffusion Probabilistic Models (Ho et al., 2020)
https://arxiv.org/abs/2006.11239
Score-Based Generative Modeling through Stochastic Differential Equations (Song et al., 2021)
https://arxiv.org/abs/2011.13456
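To make the request more concrete, here is a rough sketch of what the core training objective could look like, following the denoising score-matching / noise-prediction formulation of the papers above (class and argument names are mine, not an existing or proposed LAMPE API):
import torch
import torch.nn as nn

class DenoisingScoreLoss(nn.Module):
    # Hypothetical sketch: `estimator(theta_t, x, t)` predicts the noise added to theta.
    def __init__(self, estimator: nn.Module):
        super().__init__()
        self.estimator = estimator

    def forward(self, theta: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        t = torch.rand(theta.shape[:-1], device=theta.device)  # random diffusion times in (0, 1)
        alpha = torch.cos(t * torch.pi / 2)[..., None]  # simple variance-preserving schedule
        sigma = torch.sin(t * torch.pi / 2)[..., None]
        eps = torch.randn_like(theta)
        theta_t = alpha * theta + sigma * eps  # perturbed parameters
        return (self.estimator(theta_t, x, t) - eps).square().mean()
Sampling would then integrate the reverse SDE (or the probability flow ODE when log-densities are needed), analogously to how FMPE integrates its learned vector field.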
Implementation of Balanced Neural Ratio Estimation (BNRE) as introduced in "Towards Reliable Simulation-Based Inference with Balanced Neural Ratio Estimation" (Delaunoy et al., 2022)
An implementation of the BNRE loss, similar to the existing NRELoss. No alternatives were considered.
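For reference, a rough sketch of the balancing term from the paper on top of a standard NRE-style binary cross-entropy (names and structure are my own, not a proposed final API):
import torch
import torch.nn as nn

class BalancedNRELoss(nn.Module):
    # Hypothetical sketch: `estimator(theta, x)` returns the log-ratio log r(theta, x).
    def __init__(self, estimator: nn.Module, lmbda: float = 100.0):
        super().__init__()
        self.estimator = estimator
        self.lmbda = lmbda

    def forward(self, theta: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        theta_prime = theta[torch.randperm(theta.shape[0])]  # break the (theta, x) pairing
        log_r_joint = self.estimator(theta, x)  # dependent pairs
        log_r_marginal = self.estimator(theta_prime, x)  # independent pairs

        # Standard NRE binary cross-entropy.
        bce = nn.functional.softplus(-log_r_joint).mean() + nn.functional.softplus(log_r_marginal).mean()

        # Balancing condition of BNRE: E[d(theta, x)] + E[d(theta', x)] should equal 1.
        balance = torch.sigmoid(log_r_joint).mean() + torch.sigmoid(log_r_marginal).mean()

        return bce + self.lmbda * (balance - 1).square()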
The documentation of the build constructor argument in NRE/NPE/NSE specifies
build: Callable[[int, int], nn.Module] = MLP
and
build: The network constructor
In my opinion, this does not really help in understanding how this constructor argument should be used, as there is no mention of what the two ints refer to. It is also not clear what the resulting nn.Module should expect as inputs and produce as outputs.
In the tutorial about embeddings, it could also be helpful to explain how the network produced with build differs from the network used as an embedding.
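As an example of what would help: a documented snippet along these lines, with the meaning of the two ints spelled out. My (possibly wrong) understanding for NRE is that they are the numbers of input and output features of the classifier; the names below are made up for illustration:
import torch.nn as nn

def my_build(in_features: int, out_features: int) -> nn.Module:
    # What do in_features and out_features correspond to for NRE, NPE and NSE respectively?
    return nn.Sequential(
        nn.Linear(in_features, 128),
        nn.SiLU(),
        nn.Linear(128, out_features),
    )

# estimator = NRE(3, 2, build=my_build)
# Is the resulting module then called on torch.cat((theta, x), dim=-1)?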
The method described in the paper "Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability".
A new loss class in the inference module, similar to the NRELoss -> BNRELoss enhancement.
Two separate classes, one for NRE, and one for NPE.
I did not consider alternatives.
When the domain is one-dimensional, lampe.utils.gridapply builds a grid of shape (bins,) instead of the expected (bins, 1).
In the following error, mat1 is x and should be of shape (128, 1).
>>> import torch
>>> import lampe
>>> A = torch.randn(1, 3)
>>> f = lambda x: x @ A
>>> domain = torch.zeros(1), torch.ones(1)
>>> lampe.utils.gridapply(f, domain)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/francois/Documents/Git/lampe/lampe/utils.py", line 104, in gridapply
    y = [f(x) for x in grid.split(batch_size)]
  File "/home/francois/Documents/Git/lampe/lampe/utils.py", line 104, in <listcomp>
    y = [f(x) for x in grid.split(batch_size)]
  File "<stdin>", line 1, in <lambda>
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x128 and 1x3)
The grid should be of shape (bins, 1).
The lampe.utils.gridapply function uses torch.cartesian_prod, which behaves inconsistently when given a single argument. A reshape should be enough to fix the issue.
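A sketch of the kind of fix I mean, showing the inconsistency and the reshape (the exact placement in utils.py may differ):
import torch

coordinates = [torch.linspace(0.0, 1.0, 128)]  # one tensor per dimension, here a single one
grid = torch.cartesian_prod(*coordinates)

# With a single argument, cartesian_prod returns a tensor of shape (bins,)
# instead of (bins, 1), which causes the shape error above.
grid = grid.reshape(len(grid), -1)  # always (bins, number of dimensions)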
The simulator tutorial references JointDataset, which does not exist. It should probably be updated to IterableJointDataset or JointLoader.