
probabilists / lampe


Likelihood-free AMortized Posterior Estimation with PyTorch

Home Page: https://lampe.readthedocs.io

License: MIT License

Python 100.00%
bayesian density-estimation density-ratio-estimation inference likelihood-free-inference normalizing-flows probability simulation-based-inference python pytorch

lampe's Introduction

LAMPE's banner

LAMPE

LAMPE is a simulation-based inference (SBI) package that focuses on amortized estimation of posterior distributions, without relying on explicit likelihood functions; hence the name Likelihood-free AMortized Posterior Estimation (LAMPE). The package provides PyTorch implementations of modern amortized simulation-based inference algorithms, such as neural ratio estimation (NRE), neural posterior estimation (NPE), and more. In line with PyTorch's philosophy, LAMPE avoids obfuscation and exposes all components, from the network architecture to the optimizer, so that users are free to modify or replace anything they like.

As part of the inference pipeline, lampe provides components to efficiently store and load data from disk, diagnose predictions and display results graphically.
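For illustration, here is a minimal sketch of the intended workflow. The prior, simulator, and hyperparameters below are toy placeholders, not an official example from the package.

import torch
import torch.nn as nn
import torch.optim as optim
import zuko

from itertools import islice
from lampe.data import JointLoader
from lampe.inference import NPE, NPELoss
from lampe.utils import GDStep

# Toy prior and simulator standing in for a real problem
prior = zuko.distributions.BoxUniform(-torch.ones(3), torch.ones(3))

def simulator(theta: torch.Tensor) -> torch.Tensor:
    x = torch.stack([theta[..., 0] + theta[..., 1], theta[..., 1] * theta[..., 2]], dim=-1)
    return x + 0.05 * torch.randn_like(x)

loader = JointLoader(prior, simulator, batch_size=256, vectorized=True)

estimator = NPE(3, 2, hidden_features=[64] * 3, activation=nn.ELU)
loss = NPELoss(estimator)
optimizer = optim.AdamW(estimator.parameters(), lr=1e-3)
step = GDStep(optimizer, clip=1.0)  # gradient descent step with gradient clipping

estimator.train()

for theta, x in islice(loader, 256):  # 256 batches of simulations
    step(loss(theta, x))

# Once trained, the posterior is amortized: sample p(theta | x*) for any observation x*
estimator.eval()

with torch.no_grad():
    theta_star = prior.sample()
    x_star = simulator(theta_star)
    samples = estimator.flow(x_star).sample((1024,))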

Installation

The lampe package is available on PyPI and can therefore be installed via pip.

pip install lampe

Alternatively, if you need the latest features, you can install it from the repository.

pip install git+https://github.com/probabilists/lampe

Documentation

The documentation is made with Sphinx and Furo and is hosted at lampe.readthedocs.io.

Contributing

If you have a question, an issue or would like to contribute, please read our contributing guidelines.

lampe's People

Contributors

adelau, bkmi, francois-rozet


lampe's Issues

Implement a score-based inference algorithm

Description

Given the recent popularity of score-based generative modeling, it would be great to provide a score-based inference algorithm within LAMPE.

Interface

An NSE (neural score estimation) class, an NSELoss module, and a way to sample from the trained score estimator should be provided. Access to the log-density (through the probability flow ODE) is not mandatory, but would be convenient.
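A rough sketch of what such an interface could look like, mirroring the existing estimator/loss pattern; the class and method signatures below are hypothetical and nothing here exists in LAMPE yet.

import torch
import torch.nn as nn

class NSE(nn.Module):
    r"""Hypothetical neural score estimator s(theta, x, t)."""

    def __init__(self, theta_dim: int, x_dim: int, **kwargs):
        super().__init__()
        ...  # network mapping (theta, x, t) to a score of shape (theta_dim,)

    def forward(self, theta: torch.Tensor, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        ...  # estimated score of the perturbed posterior at diffusion time t

    def sample(self, x: torch.Tensor, shape: torch.Size = ()) -> torch.Tensor:
        ...  # integrate the reverse SDE (or the probability flow ODE) conditioned on x

class NSELoss(nn.Module):
    r"""Hypothetical denoising score matching loss for an NSE estimator."""

    def __init__(self, estimator: NSE):
        super().__init__()
        self.estimator = estimator

    def forward(self, theta: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        ...  # perturb theta with noise at a random time t and regress the score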

References

Implement Balanced Neural Ratio Estimation loss

Description

Implementation of Balanced Neural Ratio Estimation (BNRE) as introduced in "Towards Reliable Simulation-Based Inference with Balanced Neural Ratio Estimation" (Delaunoy et al., 2022).

Implementation

An implementation of the BNRE loss, similar to the existing NRELoss. A possible sketch is given below.
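The sketch assumes the estimator follows the NRE interface (mapping a pair (theta, x) to the log-ratio log r(theta, x)) and adds the balancing criterion of the paper to the standard binary cross-entropy as a weighted penalty; the default weight of 100 follows the paper.

import torch
import torch.nn as nn

class BNRELoss(nn.Module):
    """Sketch of a balanced NRE loss (Delaunoy et al., 2022)."""

    def __init__(self, estimator: nn.Module, lmbda: float = 100.0):
        super().__init__()
        self.estimator = estimator
        self.lmbda = lmbda

    def forward(self, theta: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        theta_prime = theta[torch.randperm(theta.shape[0])]  # break the (theta, x) pairing

        log_r_joint = self.estimator(theta, x)           # pairs from the joint p(theta, x)
        log_r_marginal = self.estimator(theta_prime, x)  # pairs from the product p(theta) p(x)

        # Standard NRE binary cross-entropy
        bce = nn.functional.binary_cross_entropy_with_logits
        loss = (
            bce(log_r_joint, torch.ones_like(log_r_joint))
            + bce(log_r_marginal, torch.zeros_like(log_r_marginal))
        )

        # Balancing penalty: the classifier outputs over joint and marginal pairs should sum to 1
        balance = torch.sigmoid(log_r_joint).mean() + torch.sigmoid(log_r_marginal).mean()

        return loss + self.lmbda * (balance - 1) ** 2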

Alternatives

None

Incorrect grid shape in `utils.gridapply` for one-dimensional space

Description

When the domain is one-dimensional, lampe.utils.gridapply builds a grid of shape (bins,) instead of the expected (bins, 1).

Reproduce

In the following error, mat1 is x and should be of shape (128, 1).

>>> import torch
>>> import lampe
>>> A = torch.randn(1, 3)
>>> f = lambda x: x @ A
>>> domain = torch.zeros(1), torch.ones(1)
>>> lampe.utils.gridapply(f, domain)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/francois/Documents/Git/lampe/lampe/utils.py", line 104, in gridapply
    y = [f(x) for x in grid.split(batch_size)]
  File "/home/francois/Documents/Git/lampe/lampe/utils.py", line 104, in <listcomp>
    y = [f(x) for x in grid.split(batch_size)]
  File "<stdin>", line 1, in <lambda>
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x128 and 1x3)

Expected behavior

The grid should be of shape (bins, 1).

Causes and solution

The lampe.utils.gridapply function uses torch.cartesian_prod, which behaves inconsistently when given a single argument. A reshape should be enough to fix the issue.
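A minimal sketch of the fix, assuming the grid is built from torch.cartesian_prod over per-dimension linspaces as in gridapply:

import torch

lower, upper = torch.zeros(1), torch.ones(1)
bins = 128

levels = [torch.linspace(float(l), float(u), bins) for l, u in zip(lower, upper)]
grid = torch.cartesian_prod(*levels)

# With a single dimension, cartesian_prod returns shape (bins,) instead of (bins, 1);
# reshaping restores the expected (bins, dims) layout in all cases.
grid = grid.reshape(-1, len(levels))

print(grid.shape)  # torch.Size([128, 1])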

Environment

  • LAMPE version: 0.6.1
  • PyTorch version: 1.12.0
  • Python version: 3.9.15
  • OS: Ubuntu 22.10

Differences in batched vs. non-batched FMPE log_prob

Description

When computing the log probability with FMPE's log_prob method, the resulting values depend on the other input elements in the batch. The differences I observed were on the order of the third or fourth decimal place.

In any case, thanks a lot for your work on LAMPE ☺️

Reproduce

Following the example, the two ways to compute log probabilities for a given configuration theta and batch of corresponding simulated results x produce different results:

from itertools import islice

import torch
import torch.nn as nn
import torch.optim as optim
import zuko
from lampe.data import JointLoader
from lampe.inference import FMPE, FMPELoss
from lampe.utils import GDStep
from tqdm import tqdm

LABELS = [r"$\theta_1$", r"$\theta_2$", r"$\theta_3$"]
LOWER = -torch.ones(3)
UPPER = torch.ones(3)

prior = zuko.distributions.BoxUniform(LOWER, UPPER)


def simulator(theta: torch.Tensor) -> torch.Tensor:
    x = torch.stack(
        [
            theta[..., 0] + theta[..., 1] * theta[..., 2],
            theta[..., 0] * theta[..., 1] + theta[..., 2],
        ],
        dim=-1,
    )

    return x + 0.05 * torch.randn_like(x)


theta = prior.sample()
x = simulator(theta)

loader = JointLoader(prior, simulator, batch_size=256, vectorized=True)

estimator = FMPE(3, 2, hidden_features=[64] * 5, activation=nn.ELU)

loss = FMPELoss(estimator)
optimizer = optim.AdamW(estimator.parameters(), lr=1e-3)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, 128)
step = GDStep(optimizer, clip=1.0)  # gradient descent step with gradient clipping

estimator.train()

with tqdm(range(128), unit="epoch") as tq:
    for epoch in tq:
        losses = torch.stack(
            [
                step(loss(theta, x))
                for theta, x in islice(loader, 256)  # 256 batches per epoch
            ]
        )

        tq.set_postfix(loss=losses.mean().item())

        scheduler.step()


theta_star = prior.sample()
X = torch.stack([simulator(theta_star) for _ in range(10)])

estimator.eval()

with torch.no_grad():
    # e.g. [3.1956, 1.8184, 2.4533, 1.6461, 3.0488, 2.5868, 2.7055, 2.7679, 3.3405, 1.5554]
    log_p_one_batch = estimator.flow(X).log_prob(theta_star.repeat(len(X), 1))

    # e.g. [3.1978, 1.8175, 2.4526, 1.6468, 3.0495, 2.5894, 2.7065, 2.7712, 3.3385, 1.5558]
    log_p_individual = [estimator.flow(x).log_prob(theta_star) for x in X]

Expected behavior

I would expect the individual log-probability value for a given theta and x pair to be unaffected by the other entries in the X batch.
This is corroborated by the official implementation, which does not show this behaviour when log_prob_batch is evaluated with different subsets of the batch.

In the above example, I would expect both to e.g. result in [3.1978, 1.8175, 2.4526, 1.6468, 3.0495, 2.5894, 2.7065, 2.7712, 3.3385, 1.5558].
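For instance, using the names from the snippet above, one would expect a check along these lines to pass (the tolerance is arbitrary):

with torch.no_grad():
    log_p_full = estimator.flow(X).log_prob(theta_star.repeat(len(X), 1))
    log_p_half = estimator.flow(X[:5]).log_prob(theta_star.repeat(5, 1))

    # The first five values should not depend on the rest of the batch
    print(torch.allclose(log_p_full[:5], log_p_half, atol=1e-6))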

Causes and solution

I have no clear intuition why this would be the case. I suspected a stochastic influence and that the FreeFormJacobianTransform exact mode might help, but the difference appears to be deterministic, and setting exact=True did not change the result.
I noticed that the LAMPE implementation uses a trigonometric embedding of the time dimension for the vector field computation, whereas the official implementation by the authors does not, but it is not obvious to me that this would explain the difference.

Environment

  • LAMPE version: 0.8.2
  • PyTorch version: 2.3.0
  • Python version: 3.10.13
  • OS: Ubuntu 20.04.6 LTS

Deal with time-dependent or time-series data

Hi,
Thanks for the wonderful toolkit for simulation-based inference. I am learning it and have found it very helpful for my work.
I took a look at the examples in the tutorials and at some of the functions, and found that all examples deal with static, vector-shaped data. I wonder whether the toolkit can handle time-series data, for example data of shape m by n by k, where m is the number of datasets, n the number of time steps, and k the number of features.
Best,

Jice
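One possible approach (a sketch, not taken from the LAMPE documentation and not part of the original question) is to summarize each time series with an embedding network and to pass the fixed-size summary to the posterior estimator as x:

import torch
import torch.nn as nn
from lampe.inference import NPE, NPELoss

m, n, k = 8, 100, 4       # datasets, time steps, features
theta_dim, embed_dim = 3, 32

class GRUEmbedding(nn.Module):
    """Summarizes time series of shape (m, n, k) into vectors of shape (m, embed_dim)."""

    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(k, embed_dim, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, h = self.gru(x)    # h: (1, m, embed_dim)
        return h.squeeze(0)   # (m, embed_dim)

embedding = GRUEmbedding()
estimator = NPE(theta_dim, embed_dim)
loss = NPELoss(estimator)

theta = torch.randn(m, theta_dim)
x = torch.randn(m, n, k)

# Backpropagating through this loss trains the embedding and the estimator jointly
print(loss(theta, embedding(x)))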

Improve the `build` docstrings

The documentation of the build constructor argument in NRE/NPE/NSE specifies

build: Callable[[int, int], nn.Module] = MLP

and

build: The network constructor

In my opinion, this does not really help in understanding how this constructor argument should be used, as there is no mention of what the two ints refer to. It is also not clear what inputs the nn.Module should expect and what outputs it should produce.
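For instance, a hypothetical custom build for NRE might look like the following, under the assumption that the two ints are the input and output feature counts of the network (which is exactly what the docstring should state) and that MLP refers to lampe.nn.MLP:

import torch.nn as nn
from lampe.inference import NRE
from lampe.nn import MLP

def build(in_features: int, out_features: int) -> nn.Module:
    # For NRE, in_features would be theta_dim + x_dim and out_features 1 (the classifier logit)
    return MLP(in_features, out_features, hidden_features=[128] * 3, activation=nn.ELU)

estimator = NRE(3, 2, build=build)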

In the tutorial about embeddings, it could also be helpful to explain how the network produced with build is different from the network used to build an embedding.
