
Approximate Bayesian Computation algorithm based on simulated annealing

License: GNU General Public License v3.0

Topics: approximate-bayesian-computation, approximate-bayesian-inference, bayesian-inference, mcmc-sampling


SimulatedAnnealingABC


Installation

The package is under development and not yet officially registered. You can install it from GitHub with

] add https://github.com/Eawag-SIAM/SimulatedAnnealingABC.jl

Note: Julia 1.9 or newer is required.

Usage

Here we show how to run the SimulatedAnnealingABC (SABC) algorithm to infer model parameters given some observed dataset.

Data (observations)

Let us start off by generating a synthetic dataset, our "observations". Consider for instance a normally distributed random sample.

using Random
using Distributions

Random.seed!(1111)					  
true_μ = 3                                                # "true" μ - to be inferred
true_σ = 15                                               # "true" σ - to be inferred
num_samples = 100                                         # dataset size
y_obs = rand(Normal(true_μ, true_σ), num_samples)         # our "observations"

Model

Now, we need to define a data-generating model. In this example, quite obviously, we opt for a Gaussian model $\mathcal{N}(\mu,\sigma)$. The goal is to infer a posterior distribution for the parameters $\mu$ and $\sigma$ given the dataset y_obs.

# Data-generating model
# θ = [μ,σ] - parameter set
function model(θ)
	y = rand(Normal(θ[1],θ[2]), num_samples)
	return y
end

For a given parameter set $\theta = \left( \mu, \sigma \right)$, the model function generates a dataset of size num_samples.

Prior for model parameters

We also need to define a prior distribution for the model parameters that are to be inferred. In this case we choose a Uniform distribution for both parameters.

# Prior for the parameters to be inferred 
μ_min = -10; μ_max = 20    # parameter 1: mean
σ_min = 0; σ_max = 25      # parameter 2: std
prior = product_distribution(Uniform(μ_min, μ_max), Uniform(σ_min, σ_max))
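
A single draw from this prior yields a candidate parameter vector θ = [μ, σ]:

θ_test = rand(prior)    # 2-element Vector{Float64}: [μ, σ]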

Summary statistics

In general, the core of ABC algorithms consists in using the model as a forward simulator to generate a large number of synthetic datasets for different sets of model parameters. These parameter sets are then accepted or rejected by measuring the discrepancy (or "distance", according to some metric) between the observations and the model-generated data. However, inference based on a direct comparison of observations and model-generated data is not always feasible, e.g., for high-dimensional datasets. In such cases, the discrepancy is measured by computing a distance between relevant summary statistics extracted from the datasets. In our toy example, we use the empirical mean and standard deviation of the data as summary statistics.

function sum_stats(data)
	stat1 = mean(data)
	stat2 = std(data) 
	return [stat1, stat2]
end

The function sum_stats returns an array of size n_stats, where n_stats denotes the number of summary statistics (2 in our case).

n_stats = size(sum_stats(y_obs), 1)

We can now reduce our "observations" to a (low-dimensional) set of summary statistics.

ss_obs = sum_stats(y_obs)

Re-definition of the model

The inference algorithm will compare observations and model outputs in terms of summary statistics. Therefore, we can reformulate the definition of the data-generating model in such a way that the data are directly compressed into a corresponding set of summary statistics.

# Data-generating model + reduction to summary statistics
# θ = [μ,σ] - parameter set
function model(θ)
	y = rand(Normal(θ[1],θ[2]), num_samples)
	return sum_stats(y)
end

Now, the model function returns a low-dimensional array of length n_stats.

Distance function

Lastly, we need to define a distance function. The distance function MUST return an array containing the INDIVIDUAL distances, one for EACH summary statistic, between observations and model outputs. In other words, given n_stats summary statistics, the distance function will ALWAYS return an array of length n_stats.

using Distances

function f_dist(θ)
	ss = model(θ)    # data-generating model (returns summary statistics)
	rho = [euclidean(ss[ix], ss_obs[ix]) for ix in 1:n_stats]   # distance
	return rho
end
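
As a quick sanity check, calling f_dist at the "true" parameters returns one distance per summary statistic:

rho_test = f_dist([true_μ, true_σ])    # 2-element vector of distances
length(rho_test) == n_stats            # true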

The algorithm will keep track of the evolution of EACH individual distance (one for EACH summary statistic), INDEPENDENTLY of the type of SABC algorithm used in the inference (see below for more details on the available algorithm types).

SABC inference

The following objects have been defined up to this point:

  1. sum_stats (reduces a high-dimensional dataset to a low-dimensional set of statistics)
  2. model (given model parameters θ, generates a dataset and returns summary statistics)
  3. f_dist (computes the distance between individual summary statistics)
  4. prior (prior distribution for the parameters to be inferred)

We are now ready to run the SABC inference!

We decide to use 1000 particles and start with 1 million particle updates (the algorithm allows us to continue the inference afterwards).

np = 1000       # number of particles
ns = 1_000_000  # number of particle updates

Note that the total number of population updates is given by n_pop_updates = div(ns, np). Note also that the generation of the initial prior sample is counted as a population update step. Therefore, the total number of population updates after initialization will effectively be n_pop_updates - 1.
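
For the numbers chosen above:

n_pop_updates = div(ns, np)    # 1_000_000 ÷ 1000 = 1000 population updates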

The sabc function requires two inputs:

  1. the distance function f_dist, which contains the data generating model,
  2. and the prior.

Additional arguments that can be passed to the function are the following.

  • n_particles: number of particles used in the inference.
  • n_simulation: total number of particle updates.
  • v: annealing speed. Recommended values are:
    • v=1 for single-ϵ algorithms (type "single" and "hybrid", default value).
    • v=10 for multi-ϵ (type "multi").
  • type: to choose the specific SABC algorithm.
    • type="single" is single-ϵ.
    • type="multi" is multi-ϵ.
    • type="hybrid" is the hybrid multi-u-single-ϵ algorithm.
  • checkpoint_display: number of population updates between progress reports. Default value is 100. For a long inference, a value of 500 or more is recommended (to avoid unnecessarily lengthy output files). Note that n population updates correspond to n*n_particles particle updates.

The following arguments rarely require adjustments (a call with all keyword arguments spelled out is shown after the list).

  • resample: number of successful particle updates between consecutive particle resampling events. Default value is 2*n_particles.
  • β: tuning hyperparameter governing the equilibration/mixing speed, used to rescale the jump covariance matrix. Default value is β=0.8.
  • δ: tuning hyperparameter governing the size of population resampling. Default value is δ=0.1.
  • checkpoint_history: number of population updates between snapshots of the inference state stored for postprocessing. Default value is 1.
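
For illustration, here is such a call; the values simply restate the defaults listed above (a sketch, with resample at its default 2*n_particles):

out = sabc(f_dist, prior;
	n_particles = np, n_simulation = ns,
	v = 1.0, type = "single",
	resample = 2*np, β = 0.8, δ = 0.1,
	checkpoint_history = 1, checkpoint_display = 100)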

We will run the three different algorithms (type = "single","multi","hybrid") and compare their outputs.

using SimulatedAnnealingABC

# --- TYPE 1 -> single-ϵ ---
out_single_eps = sabc(f_dist, prior; n_particles = np, n_simulation = ns, v = 1.0, type = "single")
display(out_single_eps)

# --- TYPE 2 -> multi-ϵ ---
out_multi_eps = sabc(f_dist, prior; n_particles = np, n_simulation = ns, v = 10.0, type = "multi")
display(out_multi_eps)

# --- TYPE 3 -> hybrid multi-u-single-ϵ ---
out_hybrid = sabc(f_dist, prior; n_particles = np, n_simulation = ns, v = 1.0, type = "hybrid")
display(out_hybrid)

Get SABC output

We can now extract posterior populations as well as trajectories for ϵ's, ρ's (distances) and u's ("transformed" distances).

pop_singeps = hcat(out_single_eps.population...)
eps_singeps = hcat(out_single_eps.state.ϵ_history...)
rho_singeps = hcat(out_single_eps.state.ρ_history...)
u_singeps = hcat(out_single_eps.state.u_history...)

pop_multeps = hcat(out_multi_eps.population...)
eps_multeps = hcat(out_multi_eps.state.ϵ_history...)
rho_multeps = hcat(out_multi_eps.state.ρ_history...)
u_multeps = hcat(out_multi_eps.state.u_history...)

pop_hybrid = hcat(out_hybrid.population...)
eps_hybrid = hcat(out_hybrid.state.ϵ_history...)
rho_hybrid = hcat(out_hybrid.state.ρ_history...)
u_hybrid = hcat(out_hybrid.state.u_history...)

The output arrays have the following dimensions (a quick shape check is shown after the remarks below):

  • populations: (n_params, np), where n_params is the number of inferred parameters (here n_params = 2)
  • ϵ trajectories:
    • (1, n_pop_updates) for single-ϵ algorithms (type "single" and "hybrid")
    • (n_stats, n_pop_updates) for multi-ϵ (type "multi")
  • ρ trajectories: (n_stats, n_pop_updates + 1)
  • u trajectories:
    • (1, n_pop_updates) for type "single"
    • (n_stats, n_pop_updates) for types "multi" and "hybrid"

Remarks about ρ trajectories:

  • ρ trajectories are always stored as individual trajectories for each summary statistic, even for the algorithm of type "single", where a single ρ is effectively used for the inference.
  • Moreover, note that each trajectory has length n_pop_updates + 1. This is because the distances of the prior sample, before rescaling, are also stored, as the first element of the ρ array.
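
For the single-ϵ run above (np = 1000, n_pop_updates = div(ns, np)), a quick shape check based on the dimensions listed above would look like this:

size(pop_singeps)    # (2, 1000) -> (n_params, np)
size(eps_singeps)    # (1, n_pop_updates)
size(rho_singeps)    # (2, n_pop_updates + 1), prior-sample distances included
size(u_singeps)      # (1, n_pop_updates)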

To continue the inference

After analysing the output, we may decide to update the current population with another 1_000_000 simulations. This can easily be done with the function update_population!. As an example, we show how to continue the inference for the single-ϵ algorithm (type "single") by running another ns simulations. Note that type and v should be consistent with the algorithm used to generate the first batch.

out_single_eps_2 = update_population!(out_single_eps, f_dist, prior; n_simulation = ns, v = 1.0, type = "single")

One may also want to store the output and continue the inference at a later time. One way to save SABC output is to use serialize. Here is an example.

using Serialization

# replace [output path] with appropriate path
serialize("[output path]/sabc_result_single_eps", out_single_eps) 

To continue the inference one can proceed as follows.

# replace [output path] with appropriate path
out_single_eps_1 = deserialize("[output path]/sabc_result_single_eps")
out_single_eps_2 = update_population!(out_single_eps_1, f_dist, prior; n_simulation = 1_000_000, v = 1.0, type = "single")

Results

A sample from the true posterior can be easily obtained using the AffineInvariantMCMC package as follows.

using AffineInvariantMCMC

# log-likelihood (up to an additive constant)
llhood = θ -> begin
	μ, σ = θ;
	return -num_samples*log(σ) - sum((y_obs .- μ).^2)/(2*σ^2)
end

# log-prior
lprior = θ -> begin
	μ, σ  = θ;
	if (μ_min <= μ <= μ_max) && (σ_min <= σ <= σ_max)
		return 0.0
	else
		return -Inf
	end
end

# log-posterior
lprob = θ -> begin
	μ, σ  = θ;
	lp = lprior(theta)
	if isinf(lp) 
		return -Inf
	else
		return lp + llhood(theta)
	end
end

# Generate 1000 posterior samples
numdims = 2
numwalkers = 10
thinning = 10
numsamples_perwalker = 1000
burnin = 1000;

rng = MersenneTwister(11);
theta0 = Array{Float64}(undef, numdims, numwalkers);
theta0[1, :] = rand(rng, Uniform(μ_min, μ_max), numwalkers); 
theta0[2, :] = rand(rng, Uniform(σ_min, σ_max), numwalkers); 

chain, llhoodvals = AffineInvariantMCMC.sample(lprob, numwalkers, theta0, burnin, 1);
chain, llhoodvals = AffineInvariantMCMC.sample(lprob, numwalkers, chain[:, :, end], numsamples_perwalker, thinning);
flatchain, flatllhoodvals = AffineInvariantMCMC.flattenmcmcarray(chain, llhoodvals)

# plot
using Plots
scatter(flatchain[2,:], flatchain[1,:], markercolor = :yellow, label = "true posterior",
	xlims = (σ_min, σ_max), ylims = (μ_min, μ_max), xlabel = "std", ylabel = "mean")

[figure: scatter plot of the true posterior sample]

And here are the posterior samples after 2 million particle updates obtained with the three different SABC algorithms, compared to the true posterior.

[figure: posterior samples from the three SABC algorithms compared to the true posterior]

References

Albert, C., Künsch, H. R., & Scheidegger, A. (2015). A simulated annealing approach to approximate Bayes computations. Statistics and Computing, 25(6), 1217–1232.


Issues

Extension to multiple summary statistics

After some discussion, Carlo and I came to the following preliminary conclusions.

We get multiple distances (rho) from f_dist. We change the initialization as follows:

  1. get n_particle prior samples and compute distances
  2. learn the multivariate CDF of the distances -> this function maps a vector of summary distances to the scalar number u (a minimal sketch is given after the lists below)
  3. Choose epsilon_init (as epsilon_init now relates to u, it should have an easy interpretation)
  4. Keep sampling with acceptance probability exp(- u ./ epsilon_init) until we have n_particle samples.
  5. (optional) relearn the multivariate CDF from the larger sample.

We need to think about:

  • How to efficiently compute the multivariate CDF? See e.g. this old paper, this newer paper, or here. Ideally we find an existing implementation for this.
  • Should we use the empirical cdf (steps) or some kind of interpolation?
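
As a minimal, purely illustrative sketch of step 2 (build_empirical_cdf is a hypothetical helper, not part of the package), the empirical multivariate CDF of the prior-sample distances could be written as:

# Hypothetical sketch: rhos is an (n_stats, n_particles) matrix of prior-sample distances.
# The returned closure maps a distance vector rho to the scalar
# u = fraction of samples whose distances are componentwise <= rho.
function build_empirical_cdf(rhos::AbstractMatrix)
    n = size(rhos, 2)
    return rho -> count(j -> all(rhos[:, j] .<= rho), 1:n) / n
end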

initial epsilon

Initialization

  1. Sample from prior
  2. Do a resampling step (this corresponds to setting \epsilon_init = \delta, still quite large but better than \epsilon_init = \infty).
  3. Invoke the update_epsilon function

Automatically find `eps_init` and allow for multiple summary statistics

Get rid of eps_init:

  1. Sample n_particles from prior
  2. Calculate for each summary statistic: epsilon_init = mean(distance) (needs to be tested; maybe a quantile is better; see the sketch after this list)
  3. compute acceptance prob: w = exp(-sum(distances ./ epsilon_inits))
  4. Build population using prior population and weights w
  5. Continue sampling until population is full
  6. Learn cdfs from prior sample for each summary stat
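
A minimal sketch of steps 2 and 3 (illustrative only; rhos is again assumed to be an (n_stats, n_particles) matrix of prior-sample distances):

using Statistics

epsilon_inits = vec(mean(rhos, dims = 2))    # step 2: mean distance per summary statistic
w = [exp(-sum(rhos[:, j] ./ epsilon_inits)) for j in 1:size(rhos, 2)]    # step 3: acceptance probabilities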

Return state (for restart)

posterior population (parameters and u's), cdf, epsilon, Sigma

Define a function update_step!(state) that is called in the main loop.

Parallelization

  1. Update every particle in population in parallel
  2. Updates of epsilons and Sigma only after each population update
  3. Each thread counts number of accepted updates
  4. If the total number of accepted updates after one population update is larger than x% of n_particles, update epsilon and Sigma
  5. If the total number of accepted updates after several population updates is larger than n_particles, do an importance sampling step

Multi-threading with `FLoops.jl`

FLoops.jl could be a nice option for implementing multi-threading. It allows us to switch between serial and parallel execution.

using FLoops

function bar(n, executor)
    X = zeros(n)
    @floop executor for i in 1:n
        sleep(0.1)
        if rand() < 0.3
            X[i] = X[i] + 1
            @reduce(k = 0 + 1)   # count accepted iterations; explicit init keeps k defined even if none occur
        end
    end
    (X, k)
end

x, k = bar(20, SequentialEx())
sum(x)
x, k = bar(20, ThreadedEx())
sum(x)

@time bar(20, SequentialEx())
@time bar(20, ThreadedEx())
