maxbiostat / r0_uncertainty

Some remarks on prior modelling for the basic reproductive number in the Susceptible-Infected-Recovered (SIR) epidemic model

epidemiology uncertainty-analysis sir r0 gamma-ratio-distribution prior-distribution


r0_uncertainty's People

Contributors

fccoelho, lsbastos, marciomacielbastos, maxbiostat


r0_uncertainty's Issues

Pesky warnings

Our current implementations (both 'real' and 'log') emit a large number of warnings. Here are a few:

## Rejecting initial value:
Error evaluating the log probability at the initial value.
Exception: lognormal_lpdf: Location parameter[5] is -nan, but must be finite

or

## Exception: ode_rk45: Failed to integrate to next output time (2) in less than max_num_steps steps

which usually lead to

Informational Message: The current Metropolis proposal is about to be rejected because of the
## Chain 1 Exception: ode_rk45: ode parameters and data[1] is inf, but must be finite!

This slows things down by forcing rejections and wasting target and gradient evaluations.

It would be nice to figure out what is going on so we can do away with these warnings and achieve faster computation.
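One thing worth trying is passing explicit initial values instead of letting Stan initialise uniformly on (-2, 2), which often lands the ODE solver in regions where it fails. A minimal sketch with rstan, where `sir_model`, `sir_data` and the parameter names are placeholders for the repo's actual objects:

```r
library(rstan)

## draw initial values from a plausible region of parameter space so the
## first log-probability evaluation (and the ODE solve inside it) is finite
init_fun <- function() {
  list(beta  = runif(1, 0.5, 1.5),  ## transmission rate
       gamma = runif(1, 0.2, 0.6),  ## recovery rate
       sigma = runif(1, 0.1, 0.5))  ## observation noise
}

fit <- sampling(sir_model, data = sir_data,
                chains = 4, init = init_fun,
                control = list(adapt_delta = 0.95))
```

Adjusting the ODE tolerances and max_num_steps inside the Stan program may be the complementary fix for the integrator warnings.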

correct implementation of moment matching

The current moment-matching code is wrong: it uses k/theta, whereas the correct expression is k * theta (the mean of a Gamma with shape k and scale theta).

I will create a function that does that, and then call that across the scripts/notebooks.
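A minimal sketch of what such a helper could look like (names are illustrative, not the repo's):

```r
## match a Gamma(shape = k, scale = theta) prior to a target mean and variance;
## for this parametrisation mean = k * theta and variance = k * theta^2
gamma_moment_match <- function(mean, variance) {
  theta <- variance / mean    ## scale
  k     <- mean^2 / variance  ## shape
  list(shape = k, scale = theta)
}

## example: a prior with mean 2 and variance 1 gives shape = 4, scale = 0.5
gamma_moment_match(2, 1)
```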

Good prior for (gamma, beta)

I wonder if a bivariate log-normal prior would be a good thing to try. Here's a sample posterior for beta and gamma:
[Figure: sample from the joint posterior of beta and gamma, from inference on the Eyam plague data; code coming to the repo soon.]
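To get a feel for what a bivariate log-normal prior would induce on R0, here is a minimal Monte Carlo sketch (hyperparameters are illustrative; MASS is assumed available):

```r
library(MASS)

set.seed(1)
mu    <- c(0, -1)                             ## prior means of log(beta), log(gamma)
Sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)  ## prior covariance on the log scale
draws <- exp(mvrnorm(n = 1e4, mu = mu, Sigma = Sigma))
colnames(draws) <- c("beta", "gamma")

## induced prior on R0 = beta / gamma; positive correlation shrinks its spread
R0_prior <- draws[, "beta"] / draws[, "gamma"]
quantile(R0_prior, c(0.025, 0.5, 0.975))
```

Since log R0 = log beta - log gamma, the induced prior on R0 is again log-normal, with log-scale variance Sigma[1,1] + Sigma[2,2] - 2*Sigma[1,2].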

Test single ODE versus two ODEs

This analysis is to find out whether using the single-ODE parametrisation of the closed-population SIR leads to a substantial gain/loss in two respects:

(i) Computational: are there differences in running time?
(ii) Statistical: Do we get lower MCSE and higher ESS (n_eff)?

The single ODE implementation is here and the double ODE implementation is here, but we need to make sure EVERYTHING ELSE is identical between the two implementations to ensure comparability.
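For reference, here is a minimal sketch of the single-ODE reduction with deSolve (I am not certain this matches the repo's exact parametrisation): in a closed population, S can be written as a function of the removed compartment, leaving one ODE for R(t).

```r
library(deSolve)

## closed population: S(t) = S0 * exp(-R0 * (Rec(t) - Rec0) / N), I = N - S - Rec
single_ode <- function(t, state, pars) {
  with(as.list(c(state, pars)), {
    S    <- S0 * exp(-R0 * (Rec - Rec0) / N)
    I    <- N - S - Rec
    dRec <- gamma * I
    list(dRec)
  })
}

pars <- c(R0 = 2, gamma = 0.5, N = 1000, S0 = 999, Rec0 = 0)
out  <- ode(y = c(Rec = 0), times = 0:60, func = single_ode, parms = pars)
```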

Test the priors in this example

Code is here and paper is this one.

Main questions:

  • Would changing the priors lead to different inferences for beta and gamma?
  • Considering the expression for R0 (page 9, Appendix of the paper): (i) what is the prior induced distribution and (ii) what is the posterior induced distribution?

Determine which models we are going to explore

Contrary to what you might think, there is a plethora of models out there, with all sorts of likelihood functions. See #10.

All epidemic models whose R0 can be written as a ratio of Gamma variates are interesting in principle. Notice that if, say, R0 = mu/(nu + gamma), that still qualifies, because the sum of two independent Gamma r.v.s with a common rate is again Gamma.

General questions

  • Can we show fitting problems with very diffuse priors?
  • Can we show unreasonable posteriors with very diffuse priors? For both R0 and predicted trajectories?
  • Is there a difference in performance (computational and statistical) between a diffuse Gamma ratio (i.e. diffuse Gammas on the rates) and the corresponding moment-matching log-normal?

Compare choices for the likelihood

This issue deals with the choice of likelihood. Let the data be Y_t and let the solution to the ODE be mu(t).

We can either say that

Y_t ~ negative_binomial(mu(t), sigma)

or

Y_t  ~ lognormal(log(mu(t)), sigma)

We should test (a) if these make a difference in terms of parameter estimates and ESSs and (b) whether one runs faster than the other.
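As a quick sanity check on notation, here is how the two observation models evaluate in R (y, mu_t, phi and sigma below are illustrative placeholders, not objects from the repo):

```r
y     <- c(3, 7, 15, 28, 22, 11)              ## observed incidence
mu_t  <- c(2.8, 7.5, 14.2, 26.9, 23.1, 10.4)  ## ODE solution at observation times
phi   <- 10                                   ## negative binomial overdispersion
sigma <- 0.2                                  ## log-normal observation sd

## negative binomial with mean mu(t) and overdispersion phi
ll_negbin  <- sum(dnbinom(y, mu = mu_t, size = phi, log = TRUE))
## log-normal on the (strictly positive) observations
ll_lognorm <- sum(dlnorm(y, meanlog = log(mu_t), sdlog = sigma, log = TRUE))
```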

Prior sensitivity

Our whole argument hinges on the prior actually mattering. After a very informal search of the literature, I've found many instances of people claiming that "the inferences were robust to the choice of prior". A few possibilities are:

  • people lie;
  • people are idiots: inferences actually do change, but they haven't looked closely enough to notice, since it'd be an inconvenience;
  • many/most likelihoods employed in the literature dominate the posterior.

If that last one is also the case [the first two are a certainty :)], it would be very nice to know why and when the likelihood dominates.

Improvements to simulation code

  • Remove all non-essential solvers (having several is confusing);
  • Move the re-parametrisation of the Gamma and Beta distributions somewhere else (it will be useful when we talk about priors for the parameters -- reparametrising in terms of mean and standard deviation is a great idea -- but not for this step);
  • Remove noise on the parameters beta (transmission rate) and gamma (recovery rate): these should be fixed for a given simulation;
  • Add (multiplicative) noise to the simulated trajectory. R code with something similar can be found here; see also the sketch after this list.
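A minimal sketch of the last item (the trajectory and noise level are illustrative): multiplicative log-normal noise on a simulated trajectory.

```r
set.seed(42)
I_t      <- c(1, 3, 9, 25, 60, 110, 150, 140, 100, 60)  ## placeholder trajectory
sd_noise <- 0.1
## multiplicative noise: observed = true * exp(N(0, sd_noise^2))
I_obs    <- I_t * exp(rnorm(length(I_t), mean = 0, sd = sd_noise))
```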

Include half normal priors on the rates (SIR model)

We have three choices for priors on the rates beta and gamma:

  • Gamma priors -> Gamma-ratio distribution on R_0;
  • Log-normal priors -> log-normal prior on R_0;
  • Half-normal priors -> a non-standard ("weird") distribution on R_0.

I have derived the density for the ratio of half-normals. It ain't pretty, but it's closed-form. What would be nice here would be to implement the Boarding School example with half-normal priors too, as done here.
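For a quick look at the induced prior, the ratio of half-normals can be explored by Monte Carlo (scales below are illustrative):

```r
set.seed(1)
beta  <- abs(rnorm(1e5, mean = 0, sd = 1))  ## half-normal(0, 1) on beta
gamma <- abs(rnorm(1e5, mean = 0, sd = 1))  ## half-normal(0, 1) on gamma
R0    <- beta / gamma

## the heavy right tail of the induced prior is already visible here
quantile(R0, c(0.025, 0.5, 0.975))
```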

Check correction for population size

To increase realism one can correct the prior on R0 to have bounded support [0, N]. This is implemented for the Gamma ratio in the function dgamma.ratio.Ncorrected(), here.

The question is whether this makes any difference in practice.
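A Monte Carlo sketch of what the correction does (hyperparameters and N are illustrative; the repo's dgamma.ratio.Ncorrected() presumably renormalises the density rather than rejecting draws):

```r
set.seed(2)
N     <- 261
beta  <- rgamma(1e5, shape = 2, rate = 1)
gamma <- rgamma(1e5, shape = 2, rate = 1)
R0    <- beta / gamma

R0_trunc <- R0[R0 <= N]  ## Monte Carlo analogue of restricting support to [0, N]
mean(R0 > N)             ## prior mass removed by the correction
```

If that removed mass is tiny, the correction is unlikely to matter in practice.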

Information sources

Here's a disorganised and hopefully ever-growing list of papers/links/etc that might be useful for the project:

https://www.sciencedirect.com/science/article/pii/S1477893918300577?via%3Dihub ## Uniform(1,3) on R0
https://sci-hub.tw/https://doi.org/10.1016/j.epidem.2013.07.001 ## log-normal; also claim not sensitive to prior specification
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4345055/ ## Beta various hyperparameters; mild prior sensitivity
https://www.biostat.washington.edu/sites/default/files/modules/2017_SISMID_11_Lecture_2.pdf ## source for likelihoods
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0045107 ## does likelihood, could adapt for Bayes
https://onlinelibrary.wiley.com/doi/pdf/10.1002/sim.6041 ## interesting paper, MUST CITE
http://rspb.royalsocietypublishing.org/content/283/1830/20160618 ## one of the many analyses of the Eyam plague data Uniform(0, 100) priors... '.__.
https://arxiv.org/pdf/1903.00423.pdf ## Many interesting models, implemented in Stan.
https://academic.oup.com/biostatistics/article/15/1/46/244669 ## Model selection for household models (look at page 48 for priors).

Implement log version of SIR

Exploit the fact that d log(y)/dt = (1/y) dy/dt to reparametrise the ODE systems we use, and see whether this leads to better geometry and/or faster sampling.
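A minimal sketch of the log-scale system with deSolve (the parametrisation below is illustrative):

```r
library(deSolve)

## SIR on the log scale, using d log(y)/dt = (1/y) dy/dt
log_sir <- function(t, state, pars) {
  with(as.list(c(state, pars)), {
    S <- exp(logS); I <- exp(logI)
    dlogS <- -beta * I / N          ## (1/S) dS/dt
    dlogI <-  beta * S / N - gamma  ## (1/I) dI/dt
    list(c(dlogS, dlogI))
  })
}

pars <- c(beta = 1, gamma = 0.5, N = 1000)
out  <- ode(y = c(logS = log(999), logI = log(1)), times = 0:60,
            func = log_sir, parms = pars)
```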

Priors for R0 -- catalogue

Volz & Silveroni (2018): R_0 = beta/gamma, prior: log-normal(0, sd = 1).

Lam & Suchard (2018): R_0 = N*beta/gamma, priors: gamma and beta have a log-normal(0, 100) prior. N = 261.

McKinley et al. (2019): Exp(1), i.e. Gamma(1, 1), priors on all rates.

Grinsztajn et al. (2020) use half-normal priors on beta and gamma. In particular, mean_beta = 2, sd_beta = 1, mean_gamma = 0.4 and sd_gamma = 0.5.

Rupp et al. (2021) use uniform priors on the (log) rates (!!!!). Here's a quote:

We do a full Bayesian analysis for parameter pairs (log α, log β) with a uniform prior. Following Ho, Crawford, et al. (2018) we sample from the joint posterior using a Hamiltonian Monte Carlo (HMC) scheme (Duane et al. 1987; Neal 2011).

check R0 reparametrisation

@fccoelho has provided a simple explanation of how to formulate the classical SIR in terms of R0, here.
It'd be quite nice to check that the maths is correct by running a simulation with the original formulation and the new one and checking that the trajectories match exactly.
@fccoelho if you've already checked, please feel free to close.
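A minimal sketch of such a check with deSolve (the R0 form below simply substitutes beta = R0 * gamma, which may or may not be exactly what the linked write-up does):

```r
library(deSolve)

sir_beta <- function(t, y, p) with(as.list(c(y, p)), {
  list(c(-beta * S * I / N, beta * S * I / N - gamma * I, gamma * I))
})
sir_R0 <- function(t, y, p) with(as.list(c(y, p)), {
  list(c(-R0 * gamma * S * I / N, R0 * gamma * S * I / N - gamma * I, gamma * I))
})

y0    <- c(S = 999, I = 1, R = 0)
times <- 0:60
out1  <- ode(y0, times, sir_beta, c(beta = 1, gamma = 0.5, N = 1000))
out2  <- ode(y0, times, sir_R0,  c(R0 = 2, gamma = 0.5, N = 1000))
max(abs(out1 - out2))  ## should be numerically zero if the maths checks out
```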

Does prior containment work?

Grinsztajn et al. (2021) argue (p. 6217) that some loose bounds for R0 are [1, 10]. I sort of agree. But what I wanna know is this: if I write

Pr(R0 \in [1, 10] | w) >= alpha,

where w are the prior hyperparameters and alpha \in (0, 1) is a prior probability level,
(i) is it easy to set w to achieve a certain alpha? How does that look for each of {gamma, log-normal, half-normal} priors?
(ii) Does this work in the sense of guaranteeing non-degenerate prior predictives and posterior inferences?
One reason to believe (ii) is shaky is that for (certain) gamma priors and the half-normal priors, the induced prior on R0 is heavy-tailed. Thus it might still assign non-trivial mass to intervals in the vicinity of 100, say.
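For the log-normal case, question (i) is easy to probe numerically; a minimal sketch (meanlog fixed at log(3) purely for illustration):

```r
## prior probability that R0 lies in [1, 10], as a function of the log-scale sd
containment <- function(sdlog, meanlog = log(3)) {
  plnorm(10, meanlog, sdlog) - plnorm(1, meanlog, sdlog)
}

sds <- seq(0.1, 2, by = 0.1)
round(setNames(sapply(sds, containment), sds), 2)
## pick the largest sd (most diffuse prior) that still achieves the target alpha
```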

Find a "MWE" with Stan

We need something to start with that

  • simulates some data under a known ground truth;
  • fits the model to the synthetic data and returns posteriors for the parameters;
  • maybe also returns predictions (e.g. a predictive quantity of interest is the time of the epidemic peak).

Here's a start

Posterior predictive checks

One of the things we should be doing when comparing priors is:

  • Prior predictive checks: what sort of trajectories I(t) [or R(t)] does a particular prior on R0, or on beta and gamma, induce;
    and
  • Posterior predictive checks: sure, a prior might change the posterior slightly or even a lot, but what is the effect on the predicted quantities of the model, such as the time to peak incidence or the trajectory R(t)?

Experiment 1: simple SIR

The goal of this experiment is to answer the following question:

  • a) How do the priors on gamma and beta affect inferences for R0?

To answer a) we could:

  • Run the two ODE model parametrised in terms of gamma and beta:

    • Informative log-normal priors;
    • "Uninformative" Gamma(1, 1) priors;
    • "Uninformative" half-normal(0, 1) priors;
    • "Uninformative" log-normal priors: mu = 0 and sd =1, 10, 100
  • Run the two ODE model parametrised in terms of gamma and R0 with

    • Informative Gamma priors on both gamma and R0;
    • Informative half-normal priors on both gamma and R0;
    • Informative log-normal priors on both gamma and R0;
    • An informative Gamma prior on R_0 and an uninformative Gamma prior on gamma.
    • An informative Gamma prior on R_0 and an uninformative half-normal prior on gamma.
    • An informative log-normal prior on R_0 and an uninformative log-normal prior on gamma.
    • An informative log-normal prior on R_0 truncated at 1 (epidemic prior) and an uninformative prior on gamma.

Quantities to track

  • mean, median and BCI;
  • Divergences;
  • ESS;
  • Run time.

Q: Why use different prior families? A: to assess tail effects.
Obs: we should try to make all of the "informative" priors moment-matched. E.g. if we pick a Gamma(2, 1) prior for R0, then the informative half-normal prior would have to be a half-normal(m0, s0), with m0 and s0 chosen so as to have the same first two moments (a sketch follows below).
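A minimal sketch of matching the half-normal (treating "half-normal(m0, s0)" as the folded normal |N(m0, s0^2)|, which is an assumption on my part):

```r
## first two moments of |X| with X ~ N(m, s^2) (folded normal)
folded_moments <- function(m, s) {
  mu <- s * sqrt(2 / pi) * exp(-m^2 / (2 * s^2)) + m * (1 - 2 * pnorm(-m / s))
  c(mean = mu, var = m^2 + s^2 - mu^2)
}

## find (m0, s0) matching a target mean and variance; s is optimised on the log scale
match_folded <- function(target_mean, target_var) {
  obj <- function(par) sum((folded_moments(par[1], exp(par[2])) -
                              c(target_mean, target_var))^2)
  fit <- optim(c(target_mean, log(sqrt(target_var))), obj)
  c(m0 = fit$par[1], s0 = exp(fit$par[2]))
}

## example: match the mean (2) and variance (2) of a Gamma(2, 1) prior on R0
match_folded(2, 2)
```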

Bits and pieces

  • Write down the moments (E[R_0] and Var(R_0)) for the (i) log-normal and (ii) half-normal-ratio distributions;
  • Write down how to do moment-matching for the log-normal (see the sketch after this list);
  • Port this script to use cmdstanr.
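For the second item, the log-normal moment-matching is closed-form; a minimal sketch (helper name is illustrative):

```r
## find (meanlog, sdlog) of a log-normal with a given mean and variance
lognormal_moment_match <- function(mean, variance) {
  sigma2 <- log(1 + variance / mean^2)
  list(meanlog = log(mean) - sigma2 / 2, sdlog = sqrt(sigma2))
}

## example: matching the mean (2) and variance (2) of a Gamma(2, 1) prior on R0
lognormal_moment_match(2, 2)
```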

check that (log) concavity "result" is useful

I wrote it mainly because I thought it was important to show where the density is concave. But then I realised that the bound I derived is always larger than the mode (and zero when the mode is zero).

The main reason to care about concavity is precisely to ensure the distribution is (strongly) unimodal, so I'm unsure the result in \ref{eq:logconcavity} is useful.

Experiment 2: single ODE parametrisation

  • b) Does using a single ODE parametrisation help?

To answer b) we could

  • Run the single ODE model with the same (informative and uninformative) priors on R0 as in #37 and then record the quantities of interest:

Quantities to track

  • Mean, median and BCI;
  • Divergences;
  • ESS;
  • Run time.

Evaluate the performance of Gamma and log-normal priors on simulated data

We could do an experiment where we fix beta and R0 (and thus gamma is known), as well as S0 and the log-normal likelihood variance sigma_y^2. We then simulate an SIR trajectory from this process and fit the model using (i) Gamma priors on beta and gamma and (ii) moment-matched log-normal priors on beta and gamma. We then look at the recovery of R0, both in terms of the squared error of the posterior mean and the coverage of the Bayesian credibility intervals (BCI).

Check the functions generate_trajectory_* I used to generate the prior predictives in the notebooks. They can be easily modified to generate data from fixed parameters.
