maxbiostat / r0_uncertainty

Some remarks on prior modelling for the basic reproductive number in the Susceptible-Infected-Recovered (SIR) epidemic model

epidemiology uncertainty-analysis sir r0 gamma-ratio-distribution prior-distribution


r0_uncertainty's People

Contributors

fccoelho, lsbastos, marciomacielbastos, maxbiostat


r0_uncertainty's Issues

Pesky warnings

Our current implementations (both 'real' and 'log') emit a large number of warnings. Here are a few:

## Rejecting initial value:
Error evaluating the log probability at the initial value.
Exception: lognormal_lpdf: Location parameter[5] is -nan, but must be finite

or

## Exception: ode_rk45: Failed to integrate to next output time (2) in less than max_num_steps steps

which usually lead to

Informational Message: The current Metropolis proposal is about to be rejected because of the
## Chain 1 Exception: ode_rk45: ode parameters and data[1] is inf, but must be finite!

This slows things down by forcing rejections and wasting target and gradient evaluations.

It would be nice to figure out what is going on so we can do away with these warnings and achieve faster computation.
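One thing worth trying is passing explicit initial values instead of letting Stan initialise uniformly on (-2, 2), which often lands the ODE solver in regions where it fails. A minimal sketch with rstan, where `sir_model`, `sir_data` and the parameter names are placeholders for the repo's actual objects:

```r
library(rstan)

## draw initial values from a plausible region of parameter space so the
## first log-probability evaluation (and the ODE solve inside it) is finite
init_fun <- function() {
  list(beta  = runif(1, 0.5, 1.5),  ## transmission rate
       gamma = runif(1, 0.2, 0.6),  ## recovery rate
       sigma = runif(1, 0.1, 0.5))  ## observation noise
}

fit <- sampling(sir_model, data = sir_data,
                chains = 4, init = init_fun,
                control = list(adapt_delta = 0.95))
```

Adjusting the ODE tolerances and max_num_steps inside the Stan program may be the complementary fix for the integrator warnings.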

correct implementation of moment matching

The current moment-matching code is wrong: it uses k/theta, whereas the correct expression is k * theta (the mean of a Gamma with shape k and scale theta).

I will create a function that does that, and then call that across the scripts/notebooks.
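A minimal sketch of what such a helper could look like (names are illustrative, not the repo's):

```r
## match a Gamma(shape = k, scale = theta) prior to a target mean and variance;
## for this parametrisation mean = k * theta and variance = k * theta^2
gamma_moment_match <- function(mean, variance) {
  theta <- variance / mean    ## scale
  k     <- mean^2 / variance  ## shape
  list(shape = k, scale = theta)
}

## example: a prior with mean 2 and variance 1 gives shape = 4, scale = 0.5
gamma_moment_match(2, 1)
```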

Good prior for (gamma, beta)

I wonder if a bivariate log-normal prior would be a good thing to try. Here's a sample posterior for beta and gamma:
[Figure: sample from the joint posterior of beta and gamma, from inference on the Eyam plague data; code coming to the repo soon.]
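To get a feel for what a bivariate log-normal prior would induce on R0, here is a minimal Monte Carlo sketch (hyperparameters are illustrative; MASS is assumed available):

```r
library(MASS)

set.seed(1)
mu    <- c(0, -1)                             ## prior means of log(beta), log(gamma)
Sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)  ## prior covariance on the log scale
draws <- exp(mvrnorm(n = 1e4, mu = mu, Sigma = Sigma))
colnames(draws) <- c("beta", "gamma")

## induced prior on R0 = beta / gamma; positive correlation shrinks its spread
R0_prior <- draws[, "beta"] / draws[, "gamma"]
quantile(R0_prior, c(0.025, 0.5, 0.975))
```

Since log R0 = log beta - log gamma, the induced prior on R0 is again log-normal, with log-scale variance Sigma[1,1] + Sigma[2,2] - 2*Sigma[1,2].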

Test single ODE versus two ODEs

This analysis is to find out whether using the single-ODE parametrisation of the closed-population SIR leads to a substantial gain/loss in two respects:

(i) Computational: are there differences in running time?
(ii) Statistical: Do we get lower MCSE and higher ESS (n_eff)?

The single ODE implementation is here and the double ODE implementation is here, but we need to make sure EVERYTHING ELSE is identical between the two implementations to ensure comparability.
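For reference, here is a minimal sketch of the single-ODE reduction with deSolve (I am not certain this matches the repo's exact parametrisation): in a closed population, S can be written as a function of the removed compartment, leaving one ODE for R(t).

```r
library(deSolve)

## closed population: S(t) = S0 * exp(-R0 * (Rec(t) - Rec0) / N), I = N - S - Rec
single_ode <- function(t, state, pars) {
  with(as.list(c(state, pars)), {
    S    <- S0 * exp(-R0 * (Rec - Rec0) / N)
    I    <- N - S - Rec
    dRec <- gamma * I
    list(dRec)
  })
}

pars <- c(R0 = 2, gamma = 0.5, N = 1000, S0 = 999, Rec0 = 0)
out  <- ode(y = c(Rec = 0), times = 0:60, func = single_ode, parms = pars)
```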

Test the priors in this example

Code is here and paper is this one.

Main questions:

  • Would changing the priors lead to different inferences for beta and gamma?
  • Considering the expression for R0 (page 9, Appendix of the paper): (i) what is the prior induced distribution and (ii) what is the posterior induced distribution?

Determine which models we are going to explore

Contrary to what you might think, there is a plethora of models out there, with all sorts of likelihood functions. See #10.

All epidemic models whose R0 can be written as a ratio of Gamma variates are interesting in principle. Notice that if, say, R0 = mu/(nu + gamma), that still qualifies, because the sum of two independent Gamma r.v.s with a common rate is again Gamma.

General questions

  • Can we show fitting problems with very diffuse priors?
  • Can we show unreasonable posteriors with very diffuse priors? For both R0 and predicted trajectories?
  • Is there a difference in performance (computational and statistical) between a diffuse Gamma ratio (i.e. diffuse Gammas on the rates) and the corresponding moment-matching log-normal?

Compare choices for the likelihood

This issue deals with the choice of likelihood. Let the data be Y_t and let the solution to the ODE be mu(t).

We can either say that

Y_t ~ negative_binomial(mu(t), sigma)

or

Y_t  ~ lognormal(log(mu(t)), sigma)

We should test (a) if these make a difference in terms of parameter estimates and ESSs and (b) whether one runs faster than the other.
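As a quick sanity check on notation, here is how the two observation models evaluate in R (y, mu_t, phi and sigma below are illustrative placeholders, not objects from the repo):

```r
y     <- c(3, 7, 15, 28, 22, 11)              ## observed incidence
mu_t  <- c(2.8, 7.5, 14.2, 26.9, 23.1, 10.4)  ## ODE solution at observation times
phi   <- 10                                   ## negative binomial overdispersion
sigma <- 0.2                                  ## log-normal observation sd

## negative binomial with mean mu(t) and overdispersion phi
ll_negbin  <- sum(dnbinom(y, mu = mu_t, size = phi, log = TRUE))
## log-normal on the (strictly positive) observations
ll_lognorm <- sum(dlnorm(y, meanlog = log(mu_t), sdlog = sigma, log = TRUE))
```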

Prior sensitivity

Our whole argument hinges on the prior actually mattering. After a very informal search of the literature, I've found many instances of people claiming that "the inferences were robust to the choice of prior". A few possibilities are:

  • people lie;
  • people are idiots: inferences actually do change, but they haven't looked closely enough to notice, since it'd be an inconvenience;
  • many/most likelihoods employed in the literature dominate the posterior.

If that last one is also the case [the first two are a certainty :)], it would be very nice to know why and when the likelihood dominates.

Improvements to simulation code

  • Remove all non-essential solvers (having several is confusing);
  • Move the re-parametrisation of the Gamma and Beta distributions somewhere else (it will be useful when we talk about priors for the parameters -- reparametrising in terms of mean and standard deviation is a great idea -- but not for this step);
  • Remove noise on the parameters beta (transmission rate) and gamma (recovery rate): these should be fixed for a given simulation;
  • Add (multiplicative) noise to the simulated trajectory. R code with something similar can be found here; see also the sketch after this list.
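A minimal sketch of the last item (the trajectory and noise level are illustrative): multiplicative log-normal noise on a simulated trajectory.

```r
set.seed(42)
I_t      <- c(1, 3, 9, 25, 60, 110, 150, 140, 100, 60)  ## placeholder trajectory
sd_noise <- 0.1
## multiplicative noise: observed = true * exp(N(0, sd_noise^2))
I_obs    <- I_t * exp(rnorm(length(I_t), mean = 0, sd = sd_noise))
```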

Include half normal priors on the rates (SIR model)

We have three choices for priors on the rates beta and gamma:

  • Gamma priors -> Gamma-ratio distribution on R_0;
  • Log-normal priors -> log-normal prior on R_0;
  • Half-normal priors -> a non-standard ("weird") distribution on R_0.

I have derived the density for the ratio of half-normals. It ain't pretty, but it's closed-form. What would be nice here would be to implement the Boarding School example with half-normal priors too, as done here.
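For a quick look at the induced prior, the ratio of half-normals can be explored by Monte Carlo (scales below are illustrative):

```r
set.seed(1)
beta  <- abs(rnorm(1e5, mean = 0, sd = 1))  ## half-normal(0, 1) on beta
gamma <- abs(rnorm(1e5, mean = 0, sd = 1))  ## half-normal(0, 1) on gamma
R0    <- beta / gamma

## the heavy right tail of the induced prior is already visible here
quantile(R0, c(0.025, 0.5, 0.975))
```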

Check correction for population size

To increase realism one can correct the prior on R0 to have bounded support [0, N]. This is implemented for the Gamma ratio in the function dgamma.ratio.Ncorrected(), here.

The question is whether this makes any difference in practice.
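A Monte Carlo sketch of what the correction does (hyperparameters and N are illustrative; the repo's dgamma.ratio.Ncorrected() presumably renormalises the density rather than rejecting draws):

```r
set.seed(2)
N     <- 261
beta  <- rgamma(1e5, shape = 2, rate = 1)
gamma <- rgamma(1e5, shape = 2, rate = 1)
R0    <- beta / gamma

R0_trunc <- R0[R0 <= N]  ## Monte Carlo analogue of restricting support to [0, N]
mean(R0 > N)             ## prior mass removed by the correction
```

If that removed mass is tiny, the correction is unlikely to matter in practice.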

Information sources

Here's a disorganised and hopefully ever-growing list of papers/links/etc that might be useful for the project:

https://www.sciencedirect.com/science/article/pii/S1477893918300577?via%3Dihub ## Uniform(1,3) on R0
https://sci-hub.tw/https://doi.org/10.1016/j.epidem.2013.07.001 ## log-normal; also claim not sensitive to prior specification
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4345055/ ## Beta various hyperparameters; mild prior sensitivity
https://www.biostat.washington.edu/sites/default/files/modules/2017_SISMID_11_Lecture_2.pdf ## source for likelihoods
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0045107 ## does likelihood, could adapt for Bayes
https://onlinelibrary.wiley.com/doi/pdf/10.1002/sim.6041 ## interesting paper, MUST CITE
http://rspb.royalsocietypublishing.org/content/283/1830/20160618 ## one of the many analyses of the Eyam plague data Uniform(0, 100) priors... '.__.
https://arxiv.org/pdf/1903.00423.pdf ## Many interesting models, implemented in Stan.
https://academic.oup.com/biostatistics/article/15/1/46/244669 ## Model selection for household models (look at page 48 for priors).

Implement log version of SIR

Exploit the fact that d log(y)/dt = (1/y) dy/dt to reparametrise the ODE systems we use, and see whether this leads to better geometry and/or faster sampling.
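A minimal sketch of the log-scale system with deSolve (the parametrisation below is illustrative):

```r
library(deSolve)

## SIR on the log scale, using d log(y)/dt = (1/y) dy/dt
log_sir <- function(t, state, pars) {
  with(as.list(c(state, pars)), {
    S <- exp(logS); I <- exp(logI)
    dlogS <- -beta * I / N          ## (1/S) dS/dt
    dlogI <-  beta * S / N - gamma  ## (1/I) dI/dt
    list(c(dlogS, dlogI))
  })
}

pars <- c(beta = 1, gamma = 0.5, N = 1000)
out  <- ode(y = c(logS = log(999), logI = log(1)), times = 0:60,
            func = log_sir, parms = pars)
```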

Priors for R0 -- catalogue

Volz & Silveroni (2018): R_0 = beta/gamma, prior: log-normal(0, sd = 1).

Lam & Suchard (2018): R_0 = N*beta/gamma, priors: gamma and beta have a log-normal(0, 100) prior. N = 261.

McKinley et al. (2019): Exp(1), i.e. Gamma(1, 1), priors on all rates.

Grinsztajn et al. (2020) use half-normal priors on beta and gamma. In particular, mean_beta = 2, sd_beta = 1, mean_gamma = 0.4 and sd_gamma = 0.5.

Rupp et al. (2021) use uniform priors on the (log) rates (!!!!). Here's a quote:

We do a full Bayesian analysis for parameter pairs (log α, log β) with a uniform prior. Following Ho, Crawford, et al. (2018) we sample from the joint posterior using a Hamiltonian Monte Carlo (HMC) scheme (Duane et al. 1987; Neal 2011).

check R0 reparametrisation

@fccoelho has provided a simple explanation of how to formulate the classical SIR in terms of R0, here.
It'd be quite nice to check that the maths is correct by running a simulation with the original formulation and the new one and checking that the trajectories match exactly.
@fccoelho if you've already checked, please feel free to close.
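A minimal sketch of such a check with deSolve (the R0 form below simply substitutes beta = R0 * gamma, which may or may not be exactly what the linked write-up does):

```r
library(deSolve)

sir_beta <- function(t, y, p) with(as.list(c(y, p)), {
  list(c(-beta * S * I / N, beta * S * I / N - gamma * I, gamma * I))
})
sir_R0 <- function(t, y, p) with(as.list(c(y, p)), {
  list(c(-R0 * gamma * S * I / N, R0 * gamma * S * I / N - gamma * I, gamma * I))
})

y0    <- c(S = 999, I = 1, R = 0)
times <- 0:60
out1  <- ode(y0, times, sir_beta, c(beta = 1, gamma = 0.5, N = 1000))
out2  <- ode(y0, times, sir_R0,  c(R0 = 2, gamma = 0.5, N = 1000))
max(abs(out1 - out2))  ## should be numerically zero if the maths checks out
```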

Does prior containment work?

Grinsztajn et al. (2021) argue (p. 6217) that some loose bounds for R0 are [1, 10]. I sort of agree. But what I wanna know is this: if I write

Pr(R0 \in [1, 10] | w) >= alpha,

where w are the prior hyperparameters and alpha \in (0, 1) is a prior probability level,
(i) is it easy to set w to achieve a certain alpha? How does that look for each of {gamma, log-normal, half-normal} priors?
(ii) Does this work in the sense of guaranteeing non-degenerate prior predictives and posterior inferences?
One reason to believe (ii) is shaky is that for (certain) gamma priors and the half-normal priors, the induced prior on R0 is heavy-tailed. Thus it might still assign non-trivial mass to intervals in the vicinity of 100, say.
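For the log-normal case, question (i) is easy to probe numerically; a minimal sketch (meanlog fixed at log(3) purely for illustration):

```r
## prior probability that R0 lies in [1, 10], as a function of the log-scale sd
containment <- function(sdlog, meanlog = log(3)) {
  plnorm(10, meanlog, sdlog) - plnorm(1, meanlog, sdlog)
}

sds <- seq(0.1, 2, by = 0.1)
round(setNames(sapply(sds, containment), sds), 2)
## pick the largest sd (most diffuse prior) that still achieves the target alpha
```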

Find a "MWE" with Stan

We need something to start with that

  • simulates some data under a known ground truth;
  • fits the model to the synthetic data and returns posteriors for the parameters;
  • maybe also returns predictions (e.g. a predictive quantity of interest is the time of the epidemic peak).

Here's a start

Posterior predictive checks

One of the things we should be doing when comparing priors is:

  • Prior predictive checks: what sort of trajectories I(t) [or R(t)] does a particular prior on R0, or on beta and gamma, induce;
    and
  • Posterior predictive checks: sure, a prior might change the posterior slightly or even a lot, but what is the effect on the predicted quantities of the model, such as the time to peak incidence or the trajectory R(t)?

Experiment 1: simple SIR

The goal of this experiment is to answer the following question:

  • a) How do the priors on gamma and beta affect inferences for R0?

To answer a) we could:

  • Run the two ODE model parametrised in terms of gamma and beta:

    • Informative log-normal priors;
    • "Uninformative" Gamma(1, 1) priors;
    • "Uninformative" half-normal(0, 1) priors;
    • "Uninformative" log-normal priors: mu = 0 and sd =1, 10, 100
  • Run the two ODE model parametrised in terms of gamma and R0 with

    • Informative Gamma priors on both gamma and R0;
    • Informative half-normal priors on both gamma and R0;
    • Informative log-normal priors on both gamma and R0;
    • An informative Gamma prior on R_0 and an uninformative Gamma prior on gamma.
    • An informative Gamma prior on R_0 and an uninformative half-normal prior on gamma.
    • An informative log-normal prior on R_0 and an uninformative log-normal prior on gamma.
    • An informative log-normal prior on R_0 truncated at 1 (epidemic prior) and an uninformative prior on gamma.

Quantities to track

  • mean, median and BCI;
  • Divergences;
  • ESS;
  • Run time.

Q: Why use different prior families? A: to assess tail effects.
Obs: we should try to make all of the "informative" priors moment-matched. E.g. if we pick a Gamma(2, 1) prior for R0, then the informative half-normal prior would have to be a half-normal(m0, s0), with m0 and s0 chosen so as to have the same first two moments (a sketch follows below).
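A minimal sketch of matching the half-normal (treating "half-normal(m0, s0)" as the folded normal |N(m0, s0^2)|, which is an assumption on my part):

```r
## first two moments of |X| with X ~ N(m, s^2) (folded normal)
folded_moments <- function(m, s) {
  mu <- s * sqrt(2 / pi) * exp(-m^2 / (2 * s^2)) + m * (1 - 2 * pnorm(-m / s))
  c(mean = mu, var = m^2 + s^2 - mu^2)
}

## find (m0, s0) matching a target mean and variance; s is optimised on the log scale
match_folded <- function(target_mean, target_var) {
  obj <- function(par) sum((folded_moments(par[1], exp(par[2])) -
                              c(target_mean, target_var))^2)
  fit <- optim(c(target_mean, log(sqrt(target_var))), obj)
  c(m0 = fit$par[1], s0 = exp(fit$par[2]))
}

## example: match the mean (2) and variance (2) of a Gamma(2, 1) prior on R0
match_folded(2, 2)
```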

Bits and pieces

  • Write down the moments (E[R_0] and Var(R_0)) for the (i) log-normal and (ii) half-normal-ratio distributions;
  • Write down how to do moment-matching for the log-normal (see the sketch after this list);
  • Port this script to use cmdstanr.
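For the second item, the log-normal moment-matching is closed-form; a minimal sketch (helper name is illustrative):

```r
## find (meanlog, sdlog) of a log-normal with a given mean and variance
lognormal_moment_match <- function(mean, variance) {
  sigma2 <- log(1 + variance / mean^2)
  list(meanlog = log(mean) - sigma2 / 2, sdlog = sqrt(sigma2))
}

## example: matching the mean (2) and variance (2) of a Gamma(2, 1) prior on R0
lognormal_moment_match(2, 2)
```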

check that (log) concavity "result" is useful

I wrote it mainly because I thought it was important to show where the density is concave. But then I realised that the bound I derived is always larger than the mode (and zero when the mode is zero).

The main reason to care about concavity is precisely to ensure the distribution is (strongly) unimodal, so I'm unsure the result in \ref{eq:logconcavity} is useful.

Experiment 2: single ODE parametrisation

  • b) Does using a single ODE parametrisation help?

To answer b) we could

  • Run the single ODE model with the same (informative and uninformative) priors on R0 as in #37 and then record the quantities of interest:

Quantities to track

  • Mean, median and BCI;
  • Divergences;
  • ESS;
  • Run time.

Evaluate the performance of Gamma and log-normal priors on simulated data

We could do an experiment where we fix beta and R0 (and thus gamma is known), as well as S0 and the log-normal likelihood variance sigma_y^2. We then simulate an SIR trajectory from this process and fit the model using (i) Gamma priors on beta and gamma and (ii) moment-matched log-normal priors on beta and gamma. We then look at the recovery of R0, both in terms of the squared error of the posterior mean and the coverage of the Bayesian credibility intervals (BCI).

Check the functions generate_trajectory_* I used to generate the prior predictives in the notebooks. They can be easily modified to generate data from fixed parameters.
