pymc-labs / causalpy

A Python package for causal inference in quasi-experimental settings

Home Page: https://causalpy.readthedocs.io

License: Apache License 2.0

Python 99.82% Makefile 0.18%
causal-inference pymc quasi-experimental quasi-experiments


causalpy's People

Contributors

agroutsmith, alexmalins, anevolbap, drbenvincent, jpreszler, jsakv, juanitorduz, krz, lucianopaz, maresb, nathanielf, oriolabril, parthjohri, pre-commit-ci[bot], ronanlaker


causalpy's Issues

RDD: rethink plot outputs

Current plots are like this:
[screenshot: current RDD plot output]

Instead, the plots tend to look like this:
[screenshot: typical RDD plot, example 1]
and
[screenshot: typical RDD plot, example 2]

TODO:

  • the focus in RDD is just on the discontinuity, so it makes less sense to plot the counterfactual (dashed line) and to have the filled causal impact region.
  • In which case it also doesn't make much sense to have the bottom panel
  • Instead the focus should be on the discontinuity magnitude

RDD: in-depth notebook

Create an in-depth notebook to illustrate regression discontinuity

  • Include model details and maths
  • Illustrate the API
  • Illustrate nuances of the approach
  • Examples with multiple datasets

replace PyMC linear model with Bambi model

We are not removing custom PyMC models. It makes a lot of sense to be able to write custom PyMC models, for maximum flexibility.

But for the majority of cases, a linear model will be used. Because of this, it doesn't make sense to duplicate all the work that Bambi does in terms of specifying custom priors and handling hierarchical model formulae.

So we need to figure out how to support Bambi models in addition to custom PyMC models.

  • To fit with the scikit-learn API we need to be able to pass in a blank model object, so the creation of that will have to happen behind the scenes. So maybe we have a PymcModel class as a wrapper around a Bambi model (see the sketch below).

  • A better idea would be to have ModelBuilder subclass the Bambi model class, not pm.Model.
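A minimal sketch of the wrapper idea, assuming a hypothetical BambiModelWrapper class (names and signatures here are illustrative, not part of CausalPy or a settled design):

import bambi as bmb

class BambiModelWrapper:
    """Hypothetical sketch: hide Bambi model creation behind the scenes."""

    def __init__(self, formula, **sample_kwargs):
        # Store the formula only; Bambi needs the data to build the model,
        # so construction is deferred until fit() is called.
        self.formula = formula
        self.sample_kwargs = sample_kwargs

    def fit(self, data):
        # Build the Bambi model behind the scenes, then sample.
        self.model = bmb.Model(self.formula, data)
        self.idata = self.model.fit(**self.sample_kwargs)
        return self

    def predict(self, new_data):
        # Posterior predictions for new data, without mutating self.idata.
        return self.model.predict(self.idata, data=new_data, inplace=False)

This keeps the "pass in a blank model object" pattern: the user supplies only a formula, and the Bambi model is constructed when the data arrives.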

develop the approach for Bayesian models

Need to work on the class structure to get this working smoothly.

  • get sampling working... shape problems
  • plot data + model outputs
  • calculate & plot causal impact
  • calculate & plot cumulative causal impact
  • complete InterruptedTimeSeries experiment + notebook example
  • complete DifferenceInDifferences experiment + notebook example
  • complete RegressionDiscontinuity experiment + notebook example
    • different colours for pre and post model predictions + add labels

Save demo images in svg format

Export from the notebooks in SVG format. Could likely reduce file sizes. Double-check that SVG files can be embedded in the README.

Add linear model for interrupted time series

  • code to generate simulated data
  • example notebook
  • Quick and dirty implementation

Update

Originally this was a more complex time series with a seasonal component. But we need a much simpler example. So this will now be a simple linear trend with no seasonality or complex temporal component.

Add tests

  • integration tests for sklearn examples in the docs
  • integration tests for pymc examples in the docs. If these are slow we might not want to run these tests remotely for every push? (addressed by #119)
  • bunch of unit tests. Maybe best to wait until the code base is more stable; otherwise it becomes harder to make changes.
  • tests for custom exceptions etc. (not specific enough yet)
  • test we can load in the datasets (addressed by #119)

SC: check robustness of results (frequentist)

I've experienced clearly sub-optimal weightings when running the WeightedProportion custom scikit-learn model. It is likely due to bad optimisation, perhaps getting stuck in local optima. So we need to explore the dependence of the results upon w_start.

from functools import partial

import numpy as np
from scipy.optimize import fmin_slsqp

def fit(self, X, y):
    # Start from uniform weights over the control units.
    w_start = [1 / X.shape[1]] * X.shape[1]
    # SLSQP with an equality constraint (weights sum to 1) and box
    # bounds keeping each weight in [0, 1].
    coef_ = fmin_slsqp(
        partial(self.loss, X=X, y=y),
        np.array(w_start),
        f_eqcons=lambda w: np.sum(w) - 1,
        bounds=[(0.0, 1.0)] * len(w_start),
        disp=False,
    )
    self.coef_ = np.atleast_2d(coef_)  # return as column vector
    self.mse = self.loss(W=self.coef_, X=X, y=y)
    return self

One way to make the results more reliable (more likely to represent the global minimum) is a particle-swarm-style multi-start approach: run the optimisation multiple times, each with a different w_start (sketched below).

  • Look into the relevant fitting procedures in scikit-learn.
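A minimal multi-start sketch along those lines (the restart count and the Dirichlet starting points are arbitrary choices, not a settled design):

from functools import partial

import numpy as np
from scipy.optimize import fmin_slsqp

def multi_start_fit(loss, X, y, n_restarts=20, seed=0):
    # Rerun the constrained fit from random points on the simplex and
    # keep the solution with the lowest loss.
    rng = np.random.default_rng(seed)
    best_w, best_loss = None, np.inf
    for _ in range(n_restarts):
        w_start = rng.dirichlet(np.ones(X.shape[1]))  # sums to 1
        w = fmin_slsqp(
            partial(loss, X=X, y=y),
            w_start,
            f_eqcons=lambda w: np.sum(w) - 1,
            bounds=[(0.0, 1.0)] * X.shape[1],
            disp=False,
        )
        current_loss = loss(w, X=X, y=y)
        if current_loss < best_loss:
            best_w, best_loss = w, current_loss
    return best_w, best_loss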

DiD: in-depth notebook

Create an in-depth notebook to illustrate difference in differences

  • Include model details and maths
  • Illustrate the API
  • Illustrate nuances of the approach
  • Examples with multiple datasets

ITS: in-depth notebook

Create an in-depth notebook to illustrate interrupted time series

  • Include model details and maths
  • Illustrate the API
  • Illustrate nuances of the approach
  • Examples with multiple datasets
  • Include a time series model (see #4)

Bayesian model averaging

Suggestion from @ricardoV94: in situations where there are multiple valid models, we either have to pick which model to use, or we can do Bayesian model averaging. So you can fit both models, do model comparison (which gives the model weightings), then generate model-averaged predictions.

I think this was done as posterior_predictive_w (or similar) in PyMC3, but was not ported to v4.
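A rough sketch of that workflow using ArviZ stacking weights (idata_a and idata_b are assumed to be fitted InferenceData objects; the outcome variable name "y" is illustrative):

import numpy as np
import arviz as az

# Stacking weights from LOO-based model comparison.
cmp = az.compare({"model_a": idata_a, "model_b": idata_b}, method="stacking")

def flat_ppc(idata, var="y"):
    # Flatten posterior predictive draws to shape (n_obs, n_samples).
    return idata.posterior_predictive[var].stack(sample=("chain", "draw")).values

ppc = {"model_a": flat_ppc(idata_a), "model_b": flat_ppc(idata_b)}

# For each averaged draw, pick a model with probability equal to its weight.
rng = np.random.default_rng(0)
names = list(ppc)
n_draws = min(p.shape[1] for p in ppc.values())
picks = rng.choice(names, size=n_draws, p=cmp["weight"].loc[names].values)
averaged = np.column_stack([ppc[m][:, i] for i, m in enumerate(picks)])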

clean up data generation code

  • remove from demo notebooks
  • move simulate_data.py into data folder
  • move statsmodels dependency into requirements-docs.txt

DiD: quantitative outputs of results

  • Add example plot of posterior distribution of the causal impact.
  • Allow user to get quantitative text report/output on the causal impact.
  • Add HDI info to causal effect in figure title
  • This should be based on posterior, not posterior predictive

fix error plotting from seaborn

In the example notebooks, the Seaborn plotting code raises an error:

ValueError: Could not interpret value `y` for parameter `y`

Maybe related to my Seaborn version?
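For what it's worth, this ValueError usually means the string passed to y is not a column of the dataframe, rather than a version problem; a minimal reproduction (column names illustrative):

import pandas as pd
import seaborn as sns

df = pd.DataFrame({"time": [0, 1, 2], "outcome": [1.0, 2.0, 3.0]})

sns.lineplot(data=df, x="time", y="y")        # raises the ValueError above
sns.lineplot(data=df, x="time", y="outcome")  # works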

add control units to synthetic design plot

It would improve the plot if we add the untreated units to the plot (e.g. in light grey).

This will deviate from the plot method in the TimeSeriesExperiment class. So it's probably best to override this plot method: call the superclass method, then additionally plot the untreated units (see the sketch below).
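A sketch of that override pattern (the class, attribute, and return-value conventions here are assumptions, not CausalPy's actual API):

class SyntheticControl(TimeSeriesExperiment):
    def plot(self):
        # Reuse the parent class plot, assuming it returns (fig, ax)
        # with ax an array of panels.
        fig, ax = super().plot()
        # Layer the untreated (control) units in light grey behind the rest.
        ax[0].plot(
            self.data.index,
            self.data[self.control_units],
            color="lightgrey",
            alpha=0.5,
            zorder=0,
        )
        return fig, ax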

Add quantitative evaluation of the model fit

Suggestion by @ricardoV94. At the moment, users would test how well the model fits pre-treatment data visually. But we should add quantitative metrics.

This could happen in the fit method. So override the ModelBuilder.fit method:

  1. Call super().fit()
  2. Call a new quantitative fit-evaluation function (see the sketch below).
  • complete on the scikit-learn models
  • complete on the Bayesian models
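A minimal sketch of those two steps, assuming the repo's ModelBuilder class and a scikit-learn-style predict method (the subclass name is hypothetical):

from sklearn.metrics import r2_score

class ScoredModel(ModelBuilder):  # hypothetical subclass name
    def fit(self, X, y, **kwargs):
        super().fit(X, y, **kwargs)
        # Quantitative fit evaluation on the (pre-treatment) training data.
        self.score_ = r2_score(y, self.predict(X))
        return self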

improve synthetic control example

At the moment we use sklearn.linear_model.LinearRegression, but that is bad because: a) we can overfit, b) regression coefficients could be negative.

What we really want is to constrain coefficients to be positive and to have some kind of penalty on the weights.

We could try

  • sklearn.linear_model.Ridge with positive=True
  • sklearn.linear_model.Lasso with positive=True
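For example (the alpha values are placeholders, and Ridge only accepts positive=True in recent scikit-learn versions):

from sklearn.linear_model import Lasso, Ridge

# X_pre: control-unit outcomes in the pre-period; y_pre: treated unit.
ridge = Ridge(alpha=1.0, positive=True).fit(X_pre, y_pre)
lasso = Lasso(alpha=0.01, positive=True).fit(X_pre, y_pre)

# Both yield non-negative weights over the control units, with a penalty
# to limit overfitting; note the weights are not constrained to sum to 1.
print(ridge.coef_, lasso.coef_)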

SC: in-depth notebook

Create an in-depth notebook to illustrate synthetic control

  • Include model details and maths
  • Illustrate the API
  • Illustrate nuances of the approach
  • Examples with multiple datasets

Error when `treated` variable dtype is integer

Known problem for regression discontinuity, possibly for other experiments...

When the treatment column dtype is integer (0/1) we get an error; it currently only works when the dtype is boolean.
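Until this is fixed, a user-side workaround is to cast before fitting (column name illustrative):

# Cast the 0/1 integer treatment column to boolean.
df["treated"] = df["treated"].astype(bool)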

CausalPy logo

Size/aspect ratio specs for digital:

  • GitHub social preview: 1280×640px
  • Twitter:
  • LinkedIn:

TODO

  • Liaise with Thomas to get one designed.
    • send as pdf
    • some stuff happens here
  • Add to GitHub social preview
  • Add to README
  • Add to readthedocs

Add Bayesian R2 metric

At the moment we have the $R^2$ point estimate from the posterior median, but it would be better to compute the $R^2$ distribution over the whole posterior and to report the HDIs.
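ArviZ can already compute this; a sketch, assuming idata holds posterior predictive draws of "y" and y_true is the observed outcome:

import arviz as az

# R^2 over all posterior predictive draws, not just the posterior median.
y_pred = (
    idata.posterior_predictive["y"]
    .stack(sample=("chain", "draw"))
    .values.T  # shape (n_samples, n_obs), as az.r2_score expects
)
print(az.r2_score(y_true, y_pred))  # reports r2 and r2_std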

update code to use `ModelBuilder`

ModelBuilder is currently in pymc-experimental but it will be merged into PyMC soon.

Change the code around to use ModelBuilder. This repo will then supply a couple of pre-built models, but it also means users can use the ModelBuilder class to make their own models.

flesh out the README

  • Add brief descriptions of synthetic control and interrupted time series, their similarities and differences, when you would use one or the other.
  • Add section on learning resources
  • Add section on related packages

SC + ITS: quantitative outputs

Need to provide quantitative outputs/reports for synthetic control and interrupted time series.

The Causal Impact package provides these summary stats:
[screenshot: CausalImpact summary statistics table]

For the frequentist version: add the ability to test for presence/absence of a causal impact. There is a traditional way of doing this, but we could also envisage a bootstrap on the pre-intervention data.

TODO

  • add summary stats for Bayesian ITS + SC (first pass)
  • add relative impact as well as absolute (see the sketch below)
  • Implement a better API
  • add summary stats for Frequentist ITS + SC
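For the relative-impact item, a sketch of the arithmetic over posterior draws (array names and shapes are illustrative):

import numpy as np

# observed: post-period outcome, shape (time,)
# counterfactual: posterior draws of the untreated prediction, (draws, time)
absolute_impact = observed[np.newaxis, :] - counterfactual
relative_impact = absolute_impact / counterfactual
cumulative_impact = absolute_impact.cumsum(axis=1)

# e.g. a 94% interval on the total (cumulative) impact
print(np.percentile(absolute_impact.sum(axis=1), [3, 50, 97]))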

regression discontinuity: add quantitative output of the discontinuity estimate

  • Evaluate the model prediction either side of the threshold and report the discontinuity value.
  • Remove evaluation of the causal impact and cumulative causal impact. This makes less sense in the RD setting.
  • (Bayesian model) It makes more sense to plot the model expectation, not the posterior predictive. Similarly, we should calculate the discontinuity at the threshold from `mu`, not `yhat`
  • Add method which prints text summary output about the discontinuity at the threshold
  • Improve/finish summary method for the Bayesian model. Might be best to create some get methods to avoid repeating this task multiple times.
  • Bayesian model: bring back a subplot, but this time plot the posterior distribution of the discontinuity at threshold. Or maybe best to create a separate plot function for this.

ATE, CATE, ATT, ATC

This issue will likely be touched by a number of other issues as we flesh out the quantitative outputs and work through more examples. But it is important to go beyond the slightly vague 'causal impact' terminology to be more specific about:

  • Average Treatment Effect (ATE)
  • Conditional Average Treatment Effect (CATE)
  • Average Treatment Effect on the Treated (ATT)
  • Average Treatment Effect on the Control (ATC)

add an example to demonstrate lack of causal impact

At the moment, all the examples show very clear causal impacts. But it would be nice to add an example without any causal impact, particularly if it demonstrates how one can be fooled into thinking there is an effect when there is not.
(Suggestion by @ricardoV94)

regression discontinuity: allow the treatment to be `>=` or `<=` the threshold

At the moment, the assumption is that units above the threshold are treated. But this will not always be true, so we need to allow for the opposite case.

Option 1: a string kwarg, e.g. threshold_function='<=' or threshold_function='>='
Option 2: allow users to pass a function via the kwarg, e.g. threshold_function=np.greater_equal or threshold_function=np.less_equal (see the sketch below)
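A sketch of Option 2 (the kwarg plumbing is hypothetical):

import numpy as np

def _is_treated(running_variable, threshold, threshold_function=np.greater_equal):
    # Defaults to the current behaviour (treated when >= threshold);
    # passing np.less_equal flips which side of the cut-off is treated.
    return threshold_function(running_variable, threshold)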

Do this on the synthetic regression discontinuity datasets, for both PyMC and skl. Append it as another analysis example.

Things to think about:

  • Helper function _is_treated uses np.greater_equal
  • We have a treated column in the dataset. This presents some redundancy because all we need is the running variable and the _is_treated helper function. That function is there because we need a way of working out which data are treated when we interpolate for xpred. One solution would be to remove treated as a column of data and instead derive this from the running variable and _is_treated. However, the treated still needs to appear in the model formula. So would have to add some explanatory text in notebooks.
  • The order of comparison to calculate discontinuity_at_threshold
  • Would be a good opportunity to add some input validation for RD (see #78)
  • Update the integration tests

[Optional] Do we want to add in a shaded region above/below the treatment threshold?

Refine synthetic control example

#14 improved the synthetic control example by moving from a linear regression model to a Ridge model (with positive weights constraint). But ideally we can use either Lasso or an actual model with positive weights that sum to a desired value (normally 1, but higher values allow for some level of extrapolation).

See the example in the skl_demos.ipynb notebook
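A sketch of the sum-constrained variant, generalising the existing fmin_slsqp fit so the positive weights sum to a chosen total (1 gives classic synthetic control; larger totals permit some extrapolation):

from functools import partial

import numpy as np
from scipy.optimize import fmin_slsqp

def fit_weights(loss, X, y, total=1.0):
    w_start = np.full(X.shape[1], total / X.shape[1])
    return fmin_slsqp(
        partial(loss, X=X, y=y),
        w_start,
        f_eqcons=lambda w: np.sum(w) - total,  # weights sum to `total`
        bounds=[(0.0, total)] * X.shape[1],
        disp=False,
    )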

Add examples for 'classic' causal inference datasets

Suggestion by @juanitorduz... Rather than just applying the package to synthetic datasets, it would be good to apply the methods to classic datasets / causal inference problems. This also gives people some faith that the package produces sensible results (or at least results similar to other people's implementations).

Sources

RDD: drinking example

See https://matheusfacure.github.io/python-causality-handbook/16-Regression-Discontinuity-Design.html#

  • Frequentist model
  • Bayesian model
  • Add reference/details of original study

SC: Proposition 99 example

  • grab data
  • Frequentist model
  • Bayesian model

ITS: Add simple example to match the CausalImpact docs

  • Generate similar data
  • Add the example to its_pymc.ipynb
  • Add the example to its_skl.ipynb

DiD: Add the 'bank failure' dataset + analyses

  • add data
  • add Bayesian analysis
    • add case when we have multiple measurements over time (see #76)
  • add Frequentist analysis

This will almost certainly require code changes. At the moment there is a hard-wired constraint that there is just a single pre- and post-treatment observation.

Add AR model for interrupted time series

#2 added a very simple interrupted time series example with no predictors.

But it would be good to add another example where there is more temporal structure. This would then be well suited to an actual time series model, here an AR model.

  • data generating function, generate_time_series_data (rename this)
  • create a new AutoRegressive subclass of CausalBase

TODO

  • Improve interrupted time series dataset by adding temporal structure
  • Add another dataset with seasonality
  • Implement with a scikit-learn or sktime model. But pmdarima actually looks very promising: it wraps statsmodels but provides the fit/predict API (see the sketch below).
  • Implement with pymc model
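A sketch of the pmdarima route flagged above (the seasonal settings and series names are placeholders):

import pmdarima as pm

# Fit an AR-type model to the pre-intervention series, then forecast
# the counterfactual over the post-intervention window.
model = pm.auto_arima(y_pre, seasonal=False, suppress_warnings=True)
counterfactual = model.predict(n_periods=len(y_post))
causal_impact = y_post - counterfactual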
