pymc-labs / causalpy

A Python package for causal inference in quasi-experimental settings

Home Page: https://causalpy.readthedocs.io

License: Apache License 2.0

Python 99.82% Makefile 0.18%
causal-inference pymc quasi-experimental quasi-experiments


causalpy's People

Contributors

agroutsmith, alexmalins, anevolbap, drbenvincent, jpreszler, jsakv, juanitorduz, krz, lucianopaz, maresb, nathanielf, oriolabril, parthjohri, pre-commit-ci[bot], ronanlaker


causalpy's Issues

RDD: rethink plot outputs

Current plots are like this:
[screenshot: current RDD plot output]

Instead, the plots tend to look like this:
[screenshot: typical RDD plot, example 1]
and
[screenshot: typical RDD plot, example 2]

TODO:

  • the focus in RDD is just on the discontinuity, so it makes less sense to plot the counterfactual (dashed line) and to have the filled causal impact region.
  • In which case it also doesn't make much sense to have the bottom panel
  • Instead the focus should be on the discontinuity magnitude

RDD: in-depth notebook

Create an in-depth notebook to illustrate regression discontinuity

  • Include model details and maths
  • Illustrate the API
  • Illustrate nuances of the approach
  • Examples with multiple datasets

replace PyMC linear model with Bambi model

We are not removing custom PyMC models. It makes a lot of sense to be able to write custom PyMC models, for maximum flexibility.

But for the majority of cases, a linear model will be used. Because of this, it doesn't make sense to duplicate all the work that Bambi does in terms of specifying custom priors and handling hierarchical model formulae.

So we need to figure out how to support Bambi models in addition to custom PyMC models.

  • To fit with the scikit-learn API we need to be able to pass in a blank model object, so the creation of that will have to happen behind the scenes. So maybe we have a PymcModel class as a wrapper around a Bambi model (see the sketch below).

  • A better idea would be to have ModelBuilder subclass the Bambi model class, not pm.Model.
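A minimal sketch of the wrapper idea, assuming a hypothetical BambiModelWrapper class (names and signatures here are illustrative, not part of CausalPy or a settled design):

import bambi as bmb

class BambiModelWrapper:
    """Hypothetical sketch: hide Bambi model creation behind the scenes."""

    def __init__(self, formula, **sample_kwargs):
        # Store the formula only; Bambi needs the data to build the model,
        # so construction is deferred until fit() is called.
        self.formula = formula
        self.sample_kwargs = sample_kwargs

    def fit(self, data):
        # Build the Bambi model behind the scenes, then sample.
        self.model = bmb.Model(self.formula, data)
        self.idata = self.model.fit(**self.sample_kwargs)
        return self

    def predict(self, new_data):
        # Posterior predictions for new data, without mutating self.idata.
        return self.model.predict(self.idata, data=new_data, inplace=False)

This keeps the "pass in a blank model object" pattern: the user supplies only a formula, and the Bambi model is constructed when the data arrives.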

develop the approach for Bayesian models

Need to work on the class structure to get this working smoothly.

  • get sampling working... shape problems
  • plot data + model outputs
  • calculate & plot causal impact
  • calculate & plot cumulative causal impact
  • complete InterruptedTimeSeries experiment + notebook example
  • complete DifferenceInDifferences experiment + notebook example
  • complete RegressionDiscontinuity experiment + notebook example
    • different colours for pre and post model predictions + add labels

Save demo images in svg format

Export from the notebooks in SVG format. Could likely reduce file sizes. Double-check that SVG files can be embedded in the README.

Add linear model for interrupted time series

  • code to generate simulated data
  • example notebook
  • Quick and dirty implementation

Update

Originally this was a more complex time series with a seasonal component. But we need a much simpler example. So this will now be a simple linear trend with no seasonality or complex temporal component.

Add tests

  • integration tests for sklearn examples in the docs
  • integration tests for pymc examples in the docs. If these are slow we might not want to run these tests remotely for every push? (addressed by #119)
  • bunch of unit tests. Maybe best to wait until the code base is more stable; otherwise it becomes harder to make changes.
  • tests for custom exceptions etc. (not specific enough yet)
  • test we can load in the datasets (addressed by #119)

SC: check robustness of results (frequentist)

I've experienced clearly sub-optimal weightings when running the WeightedProportion custom scikit-learn model. It is likely due to bad optimisation, perhaps getting stuck in local optima. So we need to explore the dependence of the results upon w_start.

from functools import partial

import numpy as np
from scipy.optimize import fmin_slsqp

def fit(self, X, y):
    # Start from uniform weights over the control units.
    w_start = [1 / X.shape[1]] * X.shape[1]
    # SLSQP with an equality constraint (weights sum to 1) and box
    # bounds keeping each weight in [0, 1].
    coef_ = fmin_slsqp(
        partial(self.loss, X=X, y=y),
        np.array(w_start),
        f_eqcons=lambda w: np.sum(w) - 1,
        bounds=[(0.0, 1.0)] * len(w_start),
        disp=False,
    )
    self.coef_ = np.atleast_2d(coef_)  # return as column vector
    self.mse = self.loss(W=self.coef_, X=X, y=y)
    return self

One way to make the results more reliable (more likely to represent the global minimum) is a particle-swarm-style multi-start approach: run the optimisation multiple times, each with a different w_start (sketched below).

  • Look into the relevant fitting procedures in scikit-learn.
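A minimal multi-start sketch along those lines (the restart count and the Dirichlet starting points are arbitrary choices, not a settled design):

from functools import partial

import numpy as np
from scipy.optimize import fmin_slsqp

def multi_start_fit(loss, X, y, n_restarts=20, seed=0):
    # Rerun the constrained fit from random points on the simplex and
    # keep the solution with the lowest loss.
    rng = np.random.default_rng(seed)
    best_w, best_loss = None, np.inf
    for _ in range(n_restarts):
        w_start = rng.dirichlet(np.ones(X.shape[1]))  # sums to 1
        w = fmin_slsqp(
            partial(loss, X=X, y=y),
            w_start,
            f_eqcons=lambda w: np.sum(w) - 1,
            bounds=[(0.0, 1.0)] * X.shape[1],
            disp=False,
        )
        current_loss = loss(w, X=X, y=y)
        if current_loss < best_loss:
            best_w, best_loss = w, current_loss
    return best_w, best_loss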

DiD: in-depth notebook

Create an in-depth notebook to illustrate difference in differences

  • Include model details and maths
  • Illustrate the API
  • Illustrate nuances of the approach
  • Examples with multiple datasets

ITS: in-depth notebook

Create an in-depth notebook to illustrate interrupted time series

  • Include model details and maths
  • Illustrate the API
  • Illustrate nuances of the approach
  • Examples with multiple datasets
  • Include a time series model (see #4)

Bayesian model averaging

Suggestion from @ricardoV94: in situations where there are multiple valid models, we either have to pick which model to use, or we can do Bayesian model averaging. So you can fit both models, do model comparison (which gives the model weightings), then generate model-averaged predictions.

I think this was done as posterior_predictive_w (or similar) in PyMC3, but was not ported to v4.
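A rough sketch of that workflow using ArviZ stacking weights (idata_a and idata_b are assumed to be fitted InferenceData objects; the outcome variable name "y" is illustrative):

import numpy as np
import arviz as az

# Stacking weights from LOO-based model comparison.
cmp = az.compare({"model_a": idata_a, "model_b": idata_b}, method="stacking")

def flat_ppc(idata, var="y"):
    # Flatten posterior predictive draws to shape (n_obs, n_samples).
    return idata.posterior_predictive[var].stack(sample=("chain", "draw")).values

ppc = {"model_a": flat_ppc(idata_a), "model_b": flat_ppc(idata_b)}

# For each averaged draw, pick a model with probability equal to its weight.
rng = np.random.default_rng(0)
names = list(ppc)
n_draws = min(p.shape[1] for p in ppc.values())
picks = rng.choice(names, size=n_draws, p=cmp["weight"].loc[names].values)
averaged = np.column_stack([ppc[m][:, i] for i, m in enumerate(picks)])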

clean up data generation code

  • remove from demo notebooks
  • move simulate_data.py into data folder
  • move statsmodels dependency into requirements-docs.txt

DiD: quantitative outputs of results

  • Add example plot of posterior distribution of the causal impact.
  • Allow user to get quantitative text report/output on the causal impact.
  • Add HDI info to causal effect in figure title
  • This should be based on posterior, not posterior predictive

fix error plotting from seaborn

In the example notebooks, the Seaborn plotting code raises an error:

ValueError: Could not interpret value `y` for parameter `y`

Maybe related to my Seaborn version?
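For what it's worth, this ValueError usually means the string passed to y is not a column of the dataframe, rather than a version problem; a minimal reproduction (column names illustrative):

import pandas as pd
import seaborn as sns

df = pd.DataFrame({"time": [0, 1, 2], "outcome": [1.0, 2.0, 3.0]})

sns.lineplot(data=df, x="time", y="y")        # raises the ValueError above
sns.lineplot(data=df, x="time", y="outcome")  # works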

add control units to synthetic design plot

It would improve the plot if we add the untreated units to the plot (e.g. in light grey).

This will deviate from the plot method in the TimeSeriesExperiment class. So it's probably best to override this plot method: call the superclass method, then additionally plot the untreated units (see the sketch below).
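A sketch of that override pattern (the class, attribute, and return-value conventions here are assumptions, not CausalPy's actual API):

class SyntheticControl(TimeSeriesExperiment):
    def plot(self):
        # Reuse the parent class plot, assuming it returns (fig, ax)
        # with ax an array of panels.
        fig, ax = super().plot()
        # Layer the untreated (control) units in light grey behind the rest.
        ax[0].plot(
            self.data.index,
            self.data[self.control_units],
            color="lightgrey",
            alpha=0.5,
            zorder=0,
        )
        return fig, ax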

Add quantitative evaluation of the model fit

Suggestion by @ricardoV94. At the moment, users would test how well the model fits pre-treatment data visually. But we should add quantitative metrics.

This could happen in the fit method. So override the ModelBuilder.fit method:

  1. Call super().fit()
  2. Call a new quantitative fit-evaluation function (see the sketch below).
  • complete on the scikit-learn models
  • complete on the Bayesian models
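A minimal sketch of those two steps, assuming the repo's ModelBuilder class and a scikit-learn-style predict method (the subclass name is hypothetical):

from sklearn.metrics import r2_score

class ScoredModel(ModelBuilder):  # hypothetical subclass name
    def fit(self, X, y, **kwargs):
        super().fit(X, y, **kwargs)
        # Quantitative fit evaluation on the (pre-treatment) training data.
        self.score_ = r2_score(y, self.predict(X))
        return self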

improve synthetic control example

At the moment we use sklearn.linear_model.LinearRegression, but that is bad because: a) we can overfit, b) regression coefficients could be negative.

What we really want is to constrain coefficients to be positive and to have some kind of penalty on the weights.

We could try

  • sklearn.linear_model.Ridge with positive=True
  • sklearn.linear_model.Lasso with positive=True
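For example (the alpha values are placeholders, and Ridge only accepts positive=True in recent scikit-learn versions):

from sklearn.linear_model import Lasso, Ridge

# X_pre: control-unit outcomes in the pre-period; y_pre: treated unit.
ridge = Ridge(alpha=1.0, positive=True).fit(X_pre, y_pre)
lasso = Lasso(alpha=0.01, positive=True).fit(X_pre, y_pre)

# Both yield non-negative weights over the control units, with a penalty
# to limit overfitting; note the weights are not constrained to sum to 1.
print(ridge.coef_, lasso.coef_)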

SC: in-depth notebook

Create an in-depth notebook to illustrate synthetic control

  • Include model details and maths
  • Illustrate the API
  • Illustrate nuances of the approach
  • Examples with multiple datasets

Error when `treated` variable dtype is integer

Known problem for regression discontinuity, possibly for other experiments...

When the treatment column dtype is integer (0/1) we get an error; it currently only works when the dtype is boolean.
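Until this is fixed, a user-side workaround is to cast before fitting (column name illustrative):

# Cast the 0/1 integer treatment column to boolean.
df["treated"] = df["treated"].astype(bool)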

CausalPy logo

Size/aspect ratio specs for digital:

  • GitHub social preview: 1280×640px
  • Twitter:
  • LinkedIn:

TODO

  • Liaise with Thomas to get one designed.
    • send as pdf
    • some stuff happens here
  • Add to GitHub social preview
  • Add to README
  • Add to readthedocs

Add Bayesian R2 metric

At the moment we have the $R^2$ point estimate from the posterior median, but it would be better to compute the $R^2$ distribution over the whole posterior and to report the HDIs.
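ArviZ can already compute this; a sketch, assuming idata holds posterior predictive draws of "y" and y_true is the observed outcome:

import arviz as az

# R^2 over all posterior predictive draws, not just the posterior median.
y_pred = (
    idata.posterior_predictive["y"]
    .stack(sample=("chain", "draw"))
    .values.T  # shape (n_samples, n_obs), as az.r2_score expects
)
print(az.r2_score(y_true, y_pred))  # reports r2 and r2_std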

update code to use `ModelBuilder`

ModelBuilder is currently in pymc-experimental but it will be merged into PyMC soon.

Change the code around to use ModelBuilder. This repo will then supply a couple of pre-built models, but it also means users can use the ModelBuilder class to make their own models.

flesh out the README

  • Add brief descriptions of synthetic control and interrupted time series, their similarities and differences, when you would use one or the other.
  • Add section on learning resources
  • Add section on related packages

SC + ITS: quantitative outputs

Need to provide quantitative outputs/reports for synthetic control and interrupted time series.

The Causal Impact package provides these summary stats:
[screenshot: CausalImpact summary statistics table]

For the frequentist version: add the ability to test for presence/absence of a causal impact. There is a traditional way of doing this, but we could also envisage a bootstrap on the pre-intervention data.

TODO

  • add summary stats for Bayesian ITS + SC (first pass)
  • add relative impact as well as absolute (see the sketch below)
  • Implement a better API
  • add summary stats for Frequentist ITS + SC
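For the relative-impact item, a sketch of the arithmetic over posterior draws (array names and shapes are illustrative):

import numpy as np

# observed: post-period outcome, shape (time,)
# counterfactual: posterior draws of the untreated prediction, (draws, time)
absolute_impact = observed[np.newaxis, :] - counterfactual
relative_impact = absolute_impact / counterfactual
cumulative_impact = absolute_impact.cumsum(axis=1)

# e.g. a 94% interval on the total (cumulative) impact
print(np.percentile(absolute_impact.sum(axis=1), [3, 50, 97]))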

regression discontinuity: add quantitative output of the discontinuity estimate

  • Evaluate the model prediction either side of the threshold and report the discontinuity value.
  • Remove evaluation of the causal impact and cumulative causal impact. This makes less sense in the RD setting.
  • (Bayesian model) It makes more sense to plot the model expectation, not the posterior predictive. Similarly, we should calculate the discontinuity at the threshold from `mu`, not `yhat`
  • Add method which prints text summary output about the discontinuity at the threshold
  • Improve/finish summary method for the Bayesian model. Might be best to create some get methods to avoid repeating this task multiple times.
  • Bayesian model: bring back a subplot, but this time plot the posterior distribution of the discontinuity at threshold. Or maybe best to create a separate plot function for this.

ATE, CATE, ATT, ATC

This issue will likely be touched by a number of other issues as we flesh out the quantitative outputs and work through more examples. But it is important to go beyond the slightly vague 'causal impact' terminology to be more specific about:

  • Average Treatment Effect (ATE)
  • Conditional Average Treatment Effect (CATE)
  • Average Treatment Effect on the Treated (ATT)
  • Average Treatment Effect on the Control (ATC)

add an example to demonstrate lack of causal impact

At the moment, all the examples show very clear causal impacts. But it would be nice to add an example without any causal impact, particularly if it demonstrates how one can be fooled into thinking there is an effect when there is not.
(Suggestion by @ricardoV94)

regression discontinuity: allow the treatment to be `>=` or `<=` the threshold

At the moment, the assumption is that units above the threshold are treated. But this will not always be true, so we need to allow for the opposite case.

Option 1: a string kwarg, e.g. threshold_function='<=' or threshold_function='>='
Option 2: allow users to pass a function via the kwarg, e.g. threshold_function=np.greater_equal or threshold_function=np.less_equal (see the sketch below)
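A sketch of Option 2 (the kwarg plumbing is hypothetical):

import numpy as np

def _is_treated(running_variable, threshold, threshold_function=np.greater_equal):
    # Defaults to the current behaviour (treated when >= threshold);
    # passing np.less_equal flips which side of the cut-off is treated.
    return threshold_function(running_variable, threshold)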

Do this on the synthetic regression discontinuity datasets, for both PyMC and skl. Append it as another analysis example.

Things to think about:

  • Helper function _is_treated uses np.greater_equal
  • We have a treated column in the dataset. This presents some redundancy because all we need is the running variable and the _is_treated helper function. That function is there because we need a way of working out which data are treated when we interpolate for xpred. One solution would be to remove treated as a column of data and instead derive this from the running variable and _is_treated. However, the treated still needs to appear in the model formula. So would have to add some explanatory text in notebooks.
  • The order of comparison to calculate discontinuity_at_threshold
  • Would be a good opportunity to add some input validation for RD (see #78)
  • Update the integration tests

[Optional] Do we want to add in a shaded region above/below the treatment threshold?

Refine synthetic control example

#14 improved the synthetic control example by moving from a linear regression model to a Ridge model (with positive weights constraint). But ideally we can use either Lasso or an actual model with positive weights that sum to a desired value (normally 1, but higher values allow for some level of extrapolation).

See the example in the skl_demos.ipynb notebook
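A sketch of the sum-constrained variant, generalising the existing fmin_slsqp fit so the positive weights sum to a chosen total (1 gives classic synthetic control; larger totals permit some extrapolation):

from functools import partial

import numpy as np
from scipy.optimize import fmin_slsqp

def fit_weights(loss, X, y, total=1.0):
    w_start = np.full(X.shape[1], total / X.shape[1])
    return fmin_slsqp(
        partial(loss, X=X, y=y),
        w_start,
        f_eqcons=lambda w: np.sum(w) - total,  # weights sum to `total`
        bounds=[(0.0, total)] * X.shape[1],
        disp=False,
    )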

Add examples for 'classic' causal inference datasets

Suggestion by @juanitorduz... Rather than just applying the package to synthetic datasets, it would be good to apply the methods to classic datasets / causal inference problems. This also gives people some faith that the package produces sensible results (or at least results similar to other people's implementations).

Sources

RDD: drinking example

See https://matheusfacure.github.io/python-causality-handbook/16-Regression-Discontinuity-Design.html#

  • Frequentist model
  • Bayesian model
  • Add reference/details of original study

SC: Proposition 99 example

  • grab data
  • Frequentist model
  • Bayesian model

ITS: Add simple example to match the CausalImpact docs

  • Generate similar data
  • Add the example to its_pymc.ipynb
  • Add the example to its_skl.ipynb

DiD: Add the 'bank failure' dataset + analyses

  • add data
  • add Bayesian analysis
    • add case when we have multiple measurements over time (see #76)
  • add Frequentist analysis

This will almost certainly require code changes. At the moment there is a hard-wired constraint that there is just a single pre- and post-treatment observation.

Add AR model for interrupted time series

#2 added a very simple interrupted time series example with no predictors.

But it would be good to add another example where there is more temporal structure. This would then be well suited to an actual time series model, here an AR model.

  • data generating function, generate_time_series_data (rename this)
  • create a new AutoRegressive subclass of CausalBase

TODO

  • Improve interrupted time series dataset by adding temporal structure
  • Add another dataset with seasonality
  • Implement with a scikit-learn or sktime model. But pmdarima actually looks very promising: it wraps statsmodels but provides the fit/predict API (see the sketch below).
  • Implement with pymc model
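A sketch of the pmdarima route flagged above (the seasonal settings and series names are placeholders):

import pmdarima as pm

# Fit an AR-type model to the pre-intervention series, then forecast
# the counterfactual over the post-intervention window.
model = pm.auto_arima(y_pre, seasonal=False, suppress_warnings=True)
counterfactual = model.predict(n_periods=len(y_post))
causal_impact = y_post - counterfactual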
