CausalKit

A Package for Causal Inference inspired by econometrics and modern ML API design

CausalKit is a Python package designed for students and researchers alike. It offers a simple approach to economic and statistical analysis, emphasizing ease of use, interpretability and causation.

Features

Simple API with R-like syntax for linear regression
Model display functions for easy comparison of models and interpretation of results
Intuitive interfaces for common causal inference methods such as IV, panel methods and logit models
Simple library architecture that makes it easy to add custom models, which can reuse existing functionality such as the model display functions and R-like syntax

Installation

pip install causalkit

Get Started

import pandas as pd
from src.models.linreg import LinReg

data = pd.read_csv("data.csv")
model = LinReg(df=data,
               outcome="outcome_col",
               independent=["independent_col1",
                            "independent_col2"],
               standard_error_type='hc0')
model.summary(content_type='static')
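A note on `standard_error_type='hc0'`: HC0 conventionally refers to White's heteroskedasticity-robust standard errors. The sketch below is not CausalKit's implementation. It is a minimal pure-Python illustration for a single regressor with an intercept, and the function name `ols_hc0` and the toy data are assumptions made for this example only.

```python
import math


def ols_hc0(x, y):
    """Fit y = a + b*x by OLS; return (a, b, HC0 standard error of b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]          # regressor centred on its mean
    sxx = sum(d * d for d in dx)
    b = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # HC0 "sandwich" variance of the slope: sum(dx_i^2 * e_i^2) / sxx^2.
    # No degrees-of-freedom correction is applied (that is what HC1+ add).
    var_b = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2
    return a, b, math.sqrt(var_b)


# Toy data, invented for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope, se_slope = ols_hc0(x, y)
print(f"intercept={intercept:.3f} slope={slope:.3f} HC0 se={se_slope:.4f}")
```

Unlike the classical formula, the HC0 variance weights each observation by its own squared residual, so it stays valid when the error variance differs across observations.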

Regression

Ergonomic API commands for linear regression, with support for fixed effects, IV, and more.

To use all columns in a dataset as independent variables, pass ["."]:

import pandas as pd
from src.models.linreg import LinReg

data = pd.read_csv("data.csv")
model = LinReg(df=data,
               outcome="outcome",
               independent=["."])
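The README does not spell out how `"."` is resolved. Assuming it behaves like R's formula shorthand (every column except the outcome), the expansion could look roughly like the sketch below. The helper `expand_independent` is hypothetical and is not part of CausalKit's API.

```python
def expand_independent(columns, outcome, independent):
    """Replace a "." entry with every column except the outcome.

    Hypothetical helper: explicitly named columns keep their position
    at the front, and the remaining columns follow in dataset order.
    """
    if "." not in independent:
        return list(independent)
    named = [c for c in independent if c != "."]
    rest = [c for c in columns if c != outcome and c not in named]
    return named + rest


# Column names invented for illustration
columns = ["outcome", "age", "income", "education"]
regressors = expand_independent(columns, "outcome", ["."])
print(regressors)  # ['age', 'income', 'education']
```

With a pandas DataFrame, `columns` would simply be `list(data.columns)`.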

Please see 2__regression_commands_walkthrough.ipynb in /notebooks for more details and functionality.

Stargazer-based regression outputs for easy comparison of models:

import pandas as pd
from src.models.linreg import LinReg
from src.displays.display_linear import display_models

data = pd.read_csv("data.csv")
model = LinReg(df=data,
               outcome="outcome_col",
               independent=["independent_col1",
                            "independent_col2"],
               standard_error_type='hc0')

model_2 = LinReg(df=data,
                 outcome="outcome_col",
                 independent=["independent_col1",
                              "independent_col2",
                              "independent_col3"],
                 standard_error_type='hc0')

display_models([model, model_2])
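To give a feel for what a side-by-side comparison conveys, here is a minimal plain-text sketch of the layout idea: one column per model, one row per term, and blanks where a model omits a term. It does not reproduce `display_models` or its output format; the function `side_by_side` and the coefficient values are invented for illustration.

```python
def side_by_side(models):
    """Render {model_name: {term: coefficient}} as an aligned text table."""
    names = list(models)
    # Collect terms in first-seen order across all models
    terms = []
    for coefs in models.values():
        for term in coefs:
            if term not in terms:
                terms.append(term)
    width = max(len(s) for s in terms + names) + 2
    lines = ["".ljust(width) + "".join(n.rjust(width) for n in names)]
    for term in terms:
        cells = []
        for n in names:
            value = models[n].get(term)
            # Leave the cell blank when the model does not include the term
            cells.append(("" if value is None else f"{value:.3f}").rjust(width))
        lines.append(term.ljust(width) + "".join(cells))
    return "\n".join(lines)


# Toy coefficients, not real results
toy = {
    "(1)": {"intercept": 0.5, "x1": 1.2},
    "(2)": {"intercept": 0.4, "x1": 1.1, "x2": -0.3},
}
table = side_by_side(toy)
print(table)
```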

Interactive regression outputs to remind you what a stats term means. Just call the .summary() method on your model object and hover over the terms to see their definitions.

Upcoming Features:

Automatic dropping of NA values upon model instantiation
IV regression with Angrist data as an example
t-test for difference in means
Random effects and mixed effects models
Common model diagnostics and assumption checks (linearity, normality of residuals, homoscedasticity, and absence of multicollinearity), which could include plots (such as Q-Q plots and residual vs. fitted value plots) and statistical tests
Regularization (Lasso, Ridge, Elastic Net)
GLMs (Poisson, negative binomial, multinomial)
Bootstrap and resampling methods (bootstrap standard errors and confidence intervals, permutation tests)
More interfaces for causal inference methods (matching, regression discontinuity, synthetic control, etc.)
Interactive interaction explorer (interactive visualization with sliders for continuous variables and dropdowns for categorical variables)
Refactor the Fixed Effects class to handle high-dimensional fixed effects efficiently, add a method to support the within estimator, and integrate the tabmat library for efficient matrix operations

Repository: mats-dodd/causalkit
