wlattner / hete
Heterogeneous Treatment Effects
License: MIT License
Explore ways in which users can dive into the inner workings of the model. Options here would be tools like lime, pdp, and ice.
This is another transformed outcome method:
hete_flip_trick <- function(fold_id, x, y, tmt, folds) {
  x_train <- x[folds != fold_id, ]
  x_test <- x[folds == fold_id, ]
  y_train <- y[folds != fold_id]
  y_test <- y[folds == fold_id]
  tmt_train <- tmt[folds != fold_id]
  tmt_test <- tmt[folds == fold_id]

  # the "flip trick", see "Uplift modeling for clinical trial data"
  y_star_train <- ifelse(y_train == tmt_train, 1, 0)

  m <- gbm.fit(x_train, y_star_train, distribution = "bernoulli", n.trees = 1000)
  test_preds <- predict(m, as.matrix(x_test), type = "response",
                        n.trees = m$n.trees)

  # un-flip
  test_preds <- (2*test_preds) - 1

  test_df <- data.frame(
    predicted_te = test_preds,
    observed_y = y_test,
    treatment = tmt_test
  )

  return(test_df)
}
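The "un-flip" step can be motivated by a short derivation. This is a sketch under assumptions not stated in the issue: the outcome y is binary, treatment t is randomized independently of x with equal probability, and τ(x) denotes the difference in response probability between treatment and control.

```latex
\begin{aligned}
z &= \mathbb{1}[y = t], \qquad P(t = 1 \mid x) = \tfrac{1}{2} \\
P(z = 1 \mid x) &= P(y = 1, t = 1 \mid x) + P(y = 0, t = 0 \mid x) \\
&= \tfrac{1}{2}\, P(y = 1 \mid t = 1, x) + \tfrac{1}{2}\bigl(1 - P(y = 1 \mid t = 0, x)\bigr) \\
&= \tfrac{1}{2} + \tfrac{1}{2}\,\tau(x) \\
\tau(x) &= 2\, P(z = 1 \mid x) - 1
\end{aligned}
```

If the treatment and control groups are not balanced, the last line no longer holds as written and the flipped outcome would need reweighting.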
Implement a plot method for hete_model with an option to plot the uplift curve or the binned treatment effect. To do this, the original training data may need to be saved with the model object.
Add a method or two for generating synthetic data. Most papers, especially the ones proposing ensemble methods, include a simulation study that evaluates the algorithm's performance in cases where the researcher actually knows the true treatment effect for each unit in the training data.
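One possible shape for such a generator is sketched below. The function name `sim_hete_data`, its arguments, and the particular effect function are all assumptions made for illustration; none of them come from the package.

```r
# Hypothetical helper, not part of hete: simulate a randomized experiment
# where the true unit-level treatment effect is known.
sim_hete_data <- function(n = 1000, p = 5, seed = 1) {
  stopifnot(p >= 3)
  set.seed(seed)

  # independent standard normal covariates named x1, ..., xp
  x <- matrix(stats::rnorm(n * p), nrow = n, ncol = p,
              dimnames = list(NULL, paste0("x", seq_len(p))))

  # randomized treatment assignment with probability 0.5
  tmt <- stats::rbinom(n, size = 1, prob = 0.5)

  # assumed effect function: varies with the first two covariates
  true_te <- 1 + 2 * x[, 1] * (x[, 2] > 0)

  # outcome = baseline (x3) + effect for treated units + noise
  y <- x[, 3] + tmt * true_te + stats::rnorm(n)

  data.frame(x, y = y, tmt = tmt, true_te = true_te)
}

d <- sim_hete_data(n = 200)
head(d)
```

Because `true_te` is returned alongside the observed data, predicted effects from any estimator can be compared against it directly.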
in uplift.R:
random_lift <- ate * frac
# we want to order the scores from highest to lowest
qts <- stats::quantile(pred_te, probs = rev(frac))
model_lift <- purrr::map_dbl(qts, model_lift, y = y, tmt = tmt, pred_te = pred_te)
# the first one must be 0
model_lift[1] <- 0
We should also multiply model_lift by the population fraction.
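A minimal sketch of the proposed change, using base R only. The helper `lift_at` is a hypothetical stand-in for the per-threshold function in uplift.R (which is not shown in the issue), and the simulated data is made up for the example.

```r
# Hypothetical stand-in for the per-threshold helper in uplift.R:
# difference in mean outcome among units scored at or above the threshold.
lift_at <- function(threshold, y, tmt, pred_te) {
  top <- pred_te >= threshold
  mean(y[top & tmt == 1]) - mean(y[top & tmt == 0])
}

# made-up data: the true effect equals the predicted score
set.seed(42)
n <- 2000
pred_te <- stats::runif(n)
tmt <- stats::rbinom(n, 1, 0.5)
y <- tmt * pred_te + stats::rnorm(n, sd = 0.1)

frac <- seq(0, 1, by = 0.1)
ate <- mean(y[tmt == 1]) - mean(y[tmt == 0])
random_lift <- ate * frac

# we want to order the scores from highest to lowest
qts <- stats::quantile(pred_te, probs = rev(frac))
model_lift <- vapply(qts, lift_at, numeric(1),
                     y = y, tmt = tmt, pred_te = pred_te)

# proposed change: scale by the population fraction, as for random_lift
model_lift <- model_lift * frac

# the first one must be 0
model_lift[1] <- 0
```

With this scaling, both curves are on the same footing and meet at the full population, where `model_lift` equals `ate`.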
This is the typical behavior of models in R.
This model fits a total of four models. We currently accept a single base estimator and use it for all four steps. One benefit mentioned in the paper is the ability to use a different model for each step, for example a more flexible model for the treatment condition when it has more units.
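One way the interface could accept a separate estimator per step is sketched below. Everything here is an assumption for illustration: `x_learner_sketch`, `fit_lm`, and the four `fit_*` arguments are hypothetical names, the four-step layout follows the usual X-learner description (two outcome models, then two models of imputed effects), and the final weighting uses the overall treated fraction in place of a fitted propensity model.

```r
# Hypothetical default estimator: any function(x, y) returning an object
# with a predict() method that accepts a data frame would do.
fit_lm <- function(x, y) stats::lm(y ~ ., data = cbind(x, y = y))

# Hypothetical sketch, not the hete API.
x_learner_sketch <- function(x, y, tmt,
                             fit_y_tmt = fit_lm, fit_y_ctl = fit_lm,
                             fit_te_tmt = fit_lm, fit_te_ctl = fit_lm) {
  x <- as.data.frame(x)
  is_tmt <- tmt == 1
  x_tmt <- x[is_tmt, , drop = FALSE]
  x_ctl <- x[!is_tmt, , drop = FALSE]

  # step 1: outcome models for the treatment and control groups
  m_tmt <- fit_y_tmt(x_tmt, y[is_tmt])
  m_ctl <- fit_y_ctl(x_ctl, y[!is_tmt])

  # step 2: impute unit-level effects, then model them in each group
  d_tmt <- y[is_tmt] - stats::predict(m_ctl, x_tmt)
  d_ctl <- stats::predict(m_tmt, x_ctl) - y[!is_tmt]
  te_tmt <- fit_te_tmt(x_tmt, d_tmt)
  te_ctl <- fit_te_ctl(x_ctl, d_ctl)

  # combine the two effect models, weighting by the treated fraction
  g <- mean(is_tmt)
  function(newdata) {
    newdata <- as.data.frame(newdata)
    g * stats::predict(te_ctl, newdata) +
      (1 - g) * stats::predict(te_tmt, newdata)
  }
}

# made-up data with a constant treatment effect of 2
set.seed(1)
n <- 500
x <- data.frame(x1 = stats::rnorm(n), x2 = stats::rnorm(n))
tmt <- stats::rbinom(n, 1, 0.5)
y <- x$x1 + 2 * tmt + stats::rnorm(n, sd = 0.1)

predict_te <- x_learner_sketch(x, y, tmt)
mean(predict_te(x))
```

Passing a different function for, say, `fit_y_tmt` swaps the estimator for that step only, while the defaults preserve the current single-estimator behavior.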
https://github.com/tidyverse/broom
tidy: return the uplift curve
glance: return the predicted ate and the q score
augment: add .pred_te column to dataframe
Use consistent names and abbreviations: tmt for treatment, ctl for control, te for treatment effect, est for estimator.
The examples for hete_single have the incorrect parameter order for uplift; it should be uplift(outcome, treatment, predicted).
For some methods, such as hete_split and hete_single, the type of outcome does not matter much; the behavior is determined by the model supplied by the user. hete_x, when used for a binary outcome, needs to be provided with two binary models and two continuous models. The first step models the response in the treatment and control groups; the second step models the treatment effect in the two groups. The behavior of hete_tot is also different between the two tasks.