tlverse / tmle3shift

Targeted Learning of the Causal Effects of Stochastic Interventions

Home Page: https://tlverse.org/tmle3shift

License: GNU General Public License v3.0

Makefile 1.02% R 93.62% TeX 5.36%
targeted-learning causal-inference machine-learning stochastic-interventions treatment-effects variable-importance marginal-structural-models

tmle3shift's Introduction

R/tmle3shift

Targeted Learning of the Causal Effects of Stochastic Interventions

Authors: Nima Hejazi, Jeremy Coyle, and Mark van der Laan


What’s tmle3shift?

tmle3shift is an adapter/extension R package in the tlverse ecosystem that exposes support for the estimation of a target parameter defined as the mean counterfactual outcome under a posited shift of the natural value of a continuous-valued intervention, using the formalism of stochastic treatment regimes. As an adapter package, tmle3shift builds upon the core tlverse grammar introduced by tmle3, a general framework that supports the implementation of a range of TMLE parameters through a unified interface. For a detailed description of the target parameter, TML estimator, and algorithm implemented in tmle3shift, the interested reader is invited to consult Díaz and van der Laan (2012) and Díaz and van der Laan (2018). For a general discussion of the framework of targeted minimum loss-based estimation and the role this methodology plays in statistical and causal inference, the canonical references are van der Laan and Rose (2011) and van der Laan and Rose (2018).

Building on the original work surrounding the TML estimator for the aforementioned target parameter, tmle3shift additionally implements a set of techniques for variable importance analysis, allowing for a sequence of mean counterfactual outcomes, estimated under a sequence of posited shifts, to be summarized via a working marginal structural model (MSM). The goal of this work is to build upon the tlverse framework and the estimation methodology implemented for a single mean counterfactual outcome in order to introduce an end-to-end methodology for variable importance analyses.
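To make the MSM summary concrete, the sketch below projects a short sequence of counterfactual mean estimates onto a working linear model in the shift value. It uses base R only, not the tmle3shift interface; the names delta_grid and psi_hat, and the numbers themselves, are made up for illustration and are not package output. The inference that a targeted estimator would attach to the MSM parameters is not reproduced here.

```r
# Hypothetical grid of posited shifts (illustrative values only)
delta_grid <- seq(-1, 1, by = 0.5)

# Made-up counterfactual mean estimates, one per shift in the grid;
# in practice these would come from an estimator, not be typed in by hand
psi_hat <- c(1.68, 1.86, 2.00, 2.16, 2.28)

# Working linear MSM: psi(delta) ~ beta0 + beta1 * delta, fit by least squares
msm_fit <- lm(psi_hat ~ delta_grid)
msm_coefs <- unname(coef(msm_fit))

# The slope summarizes how the counterfactual mean changes per unit of shift
msm_slope <- msm_coefs[2]
```

Under this working model, a slope near zero would suggest that shifting the intervention has little effect on the mean outcome, which is the sense in which the MSM serves as a variable importance summary.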


Installation

You can install the development version of tmle3shift from GitHub via remotes with

remotes::install_github("tlverse/tmle3shift")

Issues

If you encounter any bugs or have any specific feature requests, please file an issue.


Contributions

Contributions are very welcome. Interested contributors should consult our contribution guidelines prior to submitting a pull request.


Citation

After using the tmle3shift R package, please cite the following:

    @software{hejazi2021tmle3shift-rpkg,
      author = {Hejazi, Nima S and Coyle, Jeremy R and {van der Laan}, Mark
        J},
      title = {{tmle3shift}: {Targeted Learning} of the Causal Effects of
        Stochastic Interventions},
      year = {2021},
      howpublished = {\url{https://github.com/tlverse/tmle3shift}},
      note = {{R} package version 0.2.0},
      url = {https://doi.org/10.5281/zenodo.4603372},
      doi = {10.5281/zenodo.4603372}
    }

Related

  • R/txshift - An R package providing an independent implementation of the TML estimation procedure and statistical methodology as is made available here, without reliance on the tlverse grammar provided by tmle3.

Funding

The development of this software was supported in part through a grant from the National Institutes of Health: T32 LM012417-02.


License

The contents of this repository are distributed under the GPL-3 license. See file LICENSE for details.


References

Díaz, Iván, and Mark J van der Laan. 2012. “Population Intervention Causal Effects Based on Stochastic Interventions.” Biometrics 68 (2): 541–49.

Díaz, Iván, and Mark J van der Laan. 2018. “Stochastic Treatment Regimes.” In Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies, 167–80. Springer Science & Business Media.

van der Laan, Mark J, and Sherri Rose. 2011. Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Science & Business Media.

van der Laan, Mark J, and Sherri Rose. 2018. Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies. Springer Science & Business Media.

tmle3shift's People

Contributors

jeremyrcoyle, nhejazi

tmle3shift's Issues

Variable importance for nominal variables with few categories

In principle, variable importance assessment based on a grid of counterfactual shift values should be sound for nominal variables; in practice, however, such variables (even when converted via as.numeric) have few unique values. This triggers a downstream bug: sl3's Variable_Type classifies them as categorical rather than continuous. The bug is non-trivial to track down and can be distressing to users. A simple but naive workaround is to add mean-zero noise to such variables so that they appear to have more than 20 or so unique values, which is sufficient to trick sl3 into treating the variable as continuous. For example, the variable u below has only 4 (ordered) categories and will be recognized as categorical:

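# simulate a nominal variable with 4 (ordered) categories
n <- 10000
u_idx <- runif(n)
u <- rep(NA_character_, n)
u[u_idx <= 0.1] <- "A"
u[u_idx > 0.1 & u_idx <= 0.3] <- "B"
u[u_idx > 0.3 & u_idx <= 0.95] <- "C"
u[u_idx > 0.95] <- "D"
# convert to numeric codes 1 to 4; there are still only 4 unique values
u <- as.numeric(as.factor(u))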

To have it recognized as continuous, one could implement

u <- u + runif(n, -0.001, 0.001)

which yields many more unique values than the original u while leaving it unchanged in expectation.
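As a quick check of the workaround, the base R sketch below counts unique values before and after adding the noise and confirms that the original codes remain recoverable by rounding. It does not call sl3, so it says nothing about how sl3 actually classifies either vector; the seed and the variable names are arbitrary choices for illustration.

```r
set.seed(7491)  # arbitrary seed, only so the sketch is reproducible
n <- 10000

# numeric codes for a 4-level nominal variable, mirroring the example above
u <- sample(1:4, size = n, replace = TRUE, prob = c(0.1, 0.2, 0.65, 0.05))

# add mean-zero uniform noise on a very narrow interval
u_jittered <- u + runif(n, min = -0.001, max = 0.001)

# number of distinct values before and after jittering
n_unique_before <- length(unique(u))
n_unique_after <- length(unique(u_jittered))

# the noise is far smaller than 0.5, so rounding recovers the original codes
codes_recovered <- all(round(u_jittered) == u)

# largest perturbation applied to any single observation
max_abs_change <- max(abs(u_jittered - u))
```

Because the perturbation is bounded by 0.001, the jittered variable carries essentially the same information as the original codes, which is why the workaround is cheap even though it is naive.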

warnings in tests

I ran the test suite with options(warn = 2) and it generated some failures:

── Error (test-bound.R:61:3): bounds are being respected in submodel ───────────────────────────
Error in `max(Q_submodel)`: (converted from warning) no non-missing arguments to max; returning -Inf
Backtrace:
    ▆
 1. └─testthat::expect_lte(max(Q_submodel), 1 - Q_bound_level) at test-bound.R:61:2
 2.   └─testthat::quasi_label(enquo(object), label, arg = "object")
── Error (test-marginal_structural.R:171:1): (code run outside of `test_that()`) ───────────────────────────
Error in `learner$predict_fold(learner_task, fold_number)`: (converted from warning) Lrnr_density_semiparametric_NULL_NULL is not cv-aware: self$predict_fold reverts to self$predict
Backtrace:
     ▆
  1. └─R6 fit_tmle3(tmle_task, targeted_likelihood, msm, updater) at test-marginal_structural.R:171:0
  2.   └─tmle3 initialize(...)
  3.     └─private$.tmle_fit(max_it)
  4.       └─self$updater$update(self$likelihood, self$tmle_task)
  5.         └─base::lapply(...)
  6.           └─tmle3 FUN(X[[i]], ...)
  7.             └─tmle_param$estimates(tmle_task, update_fold)
  8.               └─self$clever_covariates(tmle_task, fold_number)
  9.                 └─self$observed_likelihood$get_likelihoods(...)
 10.                   └─self$get_likelihood(tmle_task, nodes[[1]], fold_number)
 11.                     └─self$initial_likelihood$get_likelihood(tmle_task, node, fold_number)
 12.                       └─likelihood_factor$get_likelihood(tmle_task, fold_number)
 13.                         └─self$get_density(tmle_task, fold_number)
 14.                           └─learner$predict_fold(learner_task, fold_number)
── Error (test-stratified_intevention.R:61:1): (code run outside of `test_that()`) ───────────────────────────
Error in `ED * private$.targeted_components`: (converted from warning) longer object length is not a multiple of shorter object length
Backtrace:
    ▆
 1. └─R6 fit_tmle3(tmle_task, targeted_likelihood, tmle_param, updater) at test-stratified_intevention.R:61:0
 2.   └─tmle3 initialize(...)
 3.     └─private$.tmle_fit(max_it)
 4.       └─self$updater$update(self$likelihood, self$tmle_task)

It might be best to design the tests or re-write the underlying code so as not to produce these warnings.
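For context on the first failure, the base R sketch below shows how options(warn = 2) turns the max() warning into an error, along with one possible guard. The empty vector is only a stand-in: whether Q_submodel is actually empty in test-bound.R is an assumption, not something established here, and safe_max is a hypothetical helper rather than part of the package.

```r
# An empty numeric vector stands in for the (assumed) problematic input
empty_preds <- numeric(0)

# At the default warning level, max() warns and returns -Inf
res_default <- suppressWarnings(max(empty_preds))

# Under options(warn = 2), the same call signals an error instead
old_opts <- options(warn = 2)
res_strict <- tryCatch(
  max(empty_preds),
  error = function(e) conditionMessage(e)
)
options(old_opts)  # restore the previous warning level

# Hypothetical guard: return NA for empty input rather than warning
safe_max <- function(x) {
  if (length(x) == 0L) NA_real_ else max(x)
}
```

A guard of this kind silences the warning but may hide a real problem, so checking why the input is empty is likely the better first step.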

Implementing shift guards that always move intervention

There is an existing proposal for a stochastic intervention that moves the natural value of the treatment as far as the observed data allow. Implementing such a shift requires evaluating the ratio of the post-intervention treatment density to the empirical treatment density and identifying the maximum (in magnitude) shift for each stratum defined by the baseline covariates. An initial implementation of this is available on this branch, but its efficiency needs improvement.

Compatible with tmle3mediate?

Hello @nhejazi, I'm new to TMLE and attempting mediation analysis with continuous data. I came across your packages, tmle3mediate and tmle3shift, and believe they could be valuable for my work. However, I've encountered an issue with their support for mediators or continuous data, as they require generating a Spec term for each. It seems that I'm unable to include both Specs in a single estimation task. If this is correct, do you have any guidance on capturing causal effects in such scenarios?

Comparing two stochastic interventions

It would be useful to be able to compare the mean counterfactual outcomes under two posited values of a stochastic intervention. This should be a fairly simple application of the delta method, similar to how the ATE is computed via Param_delta from two treatment-specific means.
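A minimal sketch of such a contrast, in base R, is given below. It does not call tmle3 or Param_delta; delta_contrast is a hypothetical helper, and the simulated vectors stand in for the point estimates and estimated influence function values that a fitted estimator would supply for each of the two shifts.

```r
# Hypothetical helper: delta-method inference for the contrast psi_b - psi_a,
# given point estimates and per-observation influence function estimates
delta_contrast <- function(psi_a, psi_b, eif_a, eif_b, level = 0.95) {
  stopifnot(length(eif_a) == length(eif_b))
  n <- length(eif_a)
  psi_diff <- psi_b - psi_a
  # the gradient of f(a, b) = b - a is (-1, 1), so the influence function
  # of the contrast is the difference of the two influence functions
  eif_diff <- eif_b - eif_a
  se <- sqrt(var(eif_diff) / n)
  z <- qnorm(1 - (1 - level) / 2)
  c(estimate = psi_diff, se = se,
    lower = psi_diff - z * se, upper = psi_diff + z * se)
}

# Simulated stand-ins (not package output): paired outcomes whose sample
# means play the role of the two counterfactual mean estimates
set.seed(2718)
n <- 500
y_a <- rnorm(n, mean = 1)
y_b <- y_a + rnorm(n, mean = 0.5, sd = 0.5)

out <- delta_contrast(
  psi_a = mean(y_a), psi_b = mean(y_b),
  eif_a = y_a - mean(y_a), eif_b = y_b - mean(y_b)
)
```

Working with the difference of influence functions, rather than combining two separate standard errors, accounts for the correlation between the two estimates, which are computed on the same observations.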

shift functions that respect bounds

  • new shift-guard function on the add-shift-guard branch finds bounds for the shifted treatment with respect to baseline covariates W, as given in the relevant book chapter
  • two new shift functions, similar to shift_additive and shift_additive_inv, that work in a similar way but also respect the bounds provided by shift-guard
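The idea of a bound-respecting additive shift can be sketched in base R as follows. This is not the code on the add-shift-guard branch: shift_additive_guarded is a hypothetical name, the rule of leaving a value unshifted when the shift would cross the bound is one common convention that is assumed here, and the upper bounds are supplied by hand rather than derived from baseline covariates W.

```r
# Hypothetical helper: shift `a` by `delta`, but leave a value unshifted
# whenever the shifted value would exceed its upper bound
shift_additive_guarded <- function(a, delta, upper) {
  shifted <- a + delta
  ifelse(shifted <= upper, shifted, a)
}

# Toy treatment values (illustrative only)
a <- c(0.2, 1.5, 2.9, 3.4)

# Assumed per-observation upper bounds, e.g. one bound per covariate stratum
upper <- c(3, 3, 3, 4)

a_shifted <- shift_additive_guarded(a, delta = 0.5, upper = upper)
```

Here the third observation stays at its natural value because shifting it by 0.5 would exceed its bound of 3, while the other three observations are shifted.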

Loose test against classic implementation

The txshift package provides an implementation of this same estimation procedure in the case of a simple shift (i.e., shift_additive) without reliance on tmle3 machinery. The independent procedure is used as the basis for this test, which has been loosened as of f83eabf. This test was previously passing under the (original) more stringent criteria as of 355c29f. It seems rather unlikely that a change in this package would have caused this test to break (since no commits near or after 355c29f altered shift_additive); instead, it's likely this was caused by updates to dependencies, which do not currently run reverse dependency checks. This should be further investigated.
