
ddml's Issues

Saved ATE/ATET/ATEU results

All estimation results are saved in associative arrays. These are currently uniquely identified by the model name (the Mata object), the specification (number, "ss" or "mse") and the resample (number, "mn" or "md").

This isn't enough to uniquely identify the results for the interactive model because there are three flavours: ATE, ATET and ATEU. Current behaviour is to save the last one estimated, so if you e.g. estimate ATE and then ATET, the saved ATE will be overwritten.

May need to add an additional associative array key to deal with this.
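The overwrite problem and the proposed fix can be illustrated with a minimal sketch. It uses a Python dictionary as a stand-in for the Mata associative array; the function name `save_result`, the key layout and the numbers are hypothetical and are not taken from the ddml source.

```python
# Hypothetical stand-in for the Mata associative array that stores results.
def save_result(store, model, spec, resample, estimate, estimand=None):
    """Store an estimate; the estimand joins the key only when supplied."""
    if estimand is None:
        key = (model, spec, resample)            # current three-part key
    else:
        key = (model, spec, resample, estimand)  # proposed four-part key
    store[key] = estimate
    return key

# Three-part key: the ATET result silently replaces the ATE result.
old_store = {}
save_result(old_store, "m0", "ss", "mn", {"estimand": "ATE", "b": 1.0})
save_result(old_store, "m0", "ss", "mn", {"estimand": "ATET", "b": 2.0})

# Four-part key: both results survive side by side.
new_store = {}
save_result(new_store, "m0", "ss", "mn", {"b": 1.0}, estimand="ATE")
save_result(new_store, "m0", "ss", "mn", {"b": 2.0}, estimand="ATET")

print(len(old_store), len(new_store))
```

The values of `b` are placeholders, not estimates from any real model.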

Flexible IV output

Estimation output has the Y learner and D learner but is missing the DH learner. From the help file:

Min MSE DDML model, specification 7
y-E[y|X]  = Y2_pystacked_1                         Number of obs   =      2217
D-E[D|X,Z]= Dhat_pystacked_1
------------------------------------------------------------------------------
       share |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       price |  -.0444965   .0046333    -9.60   0.000    -.0535775   -.0354154
------------------------------------------------------------------------------

Error: Cross-fitting fold 1 RMSPE_cv not found

The example code provided in the "Interactive model--ATE and ATET estimation" section of the ddml help file contains a mistake.

This is the example code:
```
webuse cattaneo2, clear

global Y bweight
global D mbsmoke
global X prenatal1 mmarried fbaby mage medu
set seed 42

ddml init interactive, kfolds(5) reps(5)

ddml E[Y|X,D]: pystacked $Y $X, type(reg) methods(ols gradboost)
ddml E[D|X]: pystacked $D $X, type(class) methods(logit gradboost)

ddml crossfit

ddml estimate
ddml estimate, atet
```

Running this code produces a "Cross-fitting fold 1 RMSPE_cv not found" error, probably because of a problem with how the ols and logit methods are called.

Too many combos

The number of combinations of learners can explode, especially when more than 2 conditional expectations are being estimated.

We should probably have some kind of "nocombos" or "ssonly" option so that the only results that are reported are the shortstack results.

Might want to add a warning if the number of combinations is large, or maybe have the number of combinations reported by ddml describe, or maybe report it as part of the crossfitting output.
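To make the growth concrete, here is a minimal sketch of the count. It assumes that every learner for one conditional expectation is combined with every learner for the others, so the total is the product of the learner counts. The learner names, the helper `count_combos` and the warning threshold are hypothetical.

```python
from itertools import product
from math import prod

def count_combos(learners_per_equation):
    """Number of specifications = product of learner counts per equation."""
    return prod(len(names) for names in learners_per_equation.values())

# Hypothetical setup: three conditional expectations, three learners each.
learners = {
    "E[Y|X]": ["Y1", "Y2", "Y3"],
    "E[D|X]": ["D1", "D2", "D3"],
    "E[Z|X]": ["Z1", "Z2", "Z3"],
}

n_combos = count_combos(learners)
all_combos = list(product(*learners.values()))

WARN_THRESHOLD = 20  # hypothetical cut-off for printing a warning
if n_combos > WARN_THRESHOLD:
    print(f"warning - {n_combos} learner combinations will be estimated")
```

With two equations the same three learners give 9 combinations; adding a third equation raises the count to 27, which is the kind of jump a warning or a shortstack-only option would address.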

Renaming options

optimaliv to ivhd?
ddml sample to something else? (nb: both ddml init and ddml sample take reps(.) and kfolds(.) options)

Flexible IV + multiple Ds

The allcombos code for flexible IV doesn't work properly when there are multiple endogenous regressors.

The problem is that the D and D_h learners need to be paired together, but this isn't being respected. Say there are 2 D variables, D1 and D2. As written, the code can pair a learner for D1 with a D_h learner for D2.

The fix for this would be a bit messy, and multiple endogenous regressors is not a common specification, so for now I've added a check that disallows multiple D variables with flexible IV.
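The pairing issue can be sketched as follows. This is an illustration of the logic only, not the ddml code: the learner names and both helper functions are hypothetical, and the sketch assumes that a D_h learner is valid only for the D variable it was fitted for.

```python
from itertools import product

# Hypothetical learners, grouped by the D variable they belong to.
d_learners = {"D1": ["D1_a", "D1_b"], "D2": ["D2_a"]}
dh_learners = {"D1": ["D1_a_h", "D1_b_h"], "D2": ["D2_a_h"]}

def naive_pairs(d, dh):
    """Cross every D learner with every D_h learner, ignoring the variable."""
    all_d = [name for names in d.values() for name in names]
    all_dh = [name for names in dh.values() for name in names]
    return list(product(all_d, all_dh))

def paired_combos(d, dh):
    """Pair D and D_h learners within each variable, then combine variables."""
    per_variable = [list(product(d[var], dh[var])) for var in d]
    return list(product(*per_variable))

bad = naive_pairs(d_learners, dh_learners)
good = paired_combos(d_learners, dh_learners)

# The naive version produces mismatches such as a D1 learner with a D2 D_h learner.
print(("D1_a", "D2_a_h") in bad)
print(len(bad), len(good))
```

Each element of `good` holds one (D, D_h) pair per variable, so a D1 learner is never matched with a D_h learner for D2.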

How to interpret the coefficients of the ddml command output table?

Below is the code and output table for the analysis using the ddml command. What is the meaning of the coefficients of the variables in the output table? How can they be interpreted?

*varsVG9
local varsVG9 Per_2021_w co_live disease nhQ6a_1_c1 nhQ1a2 nhQ1a3 nhQ1a6 village_kind cjtotal2021_h cjedu_sec

global D1 nhkind2
global D2 cjgovin
global Y Targeting_Errors_Esub
global X `varsVG9'

set seed 44

ddml init partial, kfolds(4) 

local trees = 500

ddml E[Y|X]: pystacked $Y $X, type(class) method(rf) cmdopt1(n_estimators(`trees'))

ddml E[D|X]: pystacked $D1 $X, type(class) method(rf) cmdopt1(n_estimators(`trees'))
ddml E[D|X]: pystacked $D2 $X, type(reg) method(rf) cmdopt1(n_estimators(`trees'))


ddml crossfit
ddml estimate, robust
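------------------------------------------------------------------------------
             |               Robust
Targe~s_Esub | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     nhkind2 |  -.0457739   .0213447    -2.14   0.032    -.0876087   -.0039391
     cjgovin |  -.0809603   .0389666    -2.08   0.038    -.1573335   -.0045871
       _cons |  -.0108885   .0074808    -1.46   0.146    -.0255505    .0037736
------------------------------------------------------------------------------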
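As background for the question, the partially linear model behind `ddml init partial` is Y = theta*D + g(X) + U. The reported coefficients come from a final regression of the cross-fitted residual Y - E[Y|X] on the cross-fitted residuals D - E[D|X]. Each coefficient is therefore the estimated change in the outcome for a one-unit change in that D variable, after flexibly controlling for X. If Targeting_Errors_Esub is binary, as the type(class) learner suggests, this would read as a change in probability. The sketch below shows the mechanics on simulated data with a single D and a single X. It is not the ddml implementation: the data-generating process, the straight-line learner and all names are assumptions, and the final regression omits the constant.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope of y on a single regressor x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def crossfit_residuals(x, target, folds, k):
    """Residual target - E[target|x], predicted from a fit on the other folds."""
    residuals = [0.0] * len(x)
    for fold in range(k):
        train = [i for i in range(len(x)) if folds[i] != fold]
        intercept, slope = fit_line([x[i] for i in train],
                                    [target[i] for i in train])
        for i in range(len(x)):
            if folds[i] == fold:
                residuals[i] = target[i] - (intercept + slope * x[i])
    return residuals

# Simulated data (assumption): y = theta*d + 2*x + u, d = x + v.
random.seed(42)
n, k, theta = 2000, 4, -0.5
x = [random.gauss(0, 1) for _ in range(n)]
d = [xi + random.gauss(0, 1) for xi in x]
y = [theta * di + 2 * xi + random.gauss(0, 1) for di, xi in zip(d, x)]
folds = [i % k for i in range(n)]

y_res = crossfit_residuals(x, y, folds, k)
d_res = crossfit_residuals(x, d, folds, k)

# Final step: regress residualized y on residualized d (no constant).
theta_hat = sum(a * b for a, b in zip(d_res, y_res)) / sum(a * a for a in d_res)
print(round(theta_hat, 3))
```

With two treatment variables, as in the output above, the final step becomes a multiple regression on both residualized D variables, and each coefficient holds the other treatment fixed as well.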

Allow for non-binary D with `interactiveiv`

With a multivalued ordered treatment (D), the interactiveiv estimator can be interpreted as estimating the average causal response introduced in Angrist and Imbens (1995, JASA). This follows from arguments in Frölich (2007, JoE).

Currently, an error is thrown if D is not binary:

. ddml crossfit, shortstack finalest(nnls1) nostdstack
error - interactiveiv model supported only for D=0 or D=1

It would be better not to throw an error.

@thomaswiemann and I have discussed this for the ddml R package and the change was made a couple of months ago and uploaded to CRAN.
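To illustrate why a non-binary D poses no computational obstacle, here is a minimal sketch of the unconditional Wald ratio with a binary instrument and an ordered multivalued treatment. It ignores covariates and cross-fitting, so it is not the interactiveiv estimator itself. The data and the function name `wald_estimate` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def wald_estimate(y, d, z):
    """(E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0]) for a binary instrument z."""
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d1 = [di for di, zi in zip(d, z) if zi == 1]
    d0 = [di for di, zi in zip(d, z) if zi == 0]
    return (mean(y1) - mean(y0)) / (mean(d1) - mean(d0))

# Hypothetical data: D takes the values 0, 1, 2, 3 and Y rises by 2 per unit of D.
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 1, 0, 2, 1, 2, 1, 3]
y = [2.0 * di for di in d]

estimate = wald_estimate(y, d, z)
print(estimate)
```

Nothing in the ratio requires D to be 0 or 1. With a multivalued D the result is read as a weighted average of per-unit causal responses rather than a single binary-treatment effect.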

how to predict in test partition

Hi, it is not clear to me how to predict in a test partition (svar==2).

Tried unsuccessfully:

. ddml init partial if svar ==1, kfolds(2)
warning - model m0 already exists
all existing model results and variables will
be dropped and model m0 will be re-initialized

. ddml E[Y|X]: pystacked $Y `covars', type(reg) method(rf)
Learner Y1_pystacked added successfully.

. ddml E[D|X]: pystacked $D `covars', type(reg) method(rf)
Learner D1_pystacked added successfully.

. ddml crossfit
Cross-fitting E[y|X] equation: sales
Cross-fitting fold 1 2 ...completed cross-fitting
Cross-fitting E[D|X] equation: price
Cross-fitting fold 1 2 ...completed cross-fitting

. predict double yhat if svar ==2
error: data in memory has changed since last -pystacked- call
you are not allowed to change data in memory between -pystacked- fit and -predict-
r(198);
