pbph's People

Contributors

josherrickson

Forkers

arthurshem86

pbph's Issues

Use chkDots

R 3.3 introduced a new chkDots() function that will warn on arguments passed via ... if no such arguments are supported.

Adding this check is a bit safer, but it requires either raising the package's minimum R version to 3.3 or wrapping the call to chkDots() in a version check.
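A minimal sketch of the guarded call, assuming a hypothetical function `summary_sketch()` (not part of pbph) that accepts `...` but supports no extra arguments:

```r
# Hypothetical function for illustration only; not part of pbph.
summary_sketch <- function(object, ...) {
  # chkDots() is available from R 3.3.0 on, so skip the check on older
  # versions instead of raising the package's minimum R version.
  if (getRversion() >= "3.3.0") {
    chkDots(...)
  }
  invisible(object)
}

summary_sketch(1)             # silent: nothing was passed via `...`
# summary_sketch(1, foo = 2)  # would warn that the extra argument is disregarded
```

The guard trades a little clutter for backward compatibility; if the minimum R version is raised instead, the `if` can be dropped.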

Better disjoint CI handling

Right now, disjoint CIs are returned in the same form as finite CIs. The only indication that a CI is disjoint appears when returnShape = TRUE.
In the paper, we recommend treating disjoint CIs as infinite.

Possible approaches:

  1. Make disjoint CIs Inf unless returnShape = TRUE is specified.
  2. Make disjoint CIs Inf unless some new option (disjointInf = FALSE, or similar) is specified.
  3. Display a warning or note when a CI is disjoint.
  4. Modify the confint output to allow the disjoint form (probably not the best option, as it would differ drastically from other confint displays).
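Approaches 2 and 3 could be sketched as a small post-processing step. Everything here is an assumption made for illustration: the helper name `widen_disjoint_ci()`, the `shape` labels, and the `disjointInf` argument are hypothetical and do not describe pbph's current interface.

```r
# Hypothetical helper; the `shape` labels are assumed, not taken from pbph.
widen_disjoint_ci <- function(ci, shape, disjointInf = TRUE) {
  stopifnot(length(ci) == 2)
  if (identical(shape, "disjoint")) {
    if (disjointInf) {
      # Approach 2: report the interval as infinite, as the paper recommends.
      ci <- c(-Inf, Inf)
    } else {
      # Approach 3: keep the bounds but flag them.
      warning("Confidence region is disjoint; the bounds are not a finite interval.")
    }
  }
  ci
}

widen_disjoint_ci(c(-1.2, 3.4), shape = "finite")    # bounds unchanged
widen_disjoint_ci(c(-1.2, 3.4), shape = "disjoint")  # replaced by -Inf, Inf
```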

document, potentially refine how `epb::sandwich` sets `n`

With cluster random assignment, the effective sample size is more nearly determined by the number of clusters than by the number of elements within clusters. Statisticians who are aware of this should expect to see it reflected in the calculations. However, in R/sandwich.R as it stands now (f7fb70b on Mar 22), the clusters don't appear to be informing the scaling constants: n is being defined as n <- NROW(sandwich::estfun(x)), just as if there were no clusters.

Perhaps this is as it should be: assuming that bread() will have scaled its A matrices by the reciprocal of the number of elements, then this mischief is undone by premultiplying both the bread matrix and the $A^{-1} B A^{-t}$ sandwich itself by that same factor; for software purposes it's best to stick closely to the sandwich package's API. It would be helpful to the end user to leave a trail of breadcrumbs (so to speak) leading to this conclusion.
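The cancellation can be checked numerically. The sketch below uses base R only and an ordinary lm fit on simulated data; it assumes the convention described above (bread scaled by n, meat by 1/n, and the product divided by n) and does not call the sandwich package or pbph.

```r
set.seed(1)
n <- 40
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
fit <- lm(y ~ x)

X   <- model.matrix(fit)
e   <- residuals(fit)
psi <- X * e                       # per-observation estimating functions

# Scaled pieces: the bread carries a factor n, the meat a factor 1/n.
bread_n  <- n * solve(crossprod(X))
meat_n   <- crossprod(psi) / n
V_scaled <- bread_n %*% meat_n %*% bread_n / n

# Same estimator written without any reference to n.
XtX_inv <- solve(crossprod(X))
V_plain <- XtX_inv %*% crossprod(psi) %*% XtX_inv

# The factors of n cancel, so the choice of n does not affect the result.
stopifnot(isTRUE(all.equal(V_scaled, V_plain)))
```

Because n enters once as a multiplier and once as a divisor on each side, defining it as the number of elements or the number of clusters leaves this unadjusted estimate unchanged; only a degrees-of-freedom adjustment would make the choice matter.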

Relatedly, I don't imagine that the n/(n - k) degrees of freedom adjustment to the bread is the correct one when there are clusters and n refers to the number of elements. The simplest or most accepted cluster-aware alternative that presents itself in (your nonrandom sample of) the cluster-robust standard error literature would be an improvement here, if only for the purpose of emphasizing to the reader of the code that the code really is cluster-aware. (If there's no one proposal for simplest or most accepted cluster-aware d.f. adjustment, pick one.)
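As one candidate, a widely used cluster-aware factor is G/(G - 1) * (n - 1)/(n - k), with G the number of clusters. The sketch below contrasts it with n/(n - k); the helper name `cluster_df_adjust()` is hypothetical, and this is offered as an illustration of one option, not as the adjustment pbph should necessarily adopt.

```r
# Hypothetical helper; one common choice of factor, not necessarily pbph's.
cluster_df_adjust <- function(n, k, cluster = NULL) {
  if (is.null(cluster)) {
    # No clusters: the usual n / (n - k) adjustment.
    return(n / (n - k))
  }
  G <- length(unique(cluster))     # number of clusters
  (G / (G - 1)) * ((n - 1) / (n - k))
}

cluster_df_adjust(n = 100, k = 3)                                   # 100/97
cluster_df_adjust(n = 100, k = 3, cluster = rep(1:10, each = 10))   # (10/9) * (99/97)
```

With few clusters the G/(G - 1) term dominates, which makes the dependence on the number of clusters visible to anyone reading the code.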

prevent infinite CIs (a conjecture)

To estimate the SE of $\hat\eta$ we evaluate $B(\lambda)^{-1}$ at $\tilde{\lambda} = (\hat{\beta}_c, \tilde{\tau}_0, \eta_0)$, with no restriction placed on the restricted estimator $\tilde{\tau}_0$ of $\tau$. I conjecture that if $\tilde{\tau}_0$ were constrained to fall in an interval $I$, then the CI would cease to be infinite, provided only that the interval were finite and pre-determined.

In practice one will have estimated $\tau$ and figured a confidence interval for it before one gets around to doing anything with $\eta$. If one wants to report a 95% CI for $\eta$, then one might start by retrieving the previously determined 99% CI for $\tau$. One then sets $I$ to that interval, in the process "spending" 1% of one's 5% testing budget for hypotheses about $\eta$. In order to wind up with a 95% CI for $\eta$, one would test each $H: \eta = \eta_0$ at level .04, not .05. If the conjecture is correct, one never gets an infinite CI.
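The accounting behind the .04 level can be written as a union bound. This is a sketch that assumes the conjecture holds and that the two error events are combined Bonferroni-style; $C_\tau$ and $C_\eta$ are notation introduced here for the 99% interval for $\tau$ (used as $I$) and the resulting interval for $\eta$.

$$
P(\eta \notin C_\eta) \;\le\; P(\tau \notin C_\tau) + P(\eta \notin C_\eta,\ \tau \in C_\tau) \;\le\; 0.01 + 0.04 \;=\; 0.05
$$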

bring `sandwich`, `meat` into RItools?

@josherrickson , could I interest you in bringing your cluster-aware sandwich functions into RItools? We could make use of them for various purposes over there.

A couple things I'd suggest doing in the process:

  1. How about enabling the adjust= argument with a non-null cluster= argument also? (Maybe while changing adjust= to default to TRUE.)
  2. The adjustment factor is non-obvious. Is there a citation you could give for it? If so, I'd recommend adding it to the docs for meat.
  3. I'm a little uncomfortable with exporting functions bearing the same names as functions in the sandwich package itself. Perhaps the package could export aliases of the two functions, e.g. "sandwich_clus" and "meat_clus", or "sandwichc"/"meatc"?
  4. Add yourself as author of the two functions.

consider using clubSandwich

Issue #7 brought up the existence of clubSandwich. Potentially implement in the future. For the record, type = "CR1S" is equivalent to the adjustment pbph currently makes.

Pros:

  • Much further developed
  • Implements multiple adjustments

Cons:

  • pbph requires creating bread and meat separately; it's not immediately obvious whether clubSandwich allows this. Look into whether that could be worked around (I believe we'd always need B22, but maybe it's worth not using sandwich or clubSandwich for that piece if it eases other calculations?). Edit 9/22/16 - Issue jepusto/clubSandwich#9 brought this to the authors' attention and it was addressed.
  • Almost non-existent documentation (as of 7/18/16) Edit 9/22/16 - Documentation is much improved.
  • Not on CRAN (as of 7/18/16) (not an issue now, but might be if this gets pushed to CRAN) Edit 9/22/16 - on CRAN.

If it is decided not to lean on clubSandwich, then cleaning up the adjust argument is needed (See #7 for discussion).

Facilitate $B_{21}$ matrix construction via `type='gradient'` option for `optmatch::scores()`

Looking at the two-stage sandwich variance formulas as they apply in the case of glms and/or clusters, $B_{21}$ is the one component that wouldn't have appeared in either a first-stage only sandwich or a second-stage-only sandwich. So those other components will be much easier to extract from sandwich-type functions built with a single stage of estimation in mind. The one piece that calls for special casing specific to the problem of stitching the two stages together is $B_{21}$.

Looking at formulas for $B_{21}$ in the "pb2" paper draft, $\dot{h}(X_i \beta_c)$ comes up in several places. If we had a convenient way to produce the matrix $\dot{h}(\mathbf{X}\beta_c)$, the rest of constructing $B_{21}$ would be easy. One reason this isn't convenient in itself is that it calls for combining the data frame used by model 2 (in the epb context, a treatment group only data frame) with a fit extracted from model 1 (in the epb context, a control group only data frame). The optmatch::scores function was written with precisely this type of scenario in mind.

optmatch::scores wraps predict, whose glm method supports type options selected from c("link", "response", "terms"). This issue suggests an additional type, "gradient", that could be invoked to fill this role.
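What a "gradient" type might compute can be sketched in base R, under the assumption that $h$ is the inverse link, so that $\dot{h}$ is the `mu.eta()` function of the fitted family. The helper `predict_gradient()` and the simulated data are hypothetical; this does not use or describe optmatch::scores() itself.

```r
# Hypothetical helper illustrating the proposed "gradient" type.
predict_gradient <- function(object, newdata) {
  # Linear predictor X %*% beta evaluated on the new data frame.
  eta <- predict(object, newdata = newdata, type = "link")
  # mu.eta() is the derivative of the inverse link with respect to eta.
  family(object)$mu.eta(eta)
}

set.seed(2)
dat <- data.frame(x = rnorm(60), z = rep(0:1, 30))
dat$y <- rbinom(60, 1, plogis(0.5 * dat$x))

# Model 1 is fit on controls only; the gradient is evaluated on the treated.
mod1 <- glm(y ~ x, family = binomial, data = dat[dat$z == 0, ])
grad <- predict_gradient(mod1, newdata = dat[dat$z == 1, ])

# For the logit link the derivative equals p * (1 - p), which gives a check.
p <- predict(mod1, newdata = dat[dat$z == 1, ], type = "response")
stopifnot(isTRUE(all.equal(unname(grad), unname(p * (1 - p)))))
```

The helper returns one value per row of `newdata`, mirroring how the other `type` options behave when a first-stage fit is applied to a second-stage data frame.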

Supporting this functionality by way of an extension to optmatch::scores(), as opposed to functions developed more specifically for the epb problem, could come in handy in some other problems with stacked estimating equations: simple PB; estimators using various types of inverse probability weighting, with probabilities estimated from a superset of the analytic sample itself, and so on.

Re-work call hacking

Occasionally getting the following error, mostly when running automated scripts (e.g. in a make call for a paper.)

Error in eval(expr, envir, enclos) : object 'y_t - pred' not found

I'm unable to replicate in an interactive session so far.

Possible solution is to stop editing the call, and instead modify summary.pblm to stop relying so heavily on summary.lm.
