josherrickson / pbph

R package implementing Peters-Belson with Prognostic Heterogeneity

Makefile 0.88% R 99.12%

pbph's Issues

Use chkDots

R 3.3 introduced a new chkDots() function that will warn on arguments passed via ... if no such arguments are supported.

Adding this functionality is a bit safer, but will require either bumping the required R version to 3.3 or putting a version check around the call to chkDots().
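A minimal sketch of the guarded call, assuming a toy function (`scale_by`, with arguments `times` and `...`) that is purely illustrative and not part of pbph:

```r
# Hypothetical function used only to illustrate the pattern; not part of pbph.
scale_by <- function(x, times = 1, ...) {
  # chkDots() was added in R 3.3.0, so guard the call on older versions.
  if (getRversion() >= "3.3.0") {
    chkDots(...)
  }
  x * times
}

ok <- scale_by(2, times = 3)

# A misspelled argument falls into ... and should trigger a warning,
# which is captured here as a string.
w <- tryCatch(scale_by(2, tmes = 3),
              warning = function(w) conditionMessage(w))
```

With the guard in place the package would not strictly need R >= 3.3; users on older versions simply would not get the warning.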

Better disjoint CI handling

Right now, disjoint CIs are returned like finite CIs; the only indication that a CI is disjoint comes when returnShape = TRUE.
In the paper, we recommend treating disjoint CIs as infinite.

Possible approaches:

Make disjoint CIs Inf unless returnShape = TRUE is specified.
Make disjoint CIs Inf unless some new option (say, disjointInf = FALSE) is specified.
Display a warning or note when the CI is disjoint.
Modify the confint output to allow the disjoint form (probably not the best option, as it would differ drastically from other confint displays).
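To make the first and third options concrete, here is a rough sketch. The helper `finalize_ci` and its `disjoint` flag are hypothetical names; how pbph actually detects a disjoint region is not shown here.

```r
# Hypothetical post-processing step; not an existing pbph function.
# `bounds` is a length-2 numeric vector; `disjoint` says whether the
# confidence region was found to be disjoint.
finalize_ci <- function(bounds, disjoint, returnShape = FALSE) {
  if (disjoint && !returnShape) {
    warning("Confidence region is disjoint; reporting it as infinite.")
    bounds <- c(-Inf, Inf)
  }
  if (returnShape) {
    attr(bounds, "shape") <- if (disjoint) "disjoint" else "finite"
  }
  bounds
}

finite_ci <- finalize_ci(c(-1, 2), disjoint = FALSE)
inf_ci <- suppressWarnings(finalize_ci(c(-1, 2), disjoint = TRUE))
```

Combining the Inf replacement with a warning keeps the confint display in its usual two-column form while still flagging the problem.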

document, potentially refine how `epb::sandwich` sets `n`

With cluster random assignment, the effective sample size is more nearly determined by the number of clusters than by the number of elements within clusters. Statisticians who are aware of this should expect to see it reflected in the calculations. However, in R/sandwich.R as it stands now (f7fb70b on Mar 22), the clusters don't appear to be informing the scaling constants: n is being defined as n <- NROW(sandwich::estfun(x)), just as if there were no clusters.

Perhaps this is as it should be: assuming that bread() will have scaled its A matrices by the reciprocal of the number of elements, then this mischief is undone by premultiplying both the bread matrix and the $A^{-1} B A^{-t}$ sandwich itself by that same factor; for software purposes it's best to stick closely to the sandwich package's API. It would be helpful to the end user to leave a trail of breadcrumbs (so to speak) leading to this conclusion.
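As a possible breadcrumb, the cancellation can be checked numerically. This sketch uses only base R and a simulated linear model; it mimics the scaling described above (bread multiplied by n, meat divided by n, the product divided by n) and does not call pbph or the sandwich package.

```r
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
fit <- lm(y ~ x)

X <- model.matrix(fit)
psi <- X * residuals(fit)   # n x k matrix of estimating-function rows

# Unscaled pieces: A^{-1} B A^{-t}
A_inv <- solve(crossprod(X))
B <- crossprod(psi)
V_unscaled <- A_inv %*% B %*% t(A_inv)

# Scaled pieces: bread carries a factor n, meat a factor 1/n,
# and the final product is divided by n.
bread_n <- n * A_inv
meat_n <- B / n
V_scaled <- (bread_n %*% meat_n %*% t(bread_n)) / n

same <- all.equal(V_unscaled, V_scaled)
```

Whatever n is used, it drops out as long as the same value appears in all three places, which is why counting elements rather than clusters is harmless for this particular step.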

Relatedly, I don't imagine that the n/(n - k) degrees of freedom adjustment to the bread is the correct one when there are clusters and n refers to the number of elements. The simplest or most accepted cluster-aware alternative that presents itself in (your nonrandom sample of) the cluster-robust standard error literature would be an improvement here, if only for the purpose of emphasizing to the reader of the code that the code really is cluster-aware. (If there's no one proposal for simplest or most accepted cluster-aware d.f. adjustment, pick one.)
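For concreteness, one commonly cited cluster-aware factor is G/(G - 1) * (n - 1)/(n - k), with G the number of clusters. The sketch below contrasts it with n/(n - k); the function name `df_adjust` is made up for illustration, and this is only one candidate, not a recommendation from the literature as a whole.

```r
# Hypothetical helper: returns the multiplicative d.f. adjustment.
# n = number of elements, k = number of parameters,
# G = number of clusters (NULL means no clustering).
df_adjust <- function(n, k, G = NULL) {
  if (is.null(G)) {
    return(n / (n - k))
  }
  (G / (G - 1)) * ((n - 1) / (n - k))
}

unclustered <- df_adjust(n = 100, k = 4)
clustered <- df_adjust(n = 100, k = 4, G = 10)
```

With few clusters the G/(G - 1) term dominates, which makes it visible in the code that the number of clusters is driving the correction.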

prevent infinite CIs (a conjecture)

To estimate the SE of $\hat\eta$ we evaluate $B(\lambda)^{-1}$ at $\tilde{\lambda} = (\hat{\beta}_c, \tilde{\tau}_0, \eta_0)$, with no restriction placed on the restricted estimator $\tilde{\tau}_0$ of $\tau$. I conjecture that if $\tilde{\tau}_0$ were constrained to fall in an interval $I$, then the CI would cease to be infinite, provided only that the interval were finite and pre-determined.

In practice one will have estimated $\tau$ and figured a confidence interval for it before one gets around to doing anything with $\eta$. If one wants to report a 95% CI for $\eta$, then one might start by retrieving the previously determined 99% CI for $\tau$. One then sets $I$ to that interval, in the process "spending" 1% of one's 5% testing budget for hypotheses about $\eta$. In order to wind up with a 95% CI for $\eta$, one would test each $H: \eta = \eta_0$ at level .04, not .05. If the conjecture is correct, one never gets an infinite CI.
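The budget arithmetic in that paragraph, written out (variable names are illustrative only):

```r
alpha_total <- 0.05                  # overall error budget for the CI on eta
alpha_tau <- 0.01                    # portion spent on the interval I for tau
alpha_eta <- alpha_total - alpha_tau # level for each test of H: eta = eta_0

tau_ci_level <- 1 - alpha_tau        # confidence level of the CI used as I

c(tau_ci_level = tau_ci_level, eta_test_level = alpha_eta)
```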

bring `sandwich`, `meat` into RItools?

@josherrickson , could I interest you in bringing your cluster-aware sandwich functions into RItools? We could make use of them for various purposes over there.

A couple things I'd suggest doing in the process:

How about enabling the adjust= argument with a non-null cluster= argument also? (Maybe while changing adjust= to default to TRUE.)
The adjustment factor is non-obvious. Is there a citation you could put for it? If so I'd recommend adding it to the docs for meat.
I'm a little uncomfortable with exporting functions bearing the same name as functions in the sandwich package itself. Perhaps the package could export aliases of the two functions, e.g. "sandwich_clus" & "meat_clus", or "sandwichc"/"meatc"?
Add yourself as author of the two functions.

consider using clubSandwich

Issue #7 brought up the existence of clubSandwich. Potentially implement in the future. For the record, type = "CR1S" is equivalent to the adjustment pbph currently makes.

Pros:

Much further developed
Implements multiple adjustments

Cons:

pbph requires creating bread and meat separately; it's not immediately obvious whether clubSandwich allows this. Look into whether that could be worked around (I believe we'd always need B22, but maybe it's worth not using sandwich or clubSandwich for that piece if it eases other calculations?). Edit 9/22/16 - Issue jepusto/clubSandwich#9 brought this to the authors' attention and it was addressed.
~~Almost non-existent documentation (as of 7/18/16)~~ Edit 9/22/16 - Documentation is much improved.
~~Not on CRAN (as of 7/18/16) (not an issue now, but might be if this gets pushed to CRAN)~~ Edit 9/22/16 - on CRAN.

If it is decided not to lean on clubSandwich, then cleaning up the adjust argument is needed (See #7 for discussion).
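A sketch of what leaning on clubSandwich might look like for a plain lm fit, next to a hand-rolled version for comparison. The manual formula assumes that "CR1S" scales the cluster-robust sandwich by G(n - 1) / ((G - 1)(n - k)); that reading of the clubSandwich documentation is an assumption here, and the simulated data are made up. The clubSandwich call is skipped if the package is not installed.

```r
set.seed(2)
G <- 12                              # number of clusters
m <- 5                               # elements per cluster
cl <- rep(seq_len(G), each = m)
x <- rnorm(G * m)
y <- 1 + 0.5 * x + rnorm(G)[cl] + rnorm(G * m)
dat <- data.frame(y, x, cl)
fit <- lm(y ~ x, data = dat)

X <- model.matrix(fit)
psi <- X * residuals(fit)            # per-element estimating-function rows
n <- nrow(X)
k <- ncol(X)

# Manual cluster-robust sandwich: sum psi within clusters before crossprod.
A_inv <- solve(crossprod(X))
B_clus <- crossprod(rowsum(psi, dat$cl))
adj <- (G / (G - 1)) * ((n - 1) / (n - k))
V_manual <- adj * A_inv %*% B_clus %*% A_inv

if (requireNamespace("clubSandwich", quietly = TRUE)) {
  V_cs <- clubSandwich::vcovCR(fit, cluster = dat$cl, type = "CR1S")
  print(all.equal(V_manual, as.matrix(V_cs), check.attributes = FALSE))
}
```

This only covers a single-stage fit; the two-stage pieces pbph needs would still have to be assembled separately.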

Facilitate $B_{21}$ matrix construction via `type='gradient'` option for `optmatch::scores()`

Looking at the two-stage sandwich variance formulas as they apply in the case of glms and/or clusters, $B_{21}$ is the one component that wouldn't have appeared in either a first-stage only sandwich or a second-stage-only sandwich. So those other components will be much easier to extract from sandwich-type functions built with a single stage of estimation in mind. The one piece that calls for special casing specific to the problem of stitching the two stages together is $B_{21}$.

Looking at formulas for $B_{21}$ in the "pb2" paper draft, $\dot{h}(X_i \beta_c)$ comes up in several places. If we had a convenient way to produce the matrix $\dot{h}(\mathbf{X}\beta_c)$, the rest of constructing $B_{21}$ would be easy. One reason this isn't convenient in itself is that it calls for combining the data frame used by model 2 (in the epb context, a treatment group only data frame) with a fit extracted from model 1 (in the epb context, a control group only data frame). The optmatch::scores function was written with precisely this type of scenario in mind.

optmatch::scores wraps predict, whose glm method supports type options selected from c("link", "response", "terms"). This issue is suggesting an additional type, "gradient", that could be invoked to fill this role.

Supporting this functionality by way of an extension to optmatch::scores(), as opposed to functions developed more specifically for the epb problem, could come in handy in some other problems with stacked estimating equations: simple PB; estimators using various types of inverse probability weighting, with probabilities estimated from a superset of the analytic sample itself, and so on.
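A rough sketch of what a "gradient" type could compute for a glm, assuming $\dot{h}$ denotes the derivative of the inverse link evaluated at the linear predictor. The helper `predict_gradient` is hypothetical; nothing like it currently exists in predict() or optmatch::scores(). It relies on the `mu.eta` function that R family objects carry.

```r
# Hypothetical helper; "gradient" is not an existing predict()/scores() type.
# Returns d mu / d eta evaluated at the linear predictor for `newdata`.
predict_gradient <- function(fit, newdata) {
  eta <- predict(fit, newdata = newdata, type = "link")
  family(fit)$mu.eta(eta)
}

set.seed(3)
ctrl <- data.frame(x = rnorm(40))            # stand-in for the control group
ctrl$y <- rbinom(40, 1, plogis(0.3 + ctrl$x))
trt <- data.frame(x = rnorm(10))             # stand-in for the treatment group

mod1 <- glm(y ~ x, family = binomial, data = ctrl)
h_dot <- predict_gradient(mod1, trt)

# For the logit link the derivative equals p * (1 - p).
p <- predict(mod1, newdata = trt, type = "response")
check <- all.equal(unname(h_dot), unname(p * (1 - p)))
```

The point of the example is the data flow: the fit comes from one data frame and the gradient is evaluated on another, which is the scenario scores() was designed for.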

Re-work call hacking

I occasionally get the following error, mostly when running automated scripts (e.g., in a make call for a paper):

Error in eval(expr, envir, enclos) : object 'y_t - pred' not found

I'm unable to replicate in an interactive session so far.

A possible solution is to stop editing the call, and instead modify summary.pblm to stop relying so heavily on summary.lm.
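For reference, here is one general way a message of this form can arise in R: the text "y_t - pred" gets treated as a single symbol name rather than as an expression. Whether this is what actually happens inside pbph is not established; the snippet only reproduces the wording of the error with toy vectors.

```r
y_t <- c(3.1, 4.0, 5.2)
pred <- c(2.9, 4.1, 4.8)

# A symbol whose *name* is the whole string: the lookup fails.
bad <- as.name("y_t - pred")
msg <- tryCatch(eval(bad), error = function(e) conditionMessage(e))

# A proper expression: the subtraction is evaluated.
good <- quote(y_t - pred)
res <- eval(good)
```

If the failure does come from an edited call, computing the adjusted response up front and passing it as data would avoid depending on how the call is later deparsed or evaluated.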
