koalaverse / sure Goto Github PK
View Code? Open in Web Editor NEWSurrogate residuals for cumulative link and general regression models in R
Home Page: https://koalaverse.github.io/sure/index.html
Surrogate residuals for cumulative link and general regression models in R
Home Page: https://koalaverse.github.io/sure/index.html
Bring in @Andy-McCarthy's plotting functions. Following base R and ggplot2 convention, these should look similar to the following:
library(MASS)
fit <- polr(y ~ x, data = d, method = "logistic")
plot(fit, which = 1:2) # base plotting method
autoplot(fit, which = "qq") # ggplot2 method
Idea is to make it easier for users to construct their own fitted-vs-residual plots. It's capitalizes to it does not interfere with R's built-in generic and the specific packages that have methods defined.
For example:
Fitted.polr <- function(object) {
object$lp - object$zeta[1L] # Xb - a1
}
For example, plot(fit.polr, which = c(1, 3))
This will avoid R CMD CHECK warnings about undefined globals.
Something like
autoplot(fit, what = c("qq", "fitted"), ncol = 2)
The easiest way seems to be storing the plots in a list and then passing them to the grobs
argument in grid.arrange
.
plist <- list(p1, p2)
grid.arrange(grobs = plist, ncol = ncol)
Add new data sets that follow examples 5-6 in Dungang and Zhang (2017).
Now using:
#' @keywords internal
getMeanResponse.polr <- function(object) {
# qfun <- getQuantileFunction(object)
# fv <- object$fitted.values[, 1L, drop = TRUE]
# fv[fv == 0] <- 1e-05
# -qfun(fv)
object$lp - object$zeta[1L]
}
Will need to add extensive tests: testthat
for typical unit tests, and longer tests stored in a slowtests
directory that checks the package against the examples from the JASA paper.
@bradleyboehmke are you familiar with Travis CI? I can write the yaml
file to include, but you'll need to login with the AFIT-R GitHub account and flip the switch for the packages you want supported (ideally all of them). This will make sure that the packages are re-built and checked every time we push changes and make sure nothing is broken. We should also do the same with codecov.
Would it be possible to add support for ordinal models fit with the bayesian BRMS package:
# List required packages
Pkgs <- c("tidyverse", "ordinal", "brms", "rstan")
# Load packages ====
lapply(Pkgs, require, c = T)
# Load data
data <- ordinal::wine
# Model formula
fmla <- bf(Rating ~ temp + contact +temp:contact + 1|judge) )
# Compile and fit model
mod <- brm(fmla,
data = data,
family = cumulative("probit"),
inits = 0,
iter = 2000,
warmup = 1000,
chains = 2,
cores = 2,
sample_prior = TRUE,
save_all_pars = TRUE)
)
Paper was accepted, so need to make sure there are no breaking changes!
Should be simple once we write extractors for the fitted probabilities:
$$
R = S - E(S|X) = S - \sum_{i = 1}^J\left(j + 0.5\right)p\left(Y = j | X\right)
$$
autoplot
is not supporting object types. Need to change auto.default
to individual object methods (i.e. autoplot.polr
) or find another, more efficient approach.
Jittering (at least on the response scale) does show some promise for multinomial fits:
library(nnet)
data(df1, package = "sure")
fit <- multinom(y ~ x, data = df1)
# fit <- multinom(y ~ x + I(x ^ 2), data = df1)
y <- as.integer(df1$y)
s <- runif(length(y), min = y, max = y + 1)
prob <- fit$fitted.values
j <- seq_len(ncol(prob)) # j = 1, 2, ..., J
jmat <- matrix(rep(j, times = nrow(prob)), ncol = ncol(prob),
byrow = TRUE) # 1, 2, ..., J; 1, 2, ..., J; ...
es <- rowSums((jmat + 0.5) * prob)
r <- s - es
scatter.smooth(df1$x, r, lpars = list(lwd = 2, col = "red"))
Should make use of the naresid
method. Something like:
if(!is.null(object$na.action)) {
res <- naresid(object$na.action, res)
}
Incorporate bulk of R Journal paper as a vignette
Seems that the surrogate response values can be used to develop a coefficient of determinations (i.e., R-squared) type statistic!
Will need to learn all the properties of these residuals. The residual-by-covariate plots don't look right for the quadratic example.
The autoplot
function doesn't work for VGAM objects (i.e., vglm and vgam fits).
Will need to use Rcpp
or vectorize the truncated distribution functions; the latter might be the simpler approach.
For example, getSurrogate
should be called within getResiduals
, rather than both duplicating the same code.
Add support for logistic, probit, loglog, cloglog, and cauchit link functions. Will need to import gumbel distribution functions from MASS
, ordinal
or VGAM
.
Write methods (i.e., "clm"
, "polr"
, and "vglm"
) for broom
s generic augment
function to add residuals to original data.
Dear prof. Greenwell,
I am trying to use “sure” package to make diagnostics on a logit ordinal model generated with “svyolr” function (“survey” R package”), in particular i am trying to use “resids” function, but i get the error “Error in eval(predvars, data, env) : object "MR" not found”, where “MR” is the response variable (ordered factor) of the svyolr model.
I tryed to assign the class “polr” to the svyolr model (that is based on polr function) but the error persists.
I apologize if the question is trivial but I can’t get rid of the error.
Yours sincerely,
Please find below an example.
Fabio Maria Colombo
library(sure)
library(polr)
library(survey)
#Simulated dataset
RM=ordered(c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
3, 3, 3, 3, 4, 4, 4, 3, 3, 4, 4, 4, 4))
data=data.frame(age=rnorm(mean=7,sd=2,49),
weight=rnorm(mean=10,sd=3,49),
nose.length=rnorm(49,mean=3,sd=1.5),
head.length=rnorm(49,mean=8,sd=2),
w=abs(1+trunc(rnorm(mean=2, sd=2,49))))
#survey design
svydesign<-svydesign(id=~1,data=data,weights=~w)
#svyolr model
model=svyolr(RM~nose.length+head.length,
design=svydesign
)
#resids(sure) function
resids(model)
class(model)='polr' #assigns the class "polr" to model
resids(model)
#Error in eval(predvars, data, env) : object "nose.length" not found
####################################################################
heart <- data.frame(
ck = 0:11*40 + 20,
ha = c(2, 13, 30, 30, 21, 19, 18, 13, 19, 15, 7, 8),
ok = c(88, 26, 8, 5, 0, 1, 1, 1, 1, 0, 0, 0)
)
mod.0 <- glm(cbind(ha,ok) ~ ck, family = binomial(link = "logit"),
data = heart)
sure::resids(mod.0)
Something like
getFittedProbs <- function(object) {
UseMethod("getFittedProbs")
}
I found that an aliased parameter will throw an error while residual/surrogate calculation. I tried:
library(ordinal)
data(wine)
fmm1 <- clm(rating ~ 1 + judge + temp, data = wine)
fmm2 <- clm(rating ~ 1 + judge + bottle, data = wine)
fmm3 <- clm(rating ~ 1 + judge + temp + bottle, data = wine)
fmm4 <- clm(rating ~ 1 + judge + temp + contact, data = wine)
sure::resids(fmm3)
sure::surrogate(fmm3)
Only fmm3 produces the error:
Error in X[, -1L, drop = FALSE] %*% object$beta : non-conformable arguments
I could bypass the problem as following: I changed the code in generate_residuals from
drop(X[, -1L, drop = FALSE] %*% object$beta - object$alpha[1L])
to
drop(X[, -1L, drop = FALSE] %*% object$beta[!is.na(object$beta)] - object$alpha[1L])
This way I get residual values. I don't know if this is legit or if it's better to throw an error saying 'there was one unidentified parameter (aliased) so residuals are not valid'. Can we use the residuals even if we drop this value?
As far as I understood your code you intended to just drop the value in
if(sum(object$aliased$beta) > 0) { X <- X[, !c(FALSE, object$aliased$beta), drop = FALSE] }
Sincerely Simon
Package name is taken on CRAN.
Will also need to eventually update the link and reference to Dungang's JASA paper.
Jittering on the response scale (jitter.scale = "response"
)seems to work fine, but seems problematic on the probability scale (jitter.scale = "probability"
).
rms
has the lrm
function for fitting proportional odds ordinal logistic regression models.
Switch to ggplot2
for "gof"
objects for consistency.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.