Plotting functions

Bring in @Andy-McCarthy's plotting functions. Following base R and ggplot2 convention, these should look similar to the following:

library(MASS)
fit <- polr(y ~ x, data = d, method = "logistic")
plot(fit, which = 1:2)  # base plotting method
autoplot(fit, which = "qq")  # ggplot2 method

Consider adding a new Fitted() generic

Idea is to make it easier for users to construct their own fitted-vs-residual plots. It's capitalizes to it does not interfere with R's built-in generic and the specific packages that have methods defined.

For example:

Fitted.polr <- function(object) {
  object$lp - object$zeta[1L]  # Xb - a1
}

Add base plotting function

For example, plot(fit.polr, which = c(1, 3))

Use NSE in autoplot code

This will avoid R CMD CHECK warnings about undefined globals.

Allow more than one plot

Something like

autoplot(fit, what = c("qq", "fitted"), ncol = 2)

The easiest way seems to be storing the plots in a list and then passing them to the grobs argument in grid.arrange.

plist <- list(p1, p2)
grid.arrange(grobs = plist, ncol = ncol)

New data sets

Add new data sets that follow examples 5-6 in Dungang and Zhang (2017).

Extracting mean response from "polr" objects

Now using:

#' @keywords internal
getMeanResponse.polr <- function(object) {
  # qfun <- getQuantileFunction(object)
  # fv <- object$fitted.values[, 1L, drop = TRUE]
  # fv[fv == 0] <- 1e-05
  # -qfun(fv)
  object$lp - object$zeta[1L]
}

Add tests

Will need to add extensive tests: testthat for typical unit tests, and longer tests stored in a slowtests directory that checks the package against the examples from the JASA paper.

Add Travis CI support

@bradleyboehmke are you familiar with Travis CI? I can write the yaml file to include, but you'll need to login with the AFIT-R GitHub account and flip the switch for the packages you want supported (ideally all of them). This will make sure that the packages are re-built and checked every time we push changes and make sure nothing is broken. We should also do the same with codecov.

Support for Ordinal models with with BRMS

Would it be possible to add support for ordinal models fit with the bayesian BRMS package:

# List required packages 
Pkgs <- c("tidyverse", "ordinal", "brms",  "rstan")

# Load packages ====
lapply(Pkgs, require, c = T)

# Load data
data <- ordinal::wine 

# Model formula
fmla <- bf(Rating ~ temp + contact +temp:contact + 1|judge) ) 

# Compile and fit model
mod <- brm(fmla,
            data = data,
            family = cumulative("probit"),
            inits = 0,
            iter = 2000,
            warmup = 1000,
            chains = 2,
            cores = 2,
            sample_prior = TRUE,
            save_all_pars = TRUE)
        )

Rerun code from The R Journal article

Paper was accepted, so need to make sure there are no breaking changes!

Add latent method for binary GLMs

Implement jittering for all supported CLMs

Should be simple once we write extractors for the fitted probabilities:
$$
R = S - E(S|X) = S - \sum_{i = 1}^J\left(j + 0.5\right)p\left(Y = j | X\right)
$$

Fix autoplot

autoplot is not supporting object types. Need to change auto.default to individual object methods (i.e. autoplot.polr) or find another, more efficient approach.

Multinomial fits

Jittering (at least on the response scale) does show some promise for multinomial fits:

library(nnet)
data(df1, package = "sure")
fit <- multinom(y ~ x, data = df1)
# fit <- multinom(y ~ x + I(x ^ 2), data = df1)
y <- as.integer(df1$y)
s <- runif(length(y), min = y, max = y + 1)
prob <- fit$fitted.values
j <- seq_len(ncol(prob))  # j = 1, 2, ..., J
jmat <- matrix(rep(j, times = nrow(prob)), ncol = ncol(prob),
               byrow = TRUE)  # 1, 2, ..., J; 1, 2, ..., J; ...
es <- rowSums((jmat + 0.5) * prob)
r <- s - es
scatter.smooth(df1$x, r, lpars = list(lwd = 2, col = "red"))

Missing values

Should make use of the naresid method. Something like:

if(!is.null(object$na.action)) {
  res <- naresid(object$na.action, res)
}

Create vignette

Incorporate bulk of R Journal paper as a vignette

Coefficient of determination?

Seems that the surrogate response values can be used to develop a coefficient of determinations (i.e., R-squared) type statistic!

Incorporate jittering technique

Will need to learn all the properties of these residuals. The residual-by-covariate plots don't look right for the quadratic example.

Please consider upgrading to R (≥ 4.0.0)

The autoplot function doesn't work for VGAM objects (i.e., vglm and vgam fits).

Faster bootstrap code

Will need to use Rcpp or vectorize the truncated distribution functions; the latter might be the simpler approach.

Code cleanup!

For example, getSurrogate should be called within getResiduals, rather than both duplicating the same code.

Broader link function support

Add support for logistic, probit, loglog, cloglog, and cauchit link functions. Will need to import gumbel distribution functions from MASS, ordinal or VGAM.

Add support for binary glm objects

Add broom support

Write methods (i.e., "clm", "polr", and "vglm") for brooms generic augment function to add residuals to original data.

Add Dungang Liu as an author

is it possible to apply "sure " R package to ordinal models "svyolr"?

Dear prof. Greenwell,

I am trying to use “sure” package to make diagnostics on a logit ordinal model generated with “svyolr” function (“survey” R package”), in particular i am trying to use “resids” function, but i get the error “Error in eval(predvars, data, env) : object "MR" not found”, where “MR” is the response variable (ordered factor) of the svyolr model.

I tryed to assign the class “polr” to the svyolr model (that is based on polr function) but the error persists.

I apologize if the question is trivial but I can’t get rid of the error.

Yours sincerely,

Please find below an example.

Fabio Maria Colombo

library(sure)
library(polr)
library(survey)

#Simulated dataset
RM=ordered(c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
3, 3, 3, 3, 4, 4, 4, 3, 3, 4, 4, 4, 4))
data=data.frame(age=rnorm(mean=7,sd=2,49),
weight=rnorm(mean=10,sd=3,49),
nose.length=rnorm(49,mean=3,sd=1.5),
head.length=rnorm(49,mean=8,sd=2),
w=abs(1+trunc(rnorm(mean=2, sd=2,49))))
#survey design
svydesign<-svydesign(id=~1,data=data,weights=~w)

#svyolr model
model=svyolr(RM~nose.length+head.length,
design=svydesign
)
#resids(sure) function
resids(model)

Error in resids.default(model) :

model should be of class "clm", "glm", "lrm", "orm", "polr", "vgam", or "vglm".

class(model)='polr' #assigns the class "polr" to model
resids(model)
#Error in eval(predvars, data, env) : object "nose.length" not found
####################################################################

Binary GLMs when response is not a factor

heart <- data.frame(
  ck = 0:11*40 + 20,
  ha = c(2, 13, 30, 30, 21, 19, 18, 13, 19, 15, 7, 8),
  ok = c(88, 26, 8, 5, 0, 1, 1, 1, 1, 0, 0, 0)
)
mod.0 <- glm(cbind(ha,ok) ~ ck, family = binomial(link = "logit"),
             data = heart)
sure::resids(mod.0)

Proper handling of weights

Write helper functions for extracting the fitted probabilities

Something like

getFittedProbs <- function(object) {
  UseMethod("getFittedProbs")
}

Residual generation fails when some parameter is aliased

I found that an aliased parameter will throw an error while residual/surrogate calculation. I tried:

library(ordinal)
data(wine)

fmm1 <- clm(rating ~ 1 + judge + temp, data = wine)
fmm2 <- clm(rating ~ 1 + judge + bottle, data = wine)
fmm3 <- clm(rating ~ 1 + judge + temp + bottle, data = wine)
fmm4 <- clm(rating ~ 1 + judge + temp + contact, data = wine)

sure::resids(fmm3)
sure::surrogate(fmm3)

Only fmm3 produces the error:
Error in X[, -1L, drop = FALSE] %*% object$beta : non-conformable arguments

I could bypass the problem as following: I changed the code in generate_residuals from
drop(X[, -1L, drop = FALSE] %*% object$beta - object$alpha[1L])
to
drop(X[, -1L, drop = FALSE] %*% object$beta[!is.na(object$beta)] - object$alpha[1L])

This way I get residual values. I don't know if this is legit or if it's better to throw an error saying 'there was one unidentified parameter (aliased) so residuals are not valid'. Can we use the residuals even if we drop this value?

As far as I understood your code you intended to just drop the value in
if(sum(object$aliased$beta) > 0) { X <- X[, !c(FALSE, object$aliased$beta), drop = FALSE] }

Sincerely Simon

koalaverse / sure Goto Github PK

sure's People

Contributors

Stargazers

Watchers

Forkers

sure's Issues

Error in resids.default(model) :

model should be of class "clm", "glm", "lrm", "orm", "polr", "vgam", or "vglm".

Recommend Projects

Recommend Topics

Recommend Org