Coder Social home page Coder Social logo

koalaverse / sure Goto Github PK

View Code? Open in Web Editor NEW
9.0 6.0 2.0 10.33 MB

Surrogate residuals for cumulative link and general regression models in R

Home Page: https://koalaverse.github.io/sure/index.html

R 100.00%
residuals diagnostics categorical-data ordinal-regression

sure's People

Contributors

bgreenwell avatar bradleyboehmke avatar brandongreenwell-8451 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

Forkers

ericagoto ugurdar

sure's Issues

Plotting functions

Bring in @Andy-McCarthy's plotting functions. Following base R and ggplot2 convention, these should look similar to the following:

library(MASS)
fit <- polr(y ~ x, data = d, method = "logistic")
plot(fit, which = 1:2)  # base plotting method
autoplot(fit, which = "qq")  # ggplot2 method

Consider adding a new Fitted() generic

Idea is to make it easier for users to construct their own fitted-vs-residual plots. It's capitalizes to it does not interfere with R's built-in generic and the specific packages that have methods defined.

For example:

Fitted.polr <- function(object) {
  object$lp - object$zeta[1L]  # Xb - a1
}

Allow more than one plot

Something like

autoplot(fit, what = c("qq", "fitted"), ncol = 2)

The easiest way seems to be storing the plots in a list and then passing them to the grobs argument in grid.arrange.

plist <- list(p1, p2)
grid.arrange(grobs = plist, ncol = ncol)

New data sets

Add new data sets that follow examples 5-6 in Dungang and Zhang (2017).

Extracting mean response from "polr" objects

Now using:

#' @keywords internal
getMeanResponse.polr <- function(object) {
  # qfun <- getQuantileFunction(object)
  # fv <- object$fitted.values[, 1L, drop = TRUE]
  # fv[fv == 0] <- 1e-05
  # -qfun(fv)
  object$lp - object$zeta[1L]
}

Add tests

Will need to add extensive tests: testthat for typical unit tests, and longer tests stored in a slowtests directory that checks the package against the examples from the JASA paper.

Add Travis CI support

@bradleyboehmke are you familiar with Travis CI? I can write the yaml file to include, but you'll need to login with the AFIT-R GitHub account and flip the switch for the packages you want supported (ideally all of them). This will make sure that the packages are re-built and checked every time we push changes and make sure nothing is broken. We should also do the same with codecov.

Support for Ordinal models with with BRMS

Would it be possible to add support for ordinal models fit with the bayesian BRMS package:

# List required packages 
Pkgs <- c("tidyverse", "ordinal", "brms",  "rstan")

# Load packages ====
lapply(Pkgs, require, c = T)

# Load data
data <- ordinal::wine 

# Model formula
fmla <- bf(Rating ~ temp + contact +temp:contact + 1|judge) ) 

# Compile and fit model
mod <- brm(fmla,
            data = data,
            family = cumulative("probit"),
            inits = 0,
            iter = 2000,
            warmup = 1000,
            chains = 2,
            cores = 2,
            sample_prior = TRUE,
            save_all_pars = TRUE)
        )

Fix autoplot

autoplot is not supporting object types. Need to change auto.default to individual object methods (i.e. autoplot.polr) or find another, more efficient approach.

Multinomial fits

Jittering (at least on the response scale) does show some promise for multinomial fits:

library(nnet)
data(df1, package = "sure")
fit <- multinom(y ~ x, data = df1)
# fit <- multinom(y ~ x + I(x ^ 2), data = df1)
y <- as.integer(df1$y)
s <- runif(length(y), min = y, max = y + 1)
prob <- fit$fitted.values
j <- seq_len(ncol(prob))  # j = 1, 2, ..., J
jmat <- matrix(rep(j, times = nrow(prob)), ncol = ncol(prob),
               byrow = TRUE)  # 1, 2, ..., J; 1, 2, ..., J; ...
es <- rowSums((jmat + 0.5) * prob)
r <- s - es
scatter.smooth(df1$x, r, lpars = list(lwd = 2, col = "red"))

Missing values

Should make use of the naresid method. Something like:

if(!is.null(object$na.action)) {
  res <- naresid(object$na.action, res)
}

Coefficient of determination?

Seems that the surrogate response values can be used to develop a coefficient of determinations (i.e., R-squared) type statistic!

Incorporate jittering technique

Will need to learn all the properties of these residuals. The residual-by-covariate plots don't look right for the quadratic example.

Faster bootstrap code

Will need to use Rcpp or vectorize the truncated distribution functions; the latter might be the simpler approach.

Code cleanup!

For example, getSurrogate should be called within getResiduals, rather than both duplicating the same code.

Broader link function support

Add support for logistic, probit, loglog, cloglog, and cauchit link functions. Will need to import gumbel distribution functions from MASS, ordinal or VGAM.

Add broom support

Write methods (i.e., "clm", "polr", and "vglm") for brooms generic augment function to add residuals to original data.

is it possible to apply "sure " R package to ordinal models "svyolr"?

Dear prof. Greenwell,

I am trying to use “sure” package to make diagnostics on a logit ordinal model generated with “svyolr” function (“survey” R package”), in particular i am trying to use “resids” function, but i get the error “Error in eval(predvars, data, env) : object "MR" not found”, where “MR” is the response variable (ordered factor) of the svyolr model.

I tryed to assign the class “polr” to the svyolr model (that is based on polr function) but the error persists.

I apologize if the question is trivial but I can’t get rid of the error.

Yours sincerely,

Please find below an example.

Fabio Maria Colombo


library(sure)
library(polr)
library(survey)

#Simulated dataset
RM=ordered(c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
3, 3, 3, 3, 4, 4, 4, 3, 3, 4, 4, 4, 4))
data=data.frame(age=rnorm(mean=7,sd=2,49),
weight=rnorm(mean=10,sd=3,49),
nose.length=rnorm(49,mean=3,sd=1.5),
head.length=rnorm(49,mean=8,sd=2),
w=abs(1+trunc(rnorm(mean=2, sd=2,49))))
#survey design
svydesign<-svydesign(id=~1,data=data,weights=~w)

#svyolr model
model=svyolr(RM~nose.length+head.length,
design=svydesign
)
#resids(sure) function
resids(model)

Error in resids.default(model) :

model should be of class "clm", "glm", "lrm", "orm", "polr", "vgam", or "vglm".

class(model)='polr' #assigns the class "polr" to model
resids(model)
#Error in eval(predvars, data, env) : object "nose.length" not found
####################################################################

Binary GLMs when response is not a factor

heart <- data.frame(
  ck = 0:11*40 + 20,
  ha = c(2, 13, 30, 30, 21, 19, 18, 13, 19, 15, 7, 8),
  ok = c(88, 26, 8, 5, 0, 1, 1, 1, 1, 0, 0, 0)
)
mod.0 <- glm(cbind(ha,ok) ~ ck, family = binomial(link = "logit"),
             data = heart)
sure::resids(mod.0)

Residual generation fails when some parameter is aliased

I found that an aliased parameter will throw an error while residual/surrogate calculation. I tried:

library(ordinal)
data(wine)

fmm1 <- clm(rating ~ 1 + judge + temp, data = wine)
fmm2 <- clm(rating ~ 1 + judge + bottle, data = wine)
fmm3 <- clm(rating ~ 1 + judge + temp + bottle, data = wine)
fmm4 <- clm(rating ~ 1 + judge + temp + contact, data = wine)

sure::resids(fmm3)
sure::surrogate(fmm3)

Only fmm3 produces the error:
Error in X[, -1L, drop = FALSE] %*% object$beta : non-conformable arguments

I could bypass the problem as following: I changed the code in generate_residuals from
drop(X[, -1L, drop = FALSE] %*% object$beta - object$alpha[1L])
to
drop(X[, -1L, drop = FALSE] %*% object$beta[!is.na(object$beta)] - object$alpha[1L])

This way I get residual values. I don't know if this is legit or if it's better to throw an error saying 'there was one unidentified parameter (aliased) so residuals are not valid'. Can we use the residuals even if we drop this value?

As far as I understood your code you intended to just drop the value in
if(sum(object$aliased$beta) > 0) { X <- X[, !c(FALSE, object$aliased$beta), drop = FALSE] }

Sincerely Simon

Jittering on the probability scale

Jittering on the response scale (jitter.scale = "response")seems to work fine, but seems problematic on the probability scale (jitter.scale = "probability").

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.