Comments (8)

harrelfe avatar harrelfe commented on July 18, 2024

This is supposed to work. Either it is a bug or you haven't updated to the latest versions of Hmisc and rms on CRAN. Please provide the version numbers you are using and a minimal self-contained reproducible example.
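
One way to collect the requested version numbers is sketched below. The helper name `pkg_versions` is hypothetical (it is not part of Hmisc or rms); it relies only on base R's `requireNamespace()` and `utils::packageVersion()` and returns `NA` for packages that are not installed.

```r
# Hypothetical helper: report installed versions of the named packages.
pkg_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_  # package not installed
    }
  }, character(1))
}

# Versions of the packages discussed in this thread, plus the R version.
print(pkg_versions(c("Hmisc", "rms", "mice")))
print(R.version.string)
```

Pasting this output, or the output of `sessionInfo()`, into a report makes it clear which releases are involved.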

from rms.

tormodb avatar tormodb commented on July 18, 2024

Dear Prof. Harrell, thanks for your quick reply.

I am currently using version 4.4-1 of rms and version 3.17-1 of Hmisc, which are the latest versions according to CRAN. I am having trouble providing a reproducible example because I am not sure how to draw a sample of the (unpublished) data I am using from the mids object that mice created for the imputed data.

harrelfe avatar harrelfe commented on July 18, 2024

Don't take a sample. Generate simulated data using expand.grid, data.frame, rnorm, runif, etc. after set.seed(1) if using random numbers.
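
A minimal sketch of that advice, using only base R. The variable names (`sex`, `group`, `x`, `u`, `y`), the cell size, and the coefficients are arbitrary assumptions chosen for illustration; substitute variables that mimic the structure of the real data.

```r
set.seed(1)  # makes the random numbers reproducible

# Every combination of two categorical variables (expand.grid returns factors).
grid <- expand.grid(sex = c("female", "male"), group = c("a", "b", "c"))

# Repeat each combination a few times to get a small data frame.
n_per_cell <- 5
d <- grid[rep(seq_len(nrow(grid)), each = n_per_cell), ]
rownames(d) <- NULL

# Add continuous variables and a response with an arbitrary relationship.
d$x <- rnorm(nrow(d))
d$u <- runif(nrow(d))
d$y <- 1 + 0.5 * d$x + rnorm(nrow(d))

# Introduce some missing values, since the problem involves imputation.
d$x[1:4] <- NA

str(d)
```

Because nothing here comes from the unpublished data, the resulting data frame can be posted in full.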

tormodb avatar tormodb commented on July 18, 2024

Dear Prof. Harrell.

I am not a very skilled programmer, simply a psychologist-researcher, so I have had some trouble generating simulated data. What I have done, however, is to upgrade all packages (and RStudio) and then manually downgrade rms to version 4.3-1 (installed from source from the CRAN archive). With that version the fit.mult.impute function works again as expected.

I also inspected the changelog for rms version 4.4-1 and noticed this:

"bj, cph, Glm, lrm, ols, orm: changed to subset model.matrix result on mmcolnames to rigorously require expected design matrix column names to be what model.matrix actually constructed"

which (from my absolutely-non-programming background) appears to relate to the mmcolnames error message I got from version 4.4-1.
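
To illustrate why strict subsetting by column name can produce this kind of error, here is a base R sketch with a toy matrix. The matrix and the names in `expected` and `mismatched` are made up for illustration; they are not taken from the rms source.

```r
# Toy design matrix; its column names stand in for what model.matrix() built.
X <- cbind("(Intercept)" = 1, x2b = c(0, 1, 0), x2c = c(0, 0, 1))

# When the expected names match the actual ones, subsetting succeeds.
expected <- c("x2b", "x2c")
ok <- X[, c("(Intercept)", expected), drop = FALSE]

# When the expected names are absent from the matrix, R raises an error.
mismatched <- c("x22", "x23")
msg <- tryCatch(
  {
    X[, c("(Intercept)", mismatched), drop = FALSE]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Indexing a matrix by a column name it does not have is an error in R, so any disagreement between expected and actual names stops the fit.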

Again, I apologize for not being able to provide you with simulated data, but at least downgrading the rms package seems to provide a workaround for me at this time.

Thanks for your patience.

thaoz avatar thaoz commented on July 18, 2024

I have just sent you an email about a similar issue. If I take out the factor variables, the function works just fine.
For example, please take a look at this code:

str(nhanes2)
'data.frame': 25 obs. of 4 variables:
$ age: Factor w/ 3 levels "20-39","40-59",..: 1 2 1 3 1 3 1 1 2 2 ...
$ bmi: num NA 22.7 NA NA 20.4 NA 22.5 30.1 22 NA ...
$ hyp: Factor w/ 2 levels "no","yes": NA 1 1 NA 1 NA 1 1 1 NA ...
$ chl: num NA 187 187 NA 113 184 118 187 238 NA ...
set.seed(1)
imp <- mice(nhanes2)
lm(bmi ~ hyp + age + chl, data = nhanes2) # no error
ols(bmi ~ hyp + age + chl, data = nhanes2) # no error
ols(bmi~hyp + age +chl, data = complete(imp, 1))
Error in X[, c("(Intercept)", mmcolnames), drop = FALSE] :
subscript out of bounds

However, if I exclude the factor variables from the formula, it works:

ols(bmi~chl, data = complete(imp, 1))

harrelfe avatar harrelfe commented on July 18, 2024

tormodb: Fixing by downgrading doesn't help me fix the problem but it does show that there is a bug.

thaoz: The complete function in mice adds unnecessary contrast attributes to the factor variables. If you remove those attributes it works. But note that aregImpute may work better in many cases. aregImpute will not work for such a tiny dataset as nhanes2 though.
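
The effect of a contrasts attribute on design matrix column names can be seen in base R alone. The sketch below assumes the attribute is a treatment-contrast matrix whose columns are labeled by level index, as `contr.treatment(3)` produces; the exact attribute that mice attaches is not shown in this thread.

```r
d <- data.frame(y = c(1.2, 0.7, 3.1, 2.4, 1.9, 2.8),
                f = factor(c("a", "b", "c", "a", "b", "c")))

# No contrasts attribute: columns are named after the factor levels.
names_plain <- colnames(model.matrix(y ~ f, data = d))

# Attach a contrast matrix whose column names are level indices (assumption).
d2 <- d
contrasts(d2$f) <- contr.treatment(3)
names_attr <- colnames(model.matrix(y ~ f, data = d2))

# Remove the attribute: the level-based names come back.
d3 <- d2
attr(d3$f, "contrasts") <- NULL
names_clean <- colnames(model.matrix(y ~ f, data = d3))

print(names_plain)
print(names_attr)
print(names_clean)
```

Code that expects level-based column names will not find them in the second design matrix, which is consistent with the subscript error reported above.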

Here is a reproducible example that may serve as a model for how to simulate data to help debug problems:

require(rms)
require(mice)
set.seed(1)
n <- 50
d <- data.frame(x1=runif(n), x2=sample(c('a','b','c'), n, TRUE),
                x3=sample(c('A','B','C','D'), n, TRUE),
                x4=sample(0:1, n, TRUE),
                y=runif(n))
d$x1[1:5]  <- NA
d$x2[3:9]  <- NA
d$x3[7:14] <- NA

a <- aregImpute(~ x1 + x2 + x3 + x4 + y, data=d)
ols(y ~ x1 + x2 + x3 + x4, data=d)

fit.mult.impute(y ~ x1 + x2 + x3 + x4, ols, a, data=d)  # works

m <- mice(d)
d1 <- complete(m, 1)
ols(y ~ x1 + x2 + x3 + x4, data=d1)

w <- d1
attr(w$x2, 'contrasts') <- NULL
attr(w$x3, 'contrasts') <- NULL
ols(y ~ x1 + x2 + x3 + x4, data=w)
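
With many factor variables, removing the attribute one column at a time gets tedious. A small helper can do it for every factor in a data frame. The name `strip_contrasts` is hypothetical and not part of Hmisc, rms, or mice; the sketch uses base R only.

```r
# Hypothetical helper: drop the 'contrasts' attribute from all factor columns.
strip_contrasts <- function(df) {
  for (nm in names(df)) {
    if (is.factor(df[[nm]])) {
      attr(df[[nm]], "contrasts") <- NULL
    }
  }
  df
}

# Small demonstration on a data frame whose factor carries the attribute.
demo <- data.frame(x = factor(c("a", "b", "c", "a")), z = c(1, 2, 3, 4))
contrasts(demo$x) <- contr.treatment(3)
cleaned <- strip_contrasts(demo)

print(attributes(demo$x))     # includes a 'contrasts' entry
print(attributes(cleaned$x))  # only 'levels' and 'class' remain
```

Applied to the example above, `strip_contrasts(d1)` should give the same result as the two `attr(...) <- NULL` lines that build `w`.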

tormodb avatar tormodb commented on July 18, 2024

Dear Prof. Harrell, thanks to your example I have been able to create simulated data that reproduces the error.

require(rms)
require(mice)

n <- 50 # This is more than 10 000 in the actual dataset
d <- data.frame(age=sample(16:19, n, TRUE), 
                ethn=sample(c('no','eu','eaa'), n, TRUE),
                eco=sample(c('poor','equal','better'), n, TRUE),
                struc=sample(c('single','two'), n, TRUE),
                edu=sample(c('basic','intermediate','higher', 'uknown'), n, TRUE),
                work=sample(c('work','benefits','work/benefits'), n, TRUE),
                school=sample(c('vocational','general'), n, TRUE),
                physact=sample(c('no','yes'), n, TRUE))

d$ethn <- as.factor(d$ethn)
d$eco <- as.factor(d$eco)
d$struc <- as.factor(d$struc)
d$edu <- as.factor(d$edu)
d$work <- as.factor(d$work)
d$school <- as.factor(d$school)
d$physact <- as.factor(d$physact)

d$ethn[1:7]  <- NA
d$edu[3:9]  <- NA
d$school[7:14] <- NA
d$physact[15:20] <- NA

dd <- datadist(d)
options(datadist = "dd")

a <- mice(d)

fit.mult.impute(physact ~ age + ethn + eco + struc + edu + work + school, lrm, a, d)

Produces the error:

Error in X[, mmcolnames, drop = FALSE] : subscript out of bounds

in rms version 4.4-1, but runs fine in version 4.3-1.

(I used packrat to run the old version of rms in a separate project, and updated rms to the most recent version outside of that project. Just letting you know in case that could make a difference).

harrelfe avatar harrelfe commented on July 18, 2024

The little simulation I produced above demonstrates the point if you pass the result of mice() into fit.mult.impute. The problem is an error in how mice::complete() adds a contrasts attribute. In the next release of the Hmisc package I'll have fit.mult.impute remove this attribute. In the meantime you can use aregImpute, or use the updated source file from the Hmisc project on GitHub, which I'll have fixed today. The source file is transcan.s. Once you see from https://github.com/harrelfe/Hmisc/blob/master/R/transcan.s that it is updated, type library(Hmisc) or require(Hmisc) and then source('https://raw.githubusercontent.com/harrelfe/Hmisc/master/R/transcan.s') to override fit.mult.impute with the new version.
