Comments (25)

harrelfe avatar harrelfe commented on August 17, 2024

Produce a minimal self-contained working example and I'll debug.


Frank E Harrell Jr Professor and Chairman School of Medicine

Department of Biostatistics Vanderbilt University

On Fri, Jul 22, 2016 at 10:11 AM, Lucas Sala wrote:

Dear Prof. Harrel,

I'm trying to fit a logistic regression model using the lrm() function on a
dataset with 47 factors and 1 integer (the target variable), and I'm getting
the following error:

str(dataset)
'data.frame': 147166 obs. of 48 variables:
$ X1 : Factor w/ 7 levels "01",..: 3 5 3 4 1 2 7 2 5 6 ...
$ X2 : Factor w/ 2 levels "01",..: 2 2 2 2 1 1 2 1 2 2 ...
$ X3 : Factor w/ 11 levels "11:",..: 2 2 2 8 2 2 2 2 2 2 ...
$ X4 : Factor w/ 8 levels "08: QTDE_DIAS_DIST_M0 > 27",..: 2 2 2 3 2 2 2 2 2 2 ...
.
.
.
$ TARGET : int 1 1 1 1 1 1 1 1 1 1 ...
require(rms)
fit <- lrm(TARGET ~ ., data = dataset, x=TRUE, y=TRUE, se.fit=TRUE)
Error in X[, mmcolnames, drop = FALSE] : subscript out of bounds

Are there any clues as to what's going on?

Thanks in advance,
Lucas


from rms.

lucasevsala avatar lucasevsala commented on August 17, 2024

Dear Prof. Harrel,
Here is a self-contained working example:

# Load rms package
require(rms)

# Generate simulated data
dataset <- data.frame(
  X1 = factor(c('05: X1 <= 178','01: X1 <= 6', '03: X1 <= 52', '05: X1 <= 178')),
  X2 = factor(c('04: X2 <= 75','01: X2 <= 6', '05: X2 > 75', '05: X2 > 75')),
  X3 = factor(c('04: X3 <= 552','01: X3 <= 1', '04: X3 <= 552', '06: X3 > 1313')),
  TARGET = c(0, 1, 1, 0)
)

# Fit logistic regression model
fit <- lrm(TARGET ~ ., data = dataset, x=TRUE, y=TRUE, se.fit=TRUE)

When executed, I get the same error as before:

Error in X[, mmcolnames, drop = FALSE] : subscript out of bounds

Thanks in advance,
Lucas Sala

harrelfe avatar harrelfe commented on August 17, 2024

rms does not support models like ~ . because that would imply that all the continuous variables act linearly, which would be unusual.
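For readers who want to avoid `~ .`, one option is to spell out the right-hand side from the column names. The following is a minimal sketch using base R's `reformulate()`; the data frame mirrors Lucas's example above, and `predictors` and `explicit_formula` are illustrative names only. Continuous predictors would still need to be wrapped by hand (for example in `rcs()`) if linearity is not wanted.

```r
# Same toy data frame as in the example above (factors only).
dataset <- data.frame(
  X1 = factor(c('05: X1 <= 178', '01: X1 <= 6', '03: X1 <= 52', '05: X1 <= 178')),
  X2 = factor(c('04: X2 <= 75', '01: X2 <= 6', '05: X2 > 75', '05: X2 > 75')),
  X3 = factor(c('04: X3 <= 552', '01: X3 <= 1', '04: X3 <= 552', '06: X3 > 1313')),
  TARGET = c(0, 1, 1, 0)
)

# Spell out the predictors instead of relying on '.'.
predictors <- setdiff(names(dataset), "TARGET")
explicit_formula <- reformulate(predictors, response = "TARGET")
print(explicit_formula)

# The formula could then be passed to lrm(), e.g.
# fit <- rms::lrm(explicit_formula, data = dataset)
```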

kuiperjh avatar kuiperjh commented on August 17, 2024

Dear Prof. Harrel,

Apologies, I am not entirely sure what the current status of this problem is. Like Lucas, I love to use rms for regression models, and I also have problems using lrm. It simply does not seem to work with factors, regardless of whether they are factors in the data frame or are created with as.factor() or catg() in the model formula.

A simple example (based on Lucas'):

dataset <- data.frame(
X1 = factor(c('C','P', 'P', 'P')),TARGET = c(0, 1, 1, 0)
)
fit<-lrm(TARGET~X1,data=dataset)
Error in X[, mmcolnames, drop = FALSE] : subscript out of bounds
dataset$X2 <- ifelse(dataset$X1=="P",1,0)
fit<-lrm(TARGET~X2,data=dataset) #No problem running
fit<-lrm(TARGET~catg(X2),data=dataset)
Error in X[, mmcolnames, drop = FALSE] : subscript out of bounds

Of course, with only two levels in the factor the problem can be analysed by converting the factor to a numeric, but a non-ordered factor with more than two levels does seem to give a problem...

With kind regards, Jan Herman Kuiper

harrelfe avatar harrelfe commented on August 17, 2024

I think I addressed this bug earlier and it's fixed now. A new release will be in CRAN in a couple of weeks.

kuiperjh avatar kuiperjh commented on August 17, 2024

Thank you for the quick response! Yes, I found a few references online to this problem/bug, but all to do with using as.factor() in the formula. Glad it will be resolved, and thank you very much for developing the package in the first place.

Nuno9 avatar Nuno9 commented on August 17, 2024

Thank you for developing the package. I am having a similar problem and I worked around it by using strat(x1) every time I would otherwise use factor(x1). I use it for 2, 3 or 4 categories and it seems to work. Am I correct in assuming it's working as a factor?
Kind regards,
Nuno

harrelfe avatar harrelfe commented on August 17, 2024

No, strat is for something else. I don't know of a workaround until the new version comes out. If you have linux I can send you the tarballs to install now.

Nuno9 avatar Nuno9 commented on August 17, 2024

Thank you for clarifying, Prof. Harrel. I'm using Windows so I'll just wait for the update.
Many Thanks,
Nuno

kuiperjh avatar kuiperjh commented on August 17, 2024

Dear Prof. Harrel,

Today I ran into very similar problems while using a factor, this time in your function bj(). The problem seems to lie in the column names of the matrix X. Many of your rms functions use a design matrix X, for which you collect the column name attributes
X <- Design(eval.parent(m))
atrx <- attributes(X)
mmcolnames <- atr$mmcolnames
I have a factor called "Strands", with the labels "Four", "Six" and "Eight". This gives mmcolnames of "StrandsSix" and "StrandsEight". Later, X is converted into a design matrix
X <- model.matrix(sformula, X)
This changes the column names of factors, in my case "Strands[T.Six]" and "Strands[T.Eight]". You seem to have anticipated this, because you try to retrieve an alternative form of mmcolnames and then check if mmcolnames matches the column names in the model matrix X
alt <- attr(mmcolnames, 'alt')
if(! all(mmcolnames %in% colnames(X)) && length(alt))
mmcolnames <- alt
However, in none of my examples does mmcolnames get any attributes (let alone "alt"), so the call to attr() returns NULL. Thus, although all(mmcolnames %in% colnames(X)) returns FALSE, no alternative form is used. The next statement is where Lucas and I ran into trouble:
X <- X[, mmcolnames, drop=FALSE]
Because mmcolnames is not updated, this statement fails: the actual column names in X do not match those in mmcolnames. This only occurs with factors, because numerical columns keep their original names after the call to model.matrix(). Simply using
X <- X[, -1, drop=FALSE]
solved the problem, but I guess the problem may lie in providing attributes to mmcolnames?

with kind regards, Jan Herman
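
The mismatch described above can be imitated without rms. The sketch below uses a hand-built matrix as a stand-in for the design matrix (an assumption for illustration; it is not how rms constructs X) and shows that indexing by absent column names produces the same error message, while dropping the intercept by position does not depend on the names.

```r
# Stand-in design matrix whose factor columns use the "[T.level]" naming style
# reported above. The values are placeholders.
X <- matrix(0, nrow = 3, ncol = 3,
            dimnames = list(NULL, c("(Intercept)", "Strands[T.Six]", "Strands[T.Eight]")))

# Names recorded earlier in a different style.
mmcolnames <- c("StrandsSix", "StrandsEight")

# Indexing by names that are not present raises an error; capture its message.
msg <- tryCatch(X[, mmcolnames, drop = FALSE],
                error = function(e) conditionMessage(e))
print(msg)

# Dropping the intercept by position works regardless of the column names.
X_noint <- X[, -1, drop = FALSE]
print(colnames(X_noint))
```

Positional indexing only sidesteps the symptom; it assumes the intercept is always the first column and that every remaining column should be kept.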

harrelfe avatar harrelfe commented on August 17, 2024

Please post a minimal working example requiring no external data

kuiperjh avatar kuiperjh commented on August 17, 2024

Thank you for your quick reply, and my apologies. Here it goes - this is the same example as Lucas' above - together with the output I got from the browser. I used lrm() instead of bj() - the underlying issue seems the same.

dataset <- data.frame(
 X1 = factor(c('05: X1 <= 178','01: X1 <= 6', '03: X1 <= 52', '05: X1 <= 178')),
 X2 = factor(c('04: X2 <= 75','01: X2 <= 6', '05: X2 > 75', '05: X2 > 75')),
 X3 = factor(c('04: X3 <= 552','01: X3 <= 1', '04: X3 <= 552', '06: X3 > 1313')),
 TARGET = c(0, 1, 1, 0)
 )
fit <- lrm(TARGET ~ ., data = dataset, x=TRUE, y=TRUE, se.fit=TRUE)

This gives the above error message:

Error in X[, mmcolnames, drop = FALSE] : subscript out of bounds

When I then browse the contents of the variables, the issue becomes clear:

Enter a frame number, or 0 to exit   

1: lrm(TARGET ~ ., data = dataset, x = TRUE, y = TRUE, se.fit = TRUE)

Selection: 1
Called from: top level 
Browse[1]> mmcolnames
[1] "X103: X1 < 52"   "X105: X1 < 178"  "X204: X2 < 75"   "X205: X2 > 75"   "X304: X3 < 552"  "X306: X3 > 1313"
Browse[1]> attributes(mmcolnames)
NULL
Browse[1]> colnames(X)
[1] "(Intercept)"         "X1[T.03: X1 <= 52]"  "X1[T.05: X1 <= 178]" "X2[T.04: X2 <= 75]"  "X2[T.05: X2 > 75]"   "X3[T.04: X3 <= 552]" "X3[T.06: X3 > 1313]"

Therefore, the call X[, mmcolnames, drop = FALSE] will fail, because the column names do not match. This is no problem with numerical covariates, because their column name does not change after the call to model.matrix() in the functions just above the line that gives the error: X <- model.matrix(Terms.ns, X) in lrm() and X <- model.matrix(sformula, X) in bj().

With kind regards, Jan Herman
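
A quick way to see which names fail to match is `setdiff()`. The sketch below copies the two name vectors from the browser output above; `model_cols` and the helper `bracket_name()` are hypothetical names introduced here and are not part of rms. Note that the names differ in two ways: the bracketed "[T.level]" style, and `<=` appearing as `<` in mmcolnames.

```r
# Values copied from the browser output above.
mmcolnames <- c("X103: X1 < 52", "X105: X1 < 178", "X204: X2 < 75",
                "X205: X2 > 75", "X304: X3 < 552", "X306: X3 > 1313")
model_cols <- c("(Intercept)", "X1[T.03: X1 <= 52]", "X1[T.05: X1 <= 178]",
                "X2[T.04: X2 <= 75]", "X2[T.05: X2 > 75]",
                "X3[T.04: X3 <= 552]", "X3[T.06: X3 > 1313]")

# Which expected names are absent from the design matrix?
missing_cols <- setdiff(mmcolnames, model_cols)
print(missing_cols)

# Rebuild an expected name in the bracketed style from variable and level.
# (Hypothetical helper for illustration only.)
bracket_name <- function(variable, level) paste0(variable, "[T.", level, "]")
print(bracket_name("X1", "03: X1 <= 52") %in% model_cols)
```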

harrelfe avatar harrelfe commented on August 17, 2024

Jan, that was very helpful in finding the bug. I was not escaping <= in levels as I had escaped >=. The fix is now committed and will be in the next release to CRAN. Thank you.

kuiperjh avatar kuiperjh commented on August 17, 2024

jmmax avatar jmmax commented on August 17, 2024

Hello, when will the next release to CRAN be? I'm using version 5.1.1 and getting this error. Could you possibly send me the tarballs? Thanks!

harrelfe avatar harrelfe commented on August 17, 2024

I expect updates on CRAN around 2017-09-22. Current tarballs are found here:
http://data.vanderbilt.edu/fh/attach/Hmisc_4.0-4.tar.gz
http://data.vanderbilt.edu/fh/attach/rms_5.1-2.tar.gz

Best to update Hmisc before updating rms

jmmax avatar jmmax commented on August 17, 2024

harrelfe avatar harrelfe commented on August 17, 2024

rms does not support formulas of the form ~ . so try repeating this with specific variables on the right-hand side of the model formula.

jmmax avatar jmmax commented on August 17, 2024

harrelfe avatar harrelfe commented on August 17, 2024

Thanks. I'll debug further.

harrelfe avatar harrelfe commented on August 17, 2024

I got farther than you with the new tarballs, even with the ~. notation. Then I ran into a problem just due to your small sample size. I changed the test program to the following and everything worked fine:

d <- expand.grid(
X1 = factor(c('05: X1 <= 178','01: X1 <= 6', '03: X1 <= 52', '05: X1 <= 178')),
X2 = factor(c('04: X2 <= 75','01: X2 <= 6', '05: X2 > 75', '05: X2 > 75')),
X3 = factor(c('04: X3 <= 552','01: X3 <= 1', '04: X3 <= 552', '06: X3 > 1313')),
rep = 1 : 100)
set.seed(1)
d$TARGET <- sample(0 : 1, nrow(d), replace=TRUE)

lrm(TARGET ~ ., data = d)

mbbrigitte avatar mbbrigitte commented on August 17, 2024

I installed the new versions that you posted as well (Hmisc_4.0-4 and rms_5.1-2, running on R 3.4.0) and still get the same error with the code that you just posted above:
Error in X[, mmcolnames, drop = FALSE] : subscript out of bounds

alialvi92 avatar alialvi92 commented on August 17, 2024

Hello, I am still getting the error "Error in X[, mmcolnames, drop = FALSE] : subscript out of bounds" in rms package. I was NOT getting this error previously and have only recently started getting it.
My model has factor variables with 2 and with more than 2 values. If I remove the variable with more than 2 values, the model works. I have provided a reproducible example below.

> Data <- data.frame(
  X = sample(1:700),
  Y = sample(c("yes", "no"),700, replace = TRUE),
  Z = sample (c("Back pain", "Leg Pain", "Back pain = Leg pain"),700, replace = TRUE)
)
> fit<- lrm(Y~X+Z, data= Data)

Some more information:

> packageVersion("rms")
[1] ‘5.1.2’

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS  10.13.3

harrelfe avatar harrelfe commented on August 17, 2024

I had a bug when there was an = sign inside a value label. Fix committed and will be in next release to CRAN.
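
Until a release containing the fix is installed, one possible stopgap is to relabel the factor levels so that they contain no `=` sign. This is an assumption that follows from the diagnosis above and is not confirmed anywhere in the thread. A base-R sketch, reusing the example data with a seed added only for reproducibility:

```r
set.seed(1)  # seed added only to make the sketch reproducible
Data <- data.frame(
  X = sample(1:700),
  Y = sample(c("yes", "no"), 700, replace = TRUE),
  Z = factor(sample(c("Back pain", "Leg Pain", "Back pain = Leg pain"),
                    700, replace = TRUE))
)

# Replace the '=' in the level labels with plain text.
levels(Data$Z) <- gsub("=", "equals", levels(Data$Z), fixed = TRUE)
print(levels(Data$Z))

# The model could then be refit, e.g.
# fit <- rms::lrm(Y ~ X + Z, data = Data)
```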

babasaraki avatar babasaraki commented on August 17, 2024

Hi Professor Harrelfe,
I ran the rms package in R to plot a nomogram in order to predict birth weight from TC and HB, but unfortunately it keeps saying Unable to fit model using “lrm.fit” and I am not sure what went wrong. Here are the full steps of the code that I am following, based on the rms documentation.

n <- 1203
set.seed(17)
d <- data.frame(age = rnorm(n, 28, 7),
                hand.breath = rnorm(n, 3.44, 0.34),
                thigh.circumference = rnorm(n, 15, 2),
                sex = factor(sample(c('female','male'), n, TRUE)))

d <- upData(d,
  L = .4*(sex=='male') + .045*(age-28) +
      (log(hand.breath - 2)-5.2)*(-2*(sex=='female') + 2*(sex=='male')),
  y = ifelse(runif(n) < plogis(L), 1, 0))

ddist <- datadist(d); options(datadist='ddist')

f <- lrm(y ~ lsp(age,28) + sex * rcs(hand.breath, 4) + thigh.circumference,
         data=d)
nom <- nomogram(f, fun=function(x)1/(1+exp(-x)), # or fun=plogis
                fun.at=c(.001,.01,.05,seq(.1,.9,by=.1),.95,.99,.999),
                funlabel="Birth Weight")

plot(nom, xfrac=.45)
print(nom)
nom <- nomogram(f, age=seq(10,90,by=10))
plot(nom, xfrac=.45)
g <- lrm(y ~ sex + rcs(age, 3) * rcs(hand.breath, 3), data=d)
nom <- nomogram(g, interact=list(age=c(5,10,15,20)),
                conf.int=c(.7,.9,.95))
plot(nom, col.conf=c(1,.5,.2), naxes=7)

I am stuck at this line: f <- lrm(y ~ lsp(age,28) + sex * rcs(hand.breath, 4) + thigh.circumference, data=d)
Please kindly help me out to complete and get the nomogram plotted.

Thank you so much.
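
One thing that may be worth checking here, and which is not confirmed anywhere in the thread: with hand.breath values near 3.44, log(hand.breath - 2) - 5.2 is roughly -4.8, so the simulated linear predictor L sits near +10 for females and -10 for males. That would make y almost entirely determined by sex, a situation in which maximum likelihood logistic fits commonly fail to converge. The sketch below reproduces the simulation in base R only (with() stands in for upData(), and the multiplication signs are written out as they appear to have been intended) and cross-tabulates y against sex.

```r
n <- 1203
set.seed(17)
d <- data.frame(age = rnorm(n, 28, 7),
                hand.breath = rnorm(n, 3.44, 0.34),
                thigh.circumference = rnorm(n, 15, 2),
                sex = factor(sample(c('female', 'male'), n, TRUE)))

# Linear predictor with explicit multiplication signs.
d$L <- with(d, .4 * (sex == 'male') + .045 * (age - 28) +
              (log(hand.breath - 2) - 5.2) *
              (-2 * (sex == 'female') + 2 * (sex == 'male')))
d$y <- ifelse(runif(n) < plogis(d$L), 1, 0)

# Range of the linear predictor within each sex.
print(tapply(d$L, d$sex, range))

# Cross-tabulate outcome against sex to look for (near-)complete separation.
print(table(sex = d$sex, y = d$y))
```

If the table shows y lining up with sex almost perfectly, changing the simulation constants so that L stays within a few units of zero would be one thing to try.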
