Comments (6)
This is probably due to the formula interface. Try the alternative interface with dependent.variable.name.
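As a minimal sketch of the two interfaces, using the built-in iris data as a stand-in (the original poster's data is not shown at this point in the thread):

```r
library(ranger)

# Formula interface: the response is named on the left-hand side of the formula.
rf_formula <- ranger(Species ~ ., data = iris)

# Alternative interface: the response is named by a string, no formula involved.
rf_name <- ranger(dependent.variable.name = "Species", data = iris)
```

Both calls fit the same kind of forest; only the way the response column is specified differs.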
from ranger.
I tried dependent.variable.name="y", but it is still not working for me...
from ranger.
With the same error message? That should only occur with the formula interface. Please give a reproducible example, or at least your ranger call and some information about the data.
from ranger.
I just solved the problem by converting the matrix into a sparse matrix. But I also encountered a similar problem when using importance_pvalues(). It only allows data.frame/formula as inputs. Is it possible to make it compatible with sparse matrices as well?
The code I used:
library(Matrix)  # provides Matrix() for the sparse conversion
x <- data.frame(rbind(t(rmultinom(7000, 75000, c(.201, .5, .02, .18, .099))),
                      t(rmultinom(8000, 75000, c(.201, .4, .12, .18, .099))),
                      t(rmultinom(15000, 75000, c(.011, .3, .22, .18, .289))),
                      t(rmultinom(15000, 75000, c(.091, .2, .32, .18, .209))),
                      t(rmultinom(15000, 75000, c(.001, .1, .42, .18, .299)))))
y <- factor(c(rep("A", 15000), rep("B", 15000), rep("C", 15000), rep("D", 15000)))
data <- data.frame(y, x)
# data.matrix() converts the factor y to its integer codes
sparse_data <- Matrix(data.matrix(data), sparse = TRUE)
rf.model <- ranger::ranger(dependent.variable.name = "y", data = sparse_data,
                           keep.inbag = TRUE, importance = "permutation")
from ranger.
The importance_pvalues() function only needs the formula/data when using the permutation approach ("altmann"). The idea was that this method is so slow that no one wants to run it on data that's too large for the formula interface. I will check whether this is possible with sparse data.
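For data that does fit in a data.frame, a call to the built-in permutation approach might look like the sketch below. The iris data and the small number of permutations are illustrative assumptions chosen to keep the run short, not settings from this thread.

```r
library(ranger)

# The forest must be grown with an importance mode for p-values to be computed.
rf <- ranger(Species ~ ., data = iris, importance = "permutation")

# The "altmann" method refits the forest on permuted responses,
# which is why it needs the formula and the data again.
pv <- importance_pvalues(rf, method = "altmann",
                         formula = Species ~ ., data = iris,
                         num.permutations = 20)
pv
```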
In the meantime, the permutation p-values are so simple, you can just do it yourself:
library(ranger)
num.permutations <- 100

# Run RF
rf <- ranger(dependent.variable.name = "Species", data = iris, importance = "permutation")

# Permute and compute importance again (be sure to use same parameters as above)
vimp <- replicate(num.permutations, {
  dat <- iris
  dat[, "Species"] <- dat[sample(nrow(dat)), "Species"]
  ranger(dependent.variable.name = "Species", data = dat, importance = "permutation")$variable.importance
})

# Compute p-values
pval <- sapply(1:nrow(vimp), function(i) {
  (sum(vimp[i, ] >= rf$variable.importance[i]) + 1) / (ncol(vimp) + 1)
})
res <- cbind(rf$variable.importance, pval)
colnames(res) <- c("importance", "pvalue")
res
(This is just copied and pasted from the importance_pvalues() function.)
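The same do-it-yourself permutation test could be adapted to a sparse matrix roughly as follows. This is a sketch under assumptions, not something confirmed in the thread: iris stands in for the real data, the number of permutations is kept small, and classification = TRUE is added because data.matrix() turns the factor response into integer codes, which ranger would otherwise treat as a numeric response.

```r
library(ranger)
library(Matrix)

num.permutations <- 20  # illustrative; use more in practice

# Sparse copy of the data; the factor Species becomes integer codes 1, 2, 3.
sparse_dat <- Matrix(data.matrix(iris), sparse = TRUE)

# Run RF on the sparse matrix.
rf <- ranger(dependent.variable.name = "Species", data = sparse_dat,
             classification = TRUE, importance = "permutation")

# Permute the response column and recompute importance with the same settings.
vimp <- replicate(num.permutations, {
  perm <- sparse_dat
  perm[, "Species"] <- perm[sample(nrow(perm)), "Species"]
  ranger(dependent.variable.name = "Species", data = perm,
         classification = TRUE, importance = "permutation")$variable.importance
})

# One row per variable, one column per permutation; compare row-wise.
pval <- (rowSums(vimp >= rf$variable.importance) + 1) / (ncol(vimp) + 1)
res <- cbind(importance = rf$variable.importance, pvalue = pval)
res
```

Assigning into a column of a sparse matrix can be slow for large data, so permuting the response before building the sparse matrix may be preferable where that is possible.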
from ranger.
Thanks so much!
from ranger.