Comments (5)
Oh, wow! That's a bug for sure. I wasn't sure where to put num_classes = 2 (which is my case), so I put it both in the constructor and in the call to HyperParameterTuner.Optimize(), because the docs say to pass all the parameters in the same order as in the constructor of the classifier. That was clearly a mistake because, as you say, the number of classes is not a hyperparameter in this case.
I commented it out and now I get different results on my dataset. Thank you!!!
from mlpack.
Sorry for the slow response on this; I was traveling. Are you able to try with the fixes in #3521? Alternatively, can you provide a dataset that I can try your code with to see if those fixes work?
My theory is that the random forest you are training is affected by the same bug that #3521 solves. That bug basically means that, when using OpenMP, the trees created by each thread might be identical (and thus perform exactly the same).
Another way you could test here is by disabling OpenMP during compilation.
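To illustrate how identical trees could come about, here is a minimal sketch. It assumes (this is a guess for illustration, not something confirmed in this thread or taken from #3521) that each thread ends up drawing from an identically seeded generator. BootstrapSample is a hypothetical helper, not an mlpack function.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of mlpack): draws `count` bootstrap indices
// in [0, n) from a Mersenne Twister seeded with `seed`.
std::vector<std::size_t> BootstrapSample(unsigned seed,
                                         std::size_t n,
                                         std::size_t count)
{
  assert(n > 0);
  std::mt19937 gen(seed);
  std::vector<std::size_t> indices;
  indices.reserve(count);
  for (std::size_t i = 0; i < count; ++i)
    indices.push_back(gen() % n);
  return indices;
}
```

Two workers that both call BootstrapSample(42, 1000, 100) get the same indices, so trees built deterministically from those samples would be the same; giving each worker its own seed (for example 42 and 43) yields different samples.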
Thanks for the report!
from mlpack.
@GLmontanari #3521 was merged. Let us know if this addresses the issue on your setup.
from mlpack.
Dear all,
I don't have OpenMP, and I compiled the library with that option disabled. Nevertheless, I pulled the latest commits and tried again.
Now I have armadillo 12.6.2 and the latest version of mlpack. Nothing changes.
I can share my data if you wish. How would you prefer to receive it?
from mlpack.
Thanks for trying anyway. I tried the code with the covertype dataset (7 classes, ~580k points, 55 dimensions). I found that there is an error in the hpt.Optimize() call:
std::tie(best_forest) = hpt.Optimize(
    mlpack::Fixed(numClasses),
    forest_sz,           // forest_sz
    mlpack::Fixed(1),    // min_leaf_sz
    mlpack::Fixed(1e-7), // gain_split
    mlpack::Fixed(0));   // max_depth
You don't need to specify the number of classes here, since it is not a hyperparameter. In fact, that line performs hyperparameter optimization with the number of trees set to the number of classes, the minimum leaf size set to forest_sz, and so forth: the arguments are off by one. When I removed numClasses from that call, I had more success:
std::tie(best_forest) = hpt.Optimize(
    forest_sz,           // forest_sz
    mlpack::Fixed(1),    // min_leaf_sz
    mlpack::Fixed(1e-7), // gain_split
    mlpack::Fixed(0));   // max_depth
I think that will fix your issue. Let me know if you still see strange behavior and I can keep digging.
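The off-by-one shift can be reproduced without mlpack. The sketch below is a simplified stand-in, not the tuner's actual implementation: ForestParams and BindPositional are hypothetical names, the slot names follow the comments in the calls above, and the sketch simply ignores anything past the fourth value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the hyperparameters that are bound by position.
struct ForestParams
{
  double numTrees;
  double minLeafSize;
  double minGainSplit;
  double maxDepth;
};

// Binds the first four values strictly by position; extra values are dropped
// (an assumption of this sketch, not a statement about mlpack's behavior).
ForestParams BindPositional(const std::vector<double>& args)
{
  assert(args.size() >= 4);
  ForestParams p{0.0, 0.0, 0.0, 0.0};
  double* slots[] = { &p.numTrees, &p.minLeafSize, &p.minGainSplit,
                      &p.maxDepth };
  for (std::size_t i = 0; i < 4; ++i)
    *slots[i] = args[i];
  return p;
}
```

With a leading class count of 2 and an illustrative forest size of 50, BindPositional({2, 50, 1, 1e-7, 0}) yields numTrees = 2 and minLeafSize = 50, while BindPositional({50, 1, 1e-7, 0}) puts every value in its intended slot.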
from mlpack.