Comments (5)

GLmontanari commented on June 7, 2024

Oh, wow! That's a bug for sure. I wasn't sure where to put num_classes = 2 (which is my case), so I put it both in the constructor and in the call to HyperParameterTuner.Optimize(), because the docs say to pass all the parameters in the same order as in the constructor of the classifier. That was clearly a mistake because, as you say, the number of classes is not a hyperparameter in this case.
I commented it out and now I get different results on my dataset. Thank you!!!

from mlpack.

rcurtin commented on June 7, 2024

Sorry for the slow response on this; I was traveling. Are you able to try with the fixes in #3521? Alternatively, can you provide a dataset that I can try your code with to see if those fixes work?

My theory is that the random forest you are training is affected by the same bug that #3521 solves. That bug basically means that when using OpenMP, the trees created by each thread might be identical (and thus perform exactly the same).

Another way you could test here is by disabling OpenMP during compilation.

Thanks for the report!

conradsnicta commented on June 7, 2024

@GLmontanari #3521 was merged. Let us know if this addresses the issue on your setup.

GLmontanari commented on June 7, 2024

Dear all,

I don't have OpenMP, and I compiled the library with that option disabled. Nevertheless, I pulled the latest commits and tried again.
I now have Armadillo 12.6.2 and the latest version of mlpack. Nothing changes.
I can share my data if you wish. How would you prefer to receive it?

rcurtin commented on June 7, 2024

Thanks for trying anyway. I tried the code with the covertype dataset (7 classes, ~580k points, 55 dimensions). I found that there is an error in the hpt.Optimize() call:

	std::tie(best_forest) = hpt.Optimize(
		mlpack::Fixed(numClasses),
		forest_sz,           // forest_sz
		mlpack::Fixed(1),    // min_leaf_sz
		mlpack::Fixed(1e-7), // gain_split
		mlpack::Fixed(0));   // max_depth

You don't need to specify the number of classes here, since that is not a hyperparameter. In fact, what that line actually does is perform hyperparameter optimization with the number of trees set to the number of classes, the minimum leaf size set to forest_sz, and so forth: the arguments are off by one. When I removed numClasses from that line, I had more success:

	std::tie(best_forest) = hpt.Optimize(
		forest_sz,           // forest_sz
		mlpack::Fixed(1),    // min_leaf_sz
		mlpack::Fixed(1e-7), // gain_split
		mlpack::Fixed(0));   // max_depth

I think that will fix your issue. Let me know if you still see strange behavior and I can keep digging.
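
To make the off-by-one shift concrete, here is a minimal toy sketch of positional binding. It does not use mlpack and is not how mlpack forwards the arguments internally; ForestParams and BindPositionally are hypothetical names made up for this illustration, and the only thing modeled is that the i-th value lands in the i-th hyperparameter slot.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the four tunable random forest settings, in the
// order used in the Optimize() call above. This is not an mlpack type.
struct ForestParams
{
  double numTrees = 0;
  double minLeafSize = 0;
  double minGainSplit = 0;
  double maxDepth = 0;
  std::size_t unmatched = 0; // Values left over once all four slots are full.
};

// Toy positional binding: the i-th value fills the i-th slot.  What mlpack
// does with surplus arguments is not modeled here; they are only counted.
ForestParams BindPositionally(const std::vector<double>& values)
{
  ForestParams p;
  double* slots[] = { &p.numTrees, &p.minLeafSize, &p.minGainSplit,
      &p.maxDepth };
  const std::size_t numSlots = sizeof(slots) / sizeof(slots[0]);

  for (std::size_t i = 0; i < values.size(); ++i)
  {
    if (i < numSlots)
      *slots[i] = values[i];
    else
      ++p.unmatched;
  }
  return p;
}
```

Assuming numClasses = 2 and a forest size of 100 (the 100 is an arbitrary value chosen for illustration), BindPositionally({2.0, 100.0, 1.0, 1e-7, 0.0}) puts 2 in numTrees and 100 in minLeafSize, whereas BindPositionally({100.0, 1.0, 1e-7, 0.0}) fills each slot as the comments in the call intend.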
