
Comments (3)

HunterMcGushion commented on September 21, 2024

@chwang1991,

Thanks for opening this issue! Let me start by saying that your questions are definitely not “stupid”! These are fantastic questions, and I’m sure many others have them as well. So thank you for bringing them to my attention, because it means that I probably need to improve the documentation.

Regarding RandomForestOptPro and GradientBoostedRegressionTreeOptPro, you may already be clear on this, but it helps me to remember that they’re just normal ML algorithms. Hyperparameter optimization is just ML on top of ML. These two algorithms (I call them “OptPros” or “Optimization Protocols” because they’re designed for hyperparameter optimization) are built on top of SKLearn’s RandomForestRegressor and GradientBoostingRegressor, respectively. Since you’re familiar with Bayesian optimization, it could help to remember that our BayesianOptPro is similarly built on top of SKLearn’s GaussianProcessRegressor.

Each of the Optimization Protocols is just a wrapper around some existing ML algorithm to enable us to use it for hyperparameter optimization. So in order to understand the two OptPros you mentioned, I would recommend reading the documentation for the above-mentioned SKLearn classes that they use internally: RandomForestRegressor and GradientBoostingRegressor. Of course, these classes are the base_estimators for each OptPro, and we also make use of things like acquisition functions in order to estimate the utility of each proposed set of hyperparameters, but this is also the case in standard Bayesian optimization.

In the end, the most significant difference between the BayesianOptPro with which you’re familiar and RandomForestOptPro, for example, is that BayesianOptPro uses GaussianProcessRegressor as its base_estimator, whereas RandomForestOptPro uses RandomForestRegressor.
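To make the "ML on top of ML" idea concrete, here is a rough, simplified sketch of what a surrogate-based optimizer does under the hood. This is not HyperparameterHunter's actual code, just an illustration built directly on SKLearn; the toy objective function and the crude "pick the best predicted candidate" acquisition step are my own simplifications of what real OptPros do with proper acquisition functions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor

# Toy objective: pretend this runs a full cross-validated Experiment and
# returns the loss for a single hyperparameter (e.g. learning_rate).
def run_experiment(learning_rate):
    return (learning_rate - 0.1) ** 2  # hypothetical "loss"

# Swap this single line to go from a "RandomForestOptPro"-style optimizer
# to a "BayesianOptPro"-style one:
base_estimator = RandomForestRegressor(n_estimators=100, random_state=0)
# base_estimator = GaussianProcessRegressor()

rng = np.random.default_rng(0)
X = rng.uniform(0.001, 0.5, size=(5, 1))          # initial random hyperparameter samples
y = np.array([run_experiment(x[0]) for x in X])   # their observed losses

for _ in range(20):
    base_estimator.fit(X, y)                       # surrogate learns "hyperparameters -> loss"
    candidates = rng.uniform(0.001, 0.5, size=(100, 1))
    # Crude acquisition step: run the candidate the surrogate predicts to be best.
    best = candidates[np.argmin(base_estimator.predict(candidates))]
    X = np.vstack([X, best])
    y = np.append(y, run_experiment(best[0]))

print("Best learning_rate found:", X[np.argmin(y)][0])
```

The whole difference between the two styles of optimizer lives in that one `base_estimator` line; everything else about the loop stays the same.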

I know my response hasn’t really touched on the technical differences between the different OptPros, but like I said, to get a better idea of the specific differences in behavior, you can see the documentation of the base_estimator classes. For HH, it makes sense to offer a wider range of OptPros because it’s so easy to switch between them: HH automatically records and reuses Experiments during Optimization.

Sometimes you may find that BayesianOptPro simply isn’t working as well as you want. With HH, you can stop your current OptPro and pick up learning with a different OptPro to see how things go, without losing anything your previous OptPro learned. In my own problems, I find it often helps to get the different perspective of another OptPro. In fact, I’ll often cycle through all of them, giving each 10 iterations before moving on to the next. I don’t know of any other library that enables this sort of diverse exploration of the problem space while still retaining all of the data collected by previous optimization rounds. (A rough sketch of that switching workflow follows below.)
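Here is roughly what that workflow looks like, written from memory of the HyperparameterHunter README, so treat the exact `Environment` arguments and the `forge_experiment` name as assumptions (older versions may use `set_experiment_guidelines` instead), not as authoritative API documentation:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from xgboost import XGBClassifier
from hyperparameter_hunter import (
    Environment, BayesianOptPro, RandomForestOptPro, Integer, Real,
)

# Build the Environment once; it records every Experiment under `results_path`.
data = load_breast_cancer()
train_df = pd.DataFrame(data.data, columns=data.feature_names).assign(target=data.target)

env = Environment(
    train_dataset=train_df,
    results_path="HyperparameterHunterAssets",  # assumed results directory
    metrics=["roc_auc_score"],
    cv_type="StratifiedKFold",                  # assumed argument form
    cv_params=dict(n_splits=5, shuffle=True, random_state=32),
)

search_space = dict(max_depth=Integer(2, 10), learning_rate=Real(0.001, 0.5))

# Round 1: 10 iterations with the familiar Gaussian-process-based OptPro.
opt = BayesianOptPro(iterations=10, random_state=42)
opt.forge_experiment(model_initializer=XGBClassifier, model_init_params=search_space)
opt.go()

# Round 2: same search space, different perspective. Because the saved
# Experiments match, the new OptPro can start from everything learned above.
opt = RandomForestOptPro(iterations=10, random_state=42)
opt.forge_experiment(model_initializer=XGBClassifier, model_init_params=search_space)
opt.go()
```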

Turning to your question about finding the best Experiment, you are correct that the key is the leaderboard file. This could definitely be better documented, so I apologize. What you’ll want to do is check the Leaderboard for the experiment_id corresponding to the score you want, then go to the “Experiments/Descriptions” directory. Each JSON file there is named after an experiment_id in the Leaderboard. Open the JSON file for the Experiment you want, and you’ll find all of the important information about it: scores, execution times, the algorithm used, and of course the hyperparameters and feature engineering steps. Under the “hyperparameters” key you can find every hyperparameter used for the Experiment, even the ones you didn’t explicitly declare, so you can thoroughly recreate it.
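As a quick, hypothetical sketch of that lookup: the directory layout and the `GlobalLeaderboard.csv` file name are assumptions from memory, so adjust the paths to whatever your assets directory under `results_path` actually contains.

```python
import json
from pathlib import Path

import pandas as pd

assets = Path("HyperparameterHunterAssets")  # your Environment's results_path

# 1. Find the experiment_id with the score you want in the Leaderboard.
leaderboard = pd.read_csv(assets / "Leaderboards" / "GlobalLeaderboard.csv")
best_row = leaderboard.iloc[0]  # assuming the leaderboard is sorted best-first
experiment_id = best_row["experiment_id"]

# 2. Open the matching description JSON in Experiments/Descriptions.
description_path = assets / "Experiments" / "Descriptions" / f"{experiment_id}.json"
description = json.loads(description_path.read_text())

# 3. The full set of hyperparameters (including ones you never declared)
#    lives under the "hyperparameters" key.
print(description["hyperparameters"])
```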

Both of these topics could definitely be documented better, so thank you very much for asking these fantastic questions! Please let me know if you have any further questions or feedback, and thank you for your support! I really appreciate it!


chwang1991 commented on September 21, 2024

I was surprised that there are still open-source maintainers willing to explain things in such detail.

You not only answered my question but also helped extend my knowledge. You’re right, hyperparameter tuning really is just ML on top of ML! Thank you so much!


chwang1991 commented on September 21, 2024

OK, I see there is a "hyperparameter_key" column in the leaderboard file. Can I use that key to directly get the hyperparameter combination?

