
Comments (3)

dengdifan commented on May 24, 2024

Hi @simonprovost
thanks for the information.

How exactly are inactive hyperparameters in the dataset provided to SMAC's random-forest surrogate managed? Are they handled as described above? Following this GitHub issue, would you be open to a PR adding an FAQ entry explaining how they are handled?

The main reason we impute NaN values for the RF is that our RF surrogate model is built on the pyrfr package, which is written in C++ and wrapped with SWIG.
NaN values cannot easily be transferred to the corresponding C++ types through SWIG, so we need to impute them.
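The imputation step described above can be sketched in plain NumPy (an illustrative stand-in for SMAC's `_impute_inactive`; the function and parameter names here are hypothetical):

```python
import numpy as np

# Sketch: replace NaN entries (inactive HPs) with out-of-range
# placeholders before handing the array to a C++ backend. Placeholder
# choices mirror the scheme discussed in this thread:
# categorical -> n_opts, numerical (scaled to [0, 1]) -> -1.

def impute_inactive(X, n_opts_per_col):
    """n_opts_per_col[i] is the number of choices for a categorical
    column i, or None for a numerical column."""
    X = X.copy()
    for i, n_opts in enumerate(n_opts_per_col):
        col = X[:, i]
        fill = float(n_opts) if n_opts is not None else -1.0
        col[np.isnan(col)] = fill
    return X

# Column 0: categorical with 2 options; column 1: numerical in [0, 1].
X = np.array([[0.0, np.nan],
              [1.0, 0.5]])
result = impute_inactive(X, [2, None])
print(result)  # the NaN in column 1 becomes -1.0
```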

Why is the number of options a categorical hyperparameter offers used to represent it when inactive? What is the logic behind this decision?

In ConfigSpace, categorical HPs use a numerical encoding ([0, 1, 2, ..., n_opts - 1]), as shown in this line. Therefore, an active categorical HP will never take the vector value n_opts.
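The encoding can be sketched in a few lines (an assumed illustration of the scheme, not ConfigSpace's actual code; the "criterion" HP is just an example):

```python
# Options of a categorical HP map to indices 0 .. n_opts - 1, so the
# value n_opts is never produced by an active HP and can safely mark
# "inactive".
options = ["gini", "entropy"]
encoding = {opt: i for i, opt in enumerate(options)}  # {'gini': 0, 'entropy': 1}
inactive_value = len(options)  # 2: outside the active range [0, n_opts - 1]
```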

Why is -1 used for inactive float/integer hyperparameters, and what effect does this decision have on the model? Is -1 not regarded as one of the possible values? Or, as I have observed elsewhere, are float/integer hyperparameters rescaled? If so, could you please explain further how this type of inactive hyperparameter is handled?

Similar to categorical HPs, numerical HPs (float & int) are represented as vectors within [0, 1] (this normalization is also used in the GP models). Therefore, they will never take the value -1 if they are active.
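A quick sketch of the same idea for numerical HPs (assumed min-max normalization, for illustration only):

```python
# An active numerical HP is stored as its position within [lower, upper],
# normalized to [0, 1]; -1 therefore lies strictly outside the active range.
def normalize(value, lower, upper):
    return (value - lower) / (upper - lower)

vec = normalize(10, 1, 20)  # e.g. max_depth=10 in [1, 20]
inactive_value = -1.0       # never reachable for an active HP
print(vec)
```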

Were there any considerations for modifying the decision-tree splitting criterion to handle inactive hyperparameters via a flag or similar mechanism, rather than using placeholders that effectively yield splits with little to no information gain for these hyperparameters?

Since our surrogate models are based on pyrfr, this would not be easy to implement.

Would you confirm that following https://github.com/scikit-learn/scikit-learn/pull/23595 is not going to interfere with these potential inactive hyperparameters, which can effectively be seen as missing values? Given your extra layer of missing-value imputation, I reckon this is not an issue, but it is always good to double-check.

We are also considering reimplementing our RF models based on scikit-learn's models; however, I cannot promise exactly when that would happen.

Is there a method in the API to print the input data given to the surrogate, so that we can inspect it visually? If not, could you point us to a good starting point in the code for printing after forking SMAC?

For a configuration, you can simply call config.get_array() to get its numerical representation.

Hope that answers all your questions.

from smac3.

simonprovost commented on May 24, 2024

Hi @dengdifan,

I greatly appreciate your detailed response; it has helped clarify numerous aspects. I understand that placeholder values outside the range of their active counterparts are assigned to inactive hyperparameters to prevent them from significantly influencing the surrogate model.

From the discussion, it appears that the placeholder values are unlikely to be selected or to lead to meaningful splits in the surrogate model's decision trees: because these placeholders are uniform and uncorrelated with the target values, the information gain from splitting on them is typically low, particularly when working with a densely populated configuration-based dataset.
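That intuition can be checked with a toy variance-reduction calculation (synthetic data and hypothetical helper names; this is not SMAC's or pyrfr's actual splitting code):

```python
import numpy as np

# A constant placeholder column offers no threshold to split on, so its
# best variance reduction is zero, while a feature correlated with the
# target yields a positive reduction.
rng = np.random.default_rng(0)
y = np.linspace(0.0, 1.0, 100)
active = y + rng.normal(0, 0.05, 100)  # correlated with the target
placeholder = np.full(100, -1.0)       # inactive HP: a single constant value

def best_variance_reduction(x, y):
    base = y.var()
    best = 0.0
    for t in np.unique(x)[:-1]:  # candidate thresholds (max excluded)
        left, right = y[x <= t], y[x > t]
        weighted = (len(left) * left.var() + len(right) * right.var()) / len(y)
        best = max(best, base - weighted)
    return best

gain_active = best_variance_reduction(active, y)
gain_placeholder = best_variance_reduction(placeholder, y)
print(gain_active, gain_placeholder)  # gain_placeholder is exactly 0.0
```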

Nonetheless, I am curious about two things:

  • Conceptually, it seems confusing not to use the same placeholder for all HP types, given that every HP is ultimately encoded numerically; is there a reason for that?
  • Whether explicitly preventing splits on inactive hyperparameters could be advantageous, both to forbid such splits outright and to improve tree-construction time. If, for example, the surrogate model's trees could be constrained to consider only splits within the valid range of active hyperparameter values, this would prevent the improbable but possible occurrence of a less informative split on placeholder (i.e., inactive HP) values. This may also be one more reason to investigate scikit-learn's trees, given your mention of possibly reimplementing the RF surrogate with scikit-learn. In your experience, have splits on inactive hyperparameters ever been observed in practice, or have the active HPs always produced higher information gain?
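For illustration, the constrained-split idea in the second bullet could look like this conceptually (a hypothetical sketch, not pyrfr's behaviour): score each candidate split only on rows where that hyperparameter is active.

```python
import numpy as np

# Rows whose value for this HP equals the placeholder are inactive and
# would be excluded from split scoring on this column.
PLACEHOLDER = -1.0

def active_mask(column):
    return column != PLACEHOLDER

X = np.array([[0.2], [0.7], [PLACEHOLDER], [0.5]])
rows_to_score = np.where(active_mask(X[:, 0]))[0]
print(rows_to_score)  # only rows 0, 1 and 3 participate in split scoring
```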

In the meantime, thank you again for your insights, and I eagerly await your response to the final question. Future readers should also find this useful for understanding and possibly improving the handling of inactive hyperparameters in SMAC. To that end, I adapted the following snippet from your unit tests to visualise the missing values imputed by SMAC's RF surrogate, so you can see roughly how it is done, although this is a very simplified example:

import numpy as np
from rich.console import Console
from rich.table import Table

from ConfigSpace import ConfigurationSpace
from ConfigSpace.conditions import EqualsCondition
from ConfigSpace.hyperparameters import CategoricalHyperparameter, UniformIntegerHyperparameter

# RandomForest surrogate from SMAC (import path as of SMAC3 v2.x)
from smac.model.random_forest.random_forest import RandomForest

def display_hyperparameter_configurations(size=10):
    def convert_configurations_to_array(configs):
        return np.array([config.get_array() for config in configs])

    # Define the configuration space
    cs = ConfigurationSpace(seed=0)

    # Algorithm hyperparameter
    algorithm = cs.add_hyperparameter(CategoricalHyperparameter("algorithm", ["decision_tree", "random_forest"]))

    # Decision Tree hyperparameters
    criterion = cs.add_hyperparameter(CategoricalHyperparameter("criterion", ["gini", "entropy"]))
    max_depth = cs.add_hyperparameter(UniformIntegerHyperparameter("max_depth", 1, 20))

    # Conditions for Decision Tree hyperparameters
    cs.add_condition(EqualsCondition(criterion, algorithm, "decision_tree"))
    cs.add_condition(EqualsCondition(max_depth, algorithm, "decision_tree"))

    # Random Forest hyperparameters
    n_estimators = cs.add_hyperparameter(UniformIntegerHyperparameter("n_estimators", 10, 200))
    max_features = cs.add_hyperparameter(CategoricalHyperparameter("max_features", ["auto", "sqrt", "log2"]))

    # Conditions for Random Forest hyperparameters
    cs.add_condition(EqualsCondition(n_estimators, algorithm, "random_forest"))
    cs.add_condition(EqualsCondition(max_features, algorithm, "random_forest"))

    # Sample configurations
    configs = cs.sample_configuration(size=size)
    config_array = convert_configurations_to_array(configs)

    model = RandomForest(configspace=cs)
    config_array = model._impute_inactive(config_array)

    hp_names = [hp.name for hp in cs.get_hyperparameters()]

    console = Console()
    table = Table(show_header=True, header_style="bold magenta")
    for name in hp_names:
        table.add_column(name)
    for config in config_array:
        table.add_row(*map(str, config))

    console.print(table)


# Call the function to display the configurations
display_hyperparameter_configurations(size=50)

Following your answer @dengdifan , consider this issue done 👍
Cheers.


simonprovost commented on May 24, 2024

Given the low priority of the remaining questions, I'll close this issue to let more important queries pass first. Please feel free to reopen if you have time, or if any reader wishes to learn more about the two most recent questions posed.

Cheers!

