
Comments (3)

dengdifan commented on May 24, 2024

Hi @simonprovost
thanks for the information.

How exactly are inactive hyperparameters in the dataset provided to SMAC's random-forest surrogate managed? Are they handled as described above? Following this GitHub issue, would you be open to a PR adding an FAQ entry explaining how they are handled?

The main reason we impute NaN values for the RF is that our RF surrogate model is built on the pyrfr package, which is written in C++ and wrapped with SWIG.
NaN values cannot easily be transferred to the corresponding C++ types through SWIG, so we need to impute them.
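The imputation step described above can be sketched in plain NumPy (an illustrative stand-in for SMAC's `_impute_inactive`; the function and parameter names here are hypothetical):

```python
import numpy as np

# Sketch: replace NaN entries (inactive HPs) with out-of-range
# placeholders before handing the array to a C++ backend. Placeholder
# choices mirror the scheme discussed in this thread:
# categorical -> n_opts, numerical (scaled to [0, 1]) -> -1.

def impute_inactive(X, n_opts_per_col):
    """n_opts_per_col[i] is the number of choices for a categorical
    column i, or None for a numerical column."""
    X = X.copy()
    for i, n_opts in enumerate(n_opts_per_col):
        col = X[:, i]
        fill = float(n_opts) if n_opts is not None else -1.0
        col[np.isnan(col)] = fill
    return X

# Column 0: categorical with 2 options; column 1: numerical in [0, 1].
X = np.array([[0.0, np.nan],
              [1.0, 0.5]])
result = impute_inactive(X, [2, None])
print(result)  # the NaN in column 1 becomes -1.0
```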

Why is the number of options a categorical hyperparameter offers used to represent it when inactive? What is the logic behind this decision?

In ConfigSpace, categorical HPs use a numerical encoding ([0, 1, 2, ..., n_opts - 1]), as shown in this line. Therefore, an active categorical HP will never take the vector value n_opts.
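The encoding can be sketched in a few lines (an assumed illustration of the scheme, not ConfigSpace's actual code; the "criterion" HP is just an example):

```python
# Options of a categorical HP map to indices 0 .. n_opts - 1, so the
# value n_opts is never produced by an active HP and can safely mark
# "inactive".
options = ["gini", "entropy"]
encoding = {opt: i for i, opt in enumerate(options)}  # {'gini': 0, 'entropy': 1}
inactive_value = len(options)  # 2: outside the active range [0, n_opts - 1]
```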

Why is -1 used for inactive float/integer hyperparameters, and what effect does this decision have on the model? Is -1 not regarded as one of the possible values? Or, as I have observed elsewhere, are float/integer hyperparameters rescaled? If so, could you please explain further how this type of inactive hyperparameter is handled?

Similar to categorical HPs, numerical HPs (float & int) are represented as vectors within [0, 1] (this normalization is also used in the GP models). Therefore, they will never take the value -1 if they are active.
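A quick sketch of the same idea for numerical HPs (assumed min-max normalization, for illustration only):

```python
# An active numerical HP is stored as its position within [lower, upper],
# normalized to [0, 1]; -1 therefore lies strictly outside the active range.
def normalize(value, lower, upper):
    return (value - lower) / (upper - lower)

vec = normalize(10, 1, 20)  # e.g. max_depth=10 in [1, 20]
inactive_value = -1.0       # never reachable for an active HP
print(vec)
```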

Were there any considerations for modifying the decision-tree splitting criterion to handle inactive hyperparameters via a flag or similar mechanism, rather than using placeholders that effectively yield splits with little to no information gain for these hyperparameters?

Since our surrogate models are based on pyrfr, this would not be easy to implement.

Would you confirm that following https://github.com/scikit-learn/scikit-learn/pull/23595 is not going to interfere with these potential inactive hyperparameters, which can effectively be seen as missing values? Given your extra layer of missing-value imputation, I reckon this is not an issue, but it is always good to double-check.

We are also considering reimplementing our RF models based on scikit-learn's models; however, I cannot promise exactly when that would happen.

Is there a method in the API to print the input data given to the surrogate, so that we can inspect it visually? If not, could you point us to a good starting point in the code for printing after forking SMAC?

For a configuration, you can simply call config.get_array() to get its numerical representation.

Hope that answers all your questions.

from smac3.

simonprovost commented on May 24, 2024

Hi @dengdifan,

I greatly appreciate your detailed response; it has helped clarify numerous aspects. I understand that placeholder values outside the range of their active counterparts are assigned to inactive hyperparameters to prevent them from significantly influencing the surrogate model.

From the discussion, it appears that the placeholder values are unlikely to be selected or to lead to meaningful splits in the surrogate model's decision trees: because these placeholders are uniform and uncorrelated with the target values, the information gain from splitting on them is typically low, particularly when working with a densely populated configuration-based dataset.
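That intuition can be checked with a toy variance-reduction calculation (synthetic data and hypothetical helper names; this is not SMAC's or pyrfr's actual splitting code):

```python
import numpy as np

# A constant placeholder column offers no threshold to split on, so its
# best variance reduction is zero, while a feature correlated with the
# target yields a positive reduction.
rng = np.random.default_rng(0)
y = np.linspace(0.0, 1.0, 100)
active = y + rng.normal(0, 0.05, 100)  # correlated with the target
placeholder = np.full(100, -1.0)       # inactive HP: a single constant value

def best_variance_reduction(x, y):
    base = y.var()
    best = 0.0
    for t in np.unique(x)[:-1]:  # candidate thresholds (max excluded)
        left, right = y[x <= t], y[x > t]
        weighted = (len(left) * left.var() + len(right) * right.var()) / len(y)
        best = max(best, base - weighted)
    return best

gain_active = best_variance_reduction(active, y)
gain_placeholder = best_variance_reduction(placeholder, y)
print(gain_active, gain_placeholder)  # gain_placeholder is exactly 0.0
```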

Nonetheless, I am curious about two things:

  • Conceptually, it seems confusing not to use the same placeholder for all HP types, given that every HP is ultimately encoded numerically; is there a reason for that?
  • Whether explicitly preventing splits on inactive hyperparameters could be advantageous, both to forbid such splits outright and to improve tree-construction time. If, for example, the surrogate model's trees could be constrained to consider only splits within the valid range of active hyperparameter values, this would prevent the improbable but possible occurrence of a less informative split on placeholder (i.e., inactive HP) values. This may also be one more reason to investigate scikit-learn's trees, given your mention of possibly reimplementing the RF surrogate with scikit-learn. In your experience, have splits on inactive hyperparameters ever been observed in practice, or have the active HPs always produced higher information gain?
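For illustration, the constrained-split idea in the second bullet could look like this conceptually (a hypothetical sketch, not pyrfr's behaviour): score each candidate split only on rows where that hyperparameter is active.

```python
import numpy as np

# Rows whose value for this HP equals the placeholder are inactive and
# would be excluded from split scoring on this column.
PLACEHOLDER = -1.0

def active_mask(column):
    return column != PLACEHOLDER

X = np.array([[0.2], [0.7], [PLACEHOLDER], [0.5]])
rows_to_score = np.where(active_mask(X[:, 0]))[0]
print(rows_to_score)  # only rows 0, 1 and 3 participate in split scoring
```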

In the meantime, thank you again for your insights, and I eagerly await your response to the final question. Future readers should also find this useful for understanding and possibly improving the handling of inactive hyperparameters in SMAC. To that end, I adapted the following snippet from your unit tests to visualise the missing values imputed by SMAC's RF surrogate, so you can see roughly how it is done, although this is a very simplified example:

import numpy as np
from rich.console import Console
from rich.table import Table

from ConfigSpace import ConfigurationSpace
from ConfigSpace.conditions import EqualsCondition
from ConfigSpace.hyperparameters import CategoricalHyperparameter, UniformIntegerHyperparameter

# RandomForest surrogate from SMAC (import path as of SMAC3 v2.x)
from smac.model.random_forest.random_forest import RandomForest

def display_hyperparameter_configurations(size=10):
    def convert_configurations_to_array(configs):
        return np.array([config.get_array() for config in configs])

    # Define the configuration space
    cs = ConfigurationSpace(seed=0)

    # Algorithm hyperparameter
    algorithm = cs.add_hyperparameter(CategoricalHyperparameter("algorithm", ["decision_tree", "random_forest"]))

    # Decision Tree hyperparameters
    criterion = cs.add_hyperparameter(CategoricalHyperparameter("criterion", ["gini", "entropy"]))
    max_depth = cs.add_hyperparameter(UniformIntegerHyperparameter("max_depth", 1, 20))

    # Conditions for Decision Tree hyperparameters
    cs.add_condition(EqualsCondition(criterion, algorithm, "decision_tree"))
    cs.add_condition(EqualsCondition(max_depth, algorithm, "decision_tree"))

    # Random Forest hyperparameters
    n_estimators = cs.add_hyperparameter(UniformIntegerHyperparameter("n_estimators", 10, 200))
    max_features = cs.add_hyperparameter(CategoricalHyperparameter("max_features", ["auto", "sqrt", "log2"]))

    # Conditions for Random Forest hyperparameters
    cs.add_condition(EqualsCondition(n_estimators, algorithm, "random_forest"))
    cs.add_condition(EqualsCondition(max_features, algorithm, "random_forest"))

    # Sample configurations
    configs = cs.sample_configuration(size=size)
    config_array = convert_configurations_to_array(configs)

    model = RandomForest(configspace=cs)
    config_array = model._impute_inactive(config_array)

    hp_names = [hp.name for hp in cs.get_hyperparameters()]

    console = Console()
    table = Table(show_header=True, header_style="bold magenta")
    for name in hp_names:
        table.add_column(name)
    for config in config_array:
        table.add_row(*map(str, config))

    console.print(table)


# Call the function to display the configurations
display_hyperparameter_configurations(size=50)

Following your answer @dengdifan , consider this issue done 👍
Cheers.


simonprovost commented on May 24, 2024

Given the low priority of the remaining questions, I'll close this issue to let more important queries pass first. Please feel free to reopen if you have time, or if any reader wishes to learn more about the two most recent questions posed.

Cheers!

