
huntermcgushion / hyperparameter_hunter


Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries

License: MIT License

Python 99.28% Makefile 0.05% Shell 0.68%
artificial-intelligence machine-learning hyperparameter-optimization hyperparameter-tuning neural-network keras scikit-learn xgboost lightgbm catboost deep-learning data-science python rgf sklearn optimization experimentation feature-engineering ai ml

hyperparameter_hunter's Issues

Keras learning rate recorded incorrectly when decay/scheduling callbacks used

  • See `recorders.DescriptionRecorder.format_result`
  • The model's get_config() returns the final learning rate rather than the initial one, so experiment description files are misleading: they record a value training never actually started with
  • Leads to failed similar experiment matches
  • Experiment started with Adam at lr=0.001, and ReduceLROnPlateau, which dropped the lr down to 0.0001
    • 0.0001 was recorded as the Experiment's lr, but it should be 0.001
  • Probably need to call parameterize_compiled_keras_model immediately after initializing it, then store the results, then use them in the DescriptionRecorder
    • Midway through experiments.BaseCVExperiment.cv_run_workflow, or in models.KerasModel.initialize_model/fit
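The fix above amounts to snapshotting hyperparameters before `fit` can mutate them. A minimal stdlib-only sketch of that idea (the dict stands in for a Keras optimizer config; `snapshot_initial_params` is a hypothetical helper, not project API):

```python
import copy

def snapshot_initial_params(model_config):
    """Deep-copy the compiled model's parameters immediately after
    initialization, so callbacks like ReduceLROnPlateau cannot mutate
    the values later written to the description file."""
    return copy.deepcopy(model_config)

# Captured right after model initialization...
config = {"optimizer": "Adam", "lr": 0.001}
recorded = snapshot_initial_params(config)

# ...then a scheduling callback lowers the live learning rate mid-training
config["lr"] = 0.0001

# The DescriptionRecorder should use the snapshot, not the mutated config
assert recorded["lr"] == 0.001
```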

Remove Keras dependence in `key_handler`

  • Remove dependence on keras.callbacks.Callback
  • Only usage in key_handler.KeyMaker.handle_complex_types.visit function
  • Probably need to wire in import hooks, since Keras actually should be used here if a Keras model_initializer is given
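One way to avoid the hard dependency while keeping Keras usable when present is a lazy availability check. A sketch, assuming the fallback base class is acceptable for `key_handler`'s isinstance checks (names other than `keras.callbacks.Callback` are illustrative):

```python
import importlib.util

def keras_available():
    # True only if Keras can be imported; does not actually import it
    return importlib.util.find_spec("keras") is not None

def get_callback_base():
    """Return keras.callbacks.Callback when Keras is installed, else a
    harmless stand-in, so key_handler never requires Keras eagerly."""
    if keras_available():
        from keras.callbacks import Callback
        return Callback
    return object  # fallback base when Keras is absent
```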

Leaderboard conflict with aliased, and non-aliased metrics

  • Resolve issue noted in leaderboards.GlobalLeaderboard.add_entry, where aliased metrics names should be merged together based on their equivalent hashes
  • Currently, using SKLearn's roc_auc_score, then using the same function under an alias (like 'roc') would produce two separate columns: 'roc_auc_score', and 'roc'
    • This is despite the fact that the two metrics are, in fact, the same thing
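Since equivalent metrics already hash identically, the merge can key columns by that hash and map every alias to one canonical name. A stdlib sketch (hashing the function's code object stands in for the project's real hashing scheme):

```python
import hashlib

def metric_hash(fn):
    # Stand-in identity: aliases of the same function share a code object
    return hashlib.sha256(fn.__code__.co_code).hexdigest()

def canonical_columns(columns):
    """Map each metric alias to the first-seen name with the same hash,
    so 'roc' and 'roc_auc_score' collapse into one leaderboard column."""
    seen, mapping = {}, {}
    for name, fn in columns.items():
        h = metric_hash(fn)
        mapping[name] = seen.setdefault(h, name)
    return mapping

def roc_auc_score(y_true, y_pred):  # stand-in for sklearn.metrics.roc_auc_score
    return 0.5

columns = {"roc_auc_score": roc_auc_score, "roc": roc_auc_score}
assert canonical_columns(columns) == {"roc_auc_score": "roc_auc_score", "roc": "roc_auc_score"}
```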

Finish `leaderboards` documentation

  • Add documentation description for leaderboards.Leaderboard.__init__
  • Add documentation for leaderboards.GlobalLeaderboard.add_entry (See Leaderboard implementation)

UninformedOptimizationProtocols need `current_hyperparameters_list`

  • Add current_hyperparameters_list equivalent to optimization_core.UninformedOptimizationProtocol
  • See usages in optimization_core.InformedOptimizationProtocol for proper implementation
  • Only used by optimization_core.BaseOptimizationProtocol for logging in the _optimization_loop method (in which pertinent flag comments are located)
  • This bug breaks all children of UninformedOptimizationProtocol, so it should be treated as high priority

`n_random_starts` broken in `optimization_core.SKOptimizationProtocol.__init__`

  • Make optimization_core.SKOptimizationProtocol.__init__.n_random_starts actually do something when specified
  • The kwarg is currently ignored if a sufficient number of experiment results have already been read in
    • This makes the SKOptimizationProtocol think the requirement has already been satisfied
  • Random starts are only actually executed when n_random_starts-many result files cannot be located
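The buggy and intended behaviors differ only in whether saved results count toward the requirement. A toy sketch of the distinction (function name is illustrative):

```python
def plan_random_starts(n_random_starts, n_matching_saved):
    """Compare current (buggy) and proposed handling of n_random_starts.

    Buggy: previously saved results count toward the requirement, so
    random starts are silently skipped once enough result files exist.
    Proposed: always execute the requested random starts, letting saved
    results seed the optimizer separately."""
    buggy = max(0, n_random_starts - n_matching_saved)
    fixed = n_random_starts
    return buggy, fixed

# With 25 saved results, the current code performs zero random starts
assert plan_random_starts(10, 25) == (0, 10)
```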

Hide internally-used `experiments.BaseExperiment` methods

  • Hide the following methods of experiments.BaseExperiment that generally shouldn't be used by class instances:
    • additional_preparation_steps
    • initial_preprocessing
    • validate_parameters
    • validate_environment
    • clean_up
    • generate_experiment_id
    • generate_hyperparameter_key
    • create_script_backup
    • initialize_random_seeds
    • random_seed_initializer
    • update_model_params

Add default hyperparameter search ranges

  • Declare default hyperparameter ranges/selections for certain libraries/algorithms in files named for each library in the hyperparameter_hunter/library_helpers directory
  • These should be used by optimization_core.BaseOptimizationProtocol.add_default_options when completed by #31
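A per-library registry file could be as simple as a nested dict of dimension specs. A hypothetical layout for `hyperparameter_hunter/library_helpers/` (the keys, ranges, and `default_range` helper are illustrative, not project API):

```python
# Hypothetical contents of a library_helpers module; tuples encode
# (dimension type, low, high) in the style of skopt dimensions
DEFAULT_RANGES = {
    "xgboost": {
        "max_depth": ("Integer", 2, 10),
        "learning_rate": ("Real", 0.0001, 0.5),
        "subsample": ("Real", 0.5, 1.0),
    },
    "lightgbm": {
        "num_leaves": ("Integer", 8, 256),
        "learning_rate": ("Real", 0.0001, 0.5),
    },
}

def default_range(library, hyperparameter):
    """Look up the default search range for a library's hyperparameter,
    or None if no default has been declared."""
    return DEFAULT_RANGES.get(library, {}).get(hyperparameter)

assert default_range("xgboost", "max_depth") == ("Integer", 2, 10)
assert default_range("catboost", "depth") is None
```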

Perform Keras layer interception in project's `__init__.py`

  • Perform call to importer.hook_keras_layer near top of __init__.py
  • Currently called before any other imports
    • See examples.keras_example.py for current usage - This will need to be removed
  • Verify hook_keras_layer does not raise any exceptions if Keras has not been installed
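The no-exception requirement can be met by checking for Keras before installing the hook. A sketch of the guard (the function body is a stand-in for the real `importer.hook_keras_layer`):

```python
import importlib.util

def hook_keras_layer():
    """Stand-in for importer.hook_keras_layer: install the layer
    interception hook, silently no-op'ing when Keras is not installed
    instead of raising ImportError."""
    if importlib.util.find_spec("keras") is None:
        return False  # Keras absent; nothing to hook
    # ...install the actual import hook here...
    return True

# Called near the top of hyperparameter_hunter/__init__.py, before any
# module that might import Keras
hooked = hook_keras_layer()
```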

Documentation for `models.KerasModel`

  • Finish documentation for the following methods of models.KerasModel:
    • __init__ (the initialization_params and extra_params kwargs)
    • initialize_model
    • fit
    • get_input_dim
    • validate_keras_params
    • initialize_keras_neural_network

`tell` optimizer positive/negative utility values depending on `target_metric`

  • Update the following methods of optimization_core.InformedOptimizationProtocol:
    • _execute_experiment
    • _find_similar_experiments
  • The two aforementioned methods are the two locations at which optimization_core.InformedOptimizationProtocol.optimizer is "tell-ed" the utility value of a set of hyperparameters
  • Currently, a negative utility value is provided to optimizer, which will cause problems if target_metric should be minimized
    • This is the case when target_metric is some loss measure
  • Need to add a means of specifying that positive utility values should be used, instead of negative, or of detecting that target_metric measures loss
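Since skopt-style optimizers minimize, the sign flip belongs in one place keyed on the metric's direction. A minimal sketch (the `direction` parameter is a proposed addition, not existing API):

```python
def utility_for_optimizer(score, direction="max"):
    """Convert an experiment's score into the value passed to
    optimizer.tell(). Minimizing optimizers need metrics-to-maximize
    negated, while losses-to-minimize pass through unchanged."""
    if direction == "max":
        return -score
    if direction == "min":
        return score
    raise ValueError(f"direction must be 'max' or 'min', got {direction!r}")

assert utility_for_optimizer(0.95, "max") == -0.95  # accuracy-like metric
assert utility_for_optimizer(0.31, "min") == 0.31   # loss-like metric
```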

Separate input/target data for `environment.Environment.__init__`

  • Add ability to provide separate input/target DataFrames for following environment.Environment.__init__ kwargs: train_dataset, holdout_dataset, and test_dataset
  • Also accept NumPy arrays, rather than requiring DataFrames
  • Alternative to providing the whole DataFrame, containing a target column
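Internally this could normalize both input styles to one (inputs, targets) pair. A sketch with a dict-of-columns standing in for a DataFrame (the helper name and signature are hypothetical):

```python
def normalize_dataset(dataset, target_column=None):
    """Accept either a single dataset containing the target column, or an
    (inputs, targets) pair, and return a uniform (inputs, targets) tuple."""
    if isinstance(dataset, tuple) and len(dataset) == 2:
        return dataset  # already split into (inputs, targets)
    if target_column is None:
        raise ValueError("target_column required when passing a single dataset")
    inputs = {k: v for k, v in dataset.items() if k != target_column}
    return inputs, dataset[target_column]

# Whole-dataset style: target column extracted by name
full = {"a": [1, 2], "b": [3, 4], "y": [0, 1]}
assert normalize_dataset(full, target_column="y") == ({"a": [1, 2], "b": [3, 4]}, [0, 1])
# Pre-split style: passed through unchanged
assert normalize_dataset(({"a": [1]}, [0])) == ({"a": [1]}, [0])
```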

`models.XGBoostModel.fit` `eval_set` behavior

  • Remove the default inclusion of eval_set in models.XGBoostModel.fit per todo comment
  • This results in unexpectedly long execution times
  • models.XGBoostModel.fit has been commented out, meaning models.Model.fit is being used
  • The updated version of models.XGBoostModel.fit should still accommodate eval_set and eval_metric arguments
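The updated fit could forward `eval_set`/`eval_metric` only when explicitly requested. A sketch of that kwarg-building step (the `use_eval_set` flag and helper name are hypothetical, not XGBoost or project API):

```python
def build_fit_kwargs(extra_params, validation_data=None):
    """Assemble kwargs for XGBoost's fit: only include eval_set/eval_metric
    when the user asked for them, instead of injecting eval_set by default
    and paying the per-round evaluation cost."""
    fit_kwargs = {}
    if "eval_metric" in extra_params:
        fit_kwargs["eval_metric"] = extra_params["eval_metric"]
    if extra_params.get("use_eval_set") and validation_data is not None:
        fit_kwargs["eval_set"] = [validation_data]
    return fit_kwargs

assert build_fit_kwargs({}) == {}  # default: no eval_set, no slowdown
assert build_fit_kwargs({"use_eval_set": True}, validation_data=("Xv", "yv")) == {"eval_set": [("Xv", "yv")]}
```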

Documentation for `optimization_core.BaseOptimizationProtocol`

  • Add documentation for following optimization_core.BaseOptimizationProtocol methods:
    • _optimization_loop
    • _update_current_hyperparameters
    • _set_hyperparameter_space
    • _get_current_hyperparameters
    • search_space_size (See InformedOptimizationProtocol implementation)

Implement `optimization_core.BaseOptimizationProtocol.add_default_options`

  • Complete the optimization_core.BaseOptimizationProtocol.add_default_options method
  • This will need to play nice with the BaseOptimizationProtocol.hyperparameter_space attribute
    • Likely requires updating space.Space to reflect new default options being added to original dimensions (if InformedOptimizationProtocol)
  • The implemented add_default_options should leverage the default hyperparameter search ranges added in #30 for the hyperparameter provided as input and optimization_core.BaseOptimizationProtocol.model_initializer
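For categorical dimensions, the merge step could simply extend the user's options with the library defaults. A toy sketch of that piece (function name is illustrative; the real method must also update space.Space):

```python
def merge_default_options(dimension_options, defaults):
    """Extend a categorical dimension's options with library defaults,
    preserving the user's ordering and dropping duplicates."""
    merged = list(dimension_options)
    for option in defaults:
        if option not in merged:
            merged.append(option)
    return merged

assert merge_default_options(["relu"], ["relu", "tanh", "sigmoid"]) == ["relu", "tanh", "sigmoid"]
```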

Clean up `optimization_utils.AskingOptimizer.__init__`

  • Problem: skopt.optimizer.Optimizer.__init__ is copied almost verbatim by optimization_utils.AskingOptimizer.__init__, which is far from ideal
    • This is copied in order to make AskingOptimizer use hyperparameter_hunter.space.Space, rather than skopt.space.Space
  • Need way to tell skopt.optimizer.Optimizer.__init__ to use updated Space, or need to override the particular section of skopt.optimizer.Optimizer.__init__, in which skopt.space.Space is used
  • In its current state, any changes to skopt.optimizer.Optimizer.__init__ will be completely lost, and will need to be manually recreated
  • Solution still needs to accommodate __repeated_ask_kwargs, as noted in the pertinent todo comments and the original optimization_utils.AskingOptimizer.__init__, which is commented out above the current monstrosity
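If the space construction inside `skopt.optimizer.Optimizer.__init__` could be factored into an overridable step, the verbatim copy would disappear. A toy sketch of that pattern (`BaseOptimizer` and `_build_space` are stand-ins; skopt's real Optimizer has no such hook today, which is the point of the issue):

```python
class BaseOptimizer:
    """Stand-in for skopt.optimizer.Optimizer, with space construction
    factored out so subclasses can swap the Space class."""
    def __init__(self, dimensions):
        self.space = self._build_space(dimensions)

    def _build_space(self, dimensions):
        return ("skopt.space.Space", dimensions)

class AskingOptimizer(BaseOptimizer):
    # Override only the space-construction hook; everything else in
    # __init__ is inherited, so upstream changes are never lost
    def _build_space(self, dimensions):
        return ("hyperparameter_hunter.space.Space", dimensions)

opt = AskingOptimizer([(0, 10)])
assert opt.space[0] == "hyperparameter_hunter.space.Space"
```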

Finish `experiments.BaseExperiment.__init__` documentation

  • Add documentation for the target_metric kwarg of experiments.BaseExperiment.__init__
  • Label the following experiments.BaseExperiment.__init__ kwargs as experimental while in development: preprocessing_pipeline, preprocessing_params

Documentation for `reporting.ReportingHandler`

  • Add documentation for the following reporting.ReportingHandler methods:
    • validate_parameters
    • configure_reporting_type
    • initialize_logging_logging
    • configure_console_logger_handler
    • configure_heartbeat_logger_handler
    • _logging_log
    • _logging_debug
    • _logging_warn

Keras dependence in `models`

  • Remove Keras dependence in models, unless keras.models.load_model required by models.KerasModel.fit
    • This will only be the case if models.KerasModel is actually in use
  • May need to use Keras import hooks from importer inside hyperparameter_hunter.__init__
