This project forked from mlbazaar/btb

A simple, extensible library for developing AutoML systems

Home Page: https://hdi-project.github.io/BTB/

License: MIT License

btb's Introduction

An open source project from the Data to AI Lab at MIT.

A simple, extensible backend for developing auto-tuning systems.

Overview

BTB ("Bayesian Tuning and Bandits") is a simple, extensible backend for developing auto-tuning systems such as AutoML systems. It provides an easy-to-use interface for tuning models and selecting between models.

It is currently being used in several AutoML systems:

Try it out now!

If you want to quickly discover BTB, simply click the button below and follow the tutorials!

Binder

Install

Requirements

BTB has been developed and tested on Python 3.5, 3.6 and 3.7.

Although it is not strictly required, using a virtualenv is highly recommended to avoid interfering with other software installed on the system where BTB is run.

Install with pip

The easiest and recommended way to install BTB is using pip:

pip install baytune

This will pull and install the latest stable release from PyPI.

If you want to install from source or contribute to the project please read the Contributing Guide.

Quickstart

In this short tutorial we will guide you through the necessary steps to get started using BTB to select between models and tune a model to solve a Machine Learning problem.

In particular, in this example we will use BTBSession to solve the Wine classification problem by selecting between the DecisionTreeClassifier and the SGDClassifier models from scikit-learn while also searching for their best hyperparameter configuration.

Prepare a scoring function

The first step in using the BTBSession class is to write a scoring function.

This is a Python function that, given a model name and a hyperparameter configuration, evaluates the performance of the model on your data and returns a score.

from sklearn.datasets import load_wine
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


dataset = load_wine()
models = {
    'DTC': DecisionTreeClassifier,
    'SGDC': SGDClassifier,
}

def scoring_function(model_name, hyperparameter_values):
    model_class = models[model_name]
    model_instance = model_class(**hyperparameter_values)
    scores = cross_val_score(
        estimator=model_instance,
        X=dataset.data,
        y=dataset.target,
        scoring=make_scorer(f1_score, average='macro')
    )
    return scores.mean()

Define the tunable hyperparameters

The second step is to define the hyperparameters that we want to tune for each model as Tunables.

from btb.tuning import Tunable
from btb.tuning import hyperparams as hp

tunables = {
    'DTC': Tunable({
        'max_depth': hp.IntHyperParam(min=3, max=200),
        'min_samples_split': hp.FloatHyperParam(min=0.01, max=1)
    }),
    'SGDC': Tunable({
        'max_iter': hp.IntHyperParam(min=1, max=5000, default=1000),
        'tol': hp.FloatHyperParam(min=1e-3, max=1, default=1e-3),
    })
}
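
To make the shape of a hyperparameter configuration concrete, the sketch below draws one random configuration from the same ranges as the 'DTC' entry, using only the standard library. It does not use BTB: `SEARCH_SPACE` and `sample_config` are hypothetical names introduced for illustration, and BTB's tuners may propose values with different strategies than this uniform draw.

```python
import random

# Hypothetical, BTB-free description of the 'DTC' ranges defined above:
# name -> (kind, minimum, maximum)
SEARCH_SPACE = {
    'max_depth': ('int', 3, 200),
    'min_samples_split': ('float', 0.01, 1.0),
}


def sample_config(space, rng):
    """Draw one configuration uniformly at random from the given ranges."""
    config = {}
    for name, (kind, low, high) in space.items():
        if kind == 'int':
            # randint includes both endpoints
            config[name] = rng.randint(low, high)
        else:
            config[name] = rng.uniform(low, high)
    return config


rng = random.Random(0)
config = sample_config(SEARCH_SPACE, rng)
print(config)
```

The resulting dictionary has the same form as the `hyperparameter_values` argument that the scoring function receives.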

Start the searching process

Once you have defined a scoring function and the tunable hyperparameters specification of your models, you can start searching for the best model and hyperparameter configuration by using the btb.BTBSession.

All you need to do is create an instance, passing the tunable hyperparameters specification and the scoring function.

from btb import BTBSession

session = BTBSession(
    tunables=tunables,
    scorer=scoring_function
)

And then call the run method, indicating how many tuning iterations you want the BTBSession to perform:

best_proposal = session.run(20)

The result will be a dictionary indicating the name of the best model that could be found and the hyperparameter configuration that was used:

{
    'id': '826aedc2eff31635444e8104f0f3da43',
    'name': 'DTC',
    'config': {
        'max_depth': 21,
        'min_samples_split': 0.044010284821858835
    },
    'score': 0.907229308339589
}
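
A proposal of this form can be turned back into a trained model. The following is a minimal sketch that assumes only scikit-learn: it copies the example proposal shown above, rebuilds the estimator from its `name` and `config`, and checks it on a held-out split. The `build_model` helper and the 75/25 split are illustrative choices, not part of BTB.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

models = {
    'DTC': DecisionTreeClassifier,
    'SGDC': SGDClassifier,
}

# Example proposal copied from the output shown above
best_proposal = {
    'id': '826aedc2eff31635444e8104f0f3da43',
    'name': 'DTC',
    'config': {
        'max_depth': 21,
        'min_samples_split': 0.044010284821858835
    },
    'score': 0.907229308339589
}


def build_model(proposal):
    """Instantiate the model named in the proposal with its configuration."""
    model_class = models[proposal['name']]
    return model_class(**proposal['config'])


dataset = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data,
    dataset.target,
    test_size=0.25,
    random_state=0,
    stratify=dataset.target,
)

final_model = build_model(best_proposal)
final_model.fit(X_train, y_train)

# Mean accuracy on the held-out split; this differs from the
# cross-validated macro F1 stored in the proposal's 'score'
holdout_accuracy = final_model.score(X_test, y_test)
print(holdout_accuracy)
```

Evaluating on data that the search never saw gives a less optimistic estimate than the score reported during tuning.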

How does BTB perform?

We have a comprehensive benchmarking framework that we use to evaluate the performance of our Tuners. For every release, we benchmark against hundreds of challenges, comparing tuners against each other in terms of number of wins. The leaderboard for the latest release is presented below:

Number of Wins on latest Version

tuner               with ties   without ties
Ax.optimize         237         39
BTB.GPEiTuner       233         19
BTB.GPTuner         235         25
BTB.UniformTuner    197         2
HyperOpt.tpe        206         11
SMAC.HB4AC          192         1
SMAC.SMAC4HPO_EI    241         36
SMAC.SMAC4HPO_LCB   222         17
SMAC.SMAC4HPO_PI    241         37
  • Detailed results from which this summary emerged are available here.
  • If you want to compare your own tuner, follow the steps in our benchmarking framework here.
  • If you have a proposal for a tuner that we should include in our benchmarking, get in touch with us at [email protected].
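
The two columns of the leaderboard can be read as follows, under an assumption that this README does not spell out: a tuner scores a win "with ties" whenever it reaches the best score on a challenge, even if others match it, and a win "without ties" only when it is the sole best. The sketch below counts wins that way on made-up scores. `count_wins`, the challenge names and the tuner names are hypothetical, and this is not the benchmarking framework's code.

```python
def count_wins(scores_by_challenge):
    """Count wins per tuner, assuming higher scores are better.

    scores_by_challenge maps challenge name -> {tuner name: score}.
    Returns two dicts: wins counting ties, and wins as the sole best tuner.
    """
    with_ties = {}
    without_ties = {}
    for scores in scores_by_challenge.values():
        for tuner in scores:
            with_ties.setdefault(tuner, 0)
            without_ties.setdefault(tuner, 0)

        best = max(scores.values())
        winners = [tuner for tuner, score in scores.items() if score == best]

        for tuner in winners:
            with_ties[tuner] += 1
        if len(winners) == 1:
            without_ties[winners[0]] += 1

    return with_ties, without_ties


# Made-up scores for illustration only
example_scores = {
    'challenge_a': {'tuner_x': 0.90, 'tuner_y': 0.90, 'tuner_z': 0.85},
    'challenge_b': {'tuner_x': 0.70, 'tuner_y': 0.75, 'tuner_z': 0.60},
    'challenge_c': {'tuner_x': 0.80, 'tuner_y': 0.80, 'tuner_z': 0.80},
}

with_ties, without_ties = count_wins(example_scores)
print(with_ties)     # {'tuner_x': 2, 'tuner_y': 3, 'tuner_z': 1}
print(without_ties)  # {'tuner_x': 0, 'tuner_y': 1, 'tuner_z': 0}
```

Under this reading, a large gap between the two columns means that many challenges end with several tuners sharing the best score.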

More tutorials

  1. To just tune hyperparameters - see our tuning tutorial here and documentation here.
  2. To see the types of hyperparameters we support see our documentation here.
  3. You can read about our benchmarking framework here.
  4. See our tutorial on selection here and documentation here.

For more details about BTB and all its possibilities and features, please check the project documentation site!

Also do not forget to have a look at the notebook tutorials.

Citing BTB

If you use BTB, please consider citing the following paper:

@article{smith2019mlbazaar,
  author = {Smith, Micah J. and Sala, Carles and Kanter, James Max and Veeramachaneni, Kalyan},
  title = {The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development},
  journal = {arXiv e-prints},
  year = {2019},
  eid = {arXiv:1905.08942},
  pages = {arXiv:1905.08942},
  archivePrefix = {arXiv},
  eprint = {1905.08942},
}
