
Allentune

Hyperparameter Search for AllenNLP

Citation

If you use this repository for your research, please cite:

@inproceedings{showyourwork,
 author = {Jesse Dodge and Suchin Gururangan and Dallas Card and Roy Schwartz and Noah A. Smith},
 title = {Show Your Work: Improved Reporting of Experimental Results},
 year = {2019},
 booktitle = {Proceedings of EMNLP},
}

Run distributed, parallel hyperparameter search on GPUs or CPUs. See the associated paper, Show Your Work: Improved Reporting of Experimental Results (EMNLP 2019).

This library was inspired by https://github.com/ChristophAlt/tuna; thanks to its author for their work!

To get started,

  1. First, install AllenNLP, pinned to the commit this library expects:

    pip install git+https://github.com/allenai/allennlp@27ebcf6ba3e02afe341a5e62cb1a7d5c6906c0c9

    Then clone the allentune repository, cd into its root folder, and run pip install --editable .

  2. Then, make sure all tests pass:

    pytest -v .

Now you can test your installation by running allentune -h.

What does Allentune support?

This library supports random and grid search via Ray Tune. Support for more complex search schedulers (e.g., Hyperband, Median Stopping Rule, Population-Based Training) is on the roadmap.

How does it work?

Allentune works by combining a search_space with an AllenNLP training config. The search_space specifies a sampling strategy and bounds for each hyperparameter. For each sampled assignment, Allentune sets the hyperparameter values as environment variables and kicks off a training job. Jobs are queued and executed on a GPU/CPU as one becomes available. You can specify which GPUs/CPUs Allentune may use, and how many, when running the search.

Set up the base training config

See examples/classifier.jsonnet for an example of a CNN-based classifier on the IMDB dataset. Crucially, the AllenNLP training config reads each hyperparameter value with the standard jsonnet construct std.extVar("HYPERPARAMETER_NAME"), which lets jsonnet instantiate the value from an environment variable.
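For illustration, here is a minimal, hypothetical fragment in this style. The model type and hyperparameter names are placeholders rather than the repository's actual example, and depending on your jsonnet and AllenNLP versions, numeric values may need explicit parsing (e.g., std.parseInt) instead of arriving as strings:

    // Hypothetical config fragment: each value is read from an environment
    // variable that Allentune sets before launching the trial.
    {
      "model": {
        "type": "my_classifier",  // placeholder model type
        "embedding_dim": std.parseInt(std.extVar("EMBEDDING_DIM")),
        "dropout": std.extVar("DROPOUT")
      },
      "trainer": {
        "num_epochs": std.parseInt(std.extVar("NUM_EPOCHS")),
        "optimizer": {
          "type": "adam",
          "lr": std.extVar("LEARNING_RATE")
        }
      }
    }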

Set up the search space

See examples/search_space.json as an example of search bounds applied to each hyperparameter of the CNN classifier.

There are a few sampling strategies currently supported:

  1. choice: choose an element from a specified set.
  2. integer: choose a random integer within the specified bounds.
  3. uniform: choose a random float using the uniform distribution within the specified bounds.
  4. loguniform: choose a random float using the loguniform distribution within the specified bounds.

If you want to fix a particular hyperparameter, just set it as a constant in the search space file (see NUM_EPOCHS in the sketch below).
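As a rough sketch of what such a file can look like (the field names below are illustrative; treat examples/search_space.json in the repository as the authoritative schema), each top-level key names an environment variable that the training config reads:

    {
      "LEARNING_RATE": {
        "sampling strategy": "loguniform",
        "bounds": [1e-5, 1e-1]
      },
      "DROPOUT": {
        "sampling strategy": "uniform",
        "bounds": [0.0, 0.5]
      },
      "EMBEDDING_DIM": {
        "sampling strategy": "choice",
        "choices": [64, 128, 256]
      },
      "NUM_EPOCHS": 50
    }

Here NUM_EPOCHS is a plain constant, so every trial trains for the same number of epochs, while the other three hyperparameters are sampled anew for each trial.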

Run Hyperparameter Search

Example command for 30 samples of random search with a CNN classifier, on 4 GPUs:

allentune search \
    --experiment-name classifier_search \
    --num-cpus 56 \
    --num-gpus 4 \
    --cpus-per-trial 1 \
    --gpus-per-trial 1 \
    --search-space ./examples/search_space.json \
    --num-samples 30 \
    --base-config ./examples/classifier.jsonnet

To restrict which GPUs the search runs on, prefix the above command with CUDA_VISIBLE_DEVICES=<device ids> (e.g., CUDA_VISIBLE_DEVICES=0,1).

When using allentune with your own AllenNLP modules, pass the --include-package <your_package> flag, just as you would with the allennlp command.

The search command writes all experiment results to the directory specified by --logdir; the default output directory is $(pwd)/logs/.

Search output

By default, allentune logs all search trials to a logs/ directory under your current working directory. Each trial gets its own subdirectory.

Generate a report from the search

To check progress on your search, or to inspect results once your search has completed, run allentune report.

This command generates a dataset of hyperparameter assignments and the resulting training metrics for further analysis:

allentune report \
    --log-dir ./logs/classifier_search/ \
    --performance-metric best_validation_accuracy \
    --model cnn

This command will create a file results.jsonl in logs/classifier_search. Each line contains the hyperparameter assignment and resulting training metrics from one experiment of your search.
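For instance, a single line of results.jsonl might look roughly like the following; the values are made up and the field names are illustrative rather than the exact output schema:

    {"LEARNING_RATE": 0.00132, "DROPOUT": 0.24, "EMBEDDING_DIM": 128, "best_validation_accuracy": 0.891, "model": "cnn"}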

allentune report will also tell you the currently best-performing model and the path to its serialization directory.

Plot expected performance

Finally, you can also plot expected performance as a function of the number of hyperparameter assignments or of training duration. For more information on how this plot is generated, see the associated paper, Show Your Work: Improved Reporting of Experimental Results.

allentune plot \
    --data-name IMDB \
    --subplot 1 1 \
    --figsize 10 10 \
    --result-file ./logs/classifier_search/results.jsonl \
    --output-file ./classifier_performance.pdf \
    --performance-metric-field best_validation_accuracy \
    --performance-metric accuracy

Sample more hyperparameter assignments until this curve converges on an expected validation performance!
