sc2_ppo_distributed's Introduction

PPO on StarCraft II LE

An implementation of the Proximal Policy Optimization (PPO) reinforcement learning algorithm that uses DeepMind's StarCraft II Learning Environment and the variety of mini-games it provides. Use the Makefile targets for all operations in this project, such as installing, training and evaluating models. Running the Python scripts directly skips the dependency steps that the Makefile targets perform.
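
For readers new to PPO, the core of the algorithm is its clipped surrogate objective. The following is a minimal sketch of that objective in plain Python. It is not taken from this repository: the function name, the argument names and the default clip range of 0.2 are assumptions chosen for illustration.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. Names are illustrative only.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Taking the minimum of the two terms removes the incentive to move the policy far from the one that collected the data, which is what makes several gradient steps per batch workable.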

Linux installation:

The installation requires agreement to the terms of the BLIZZARD STARCRAFT II AI AND MACHINE LEARNING LICENSE. By typing in the password 'iagreetotheeula' during the installation process, you agree to be bound by these terms.

ARCHER2:

  • make install_archer2

Cirrus:

  • make install_cirrus

Locally:

  • Make sure Python 3.9 is installed and callable
  • make install_local
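
Before running the local install target, it can help to confirm which interpreter is being picked up. The snippet below is a small sketch that assumes the requirement means exactly Python 3.9; the helper name is hypothetical and not part of this project.

```python
import sys

# Major and minor version named in the installation notes (assumed to be exact).
REQUIRED = (3, 9)


def is_required_python(version_info=None):
    """Return True if the given (or running) interpreter is Python 3.9."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) == REQUIRED


if __name__ == "__main__":
    found = "{}.{}".format(*sys.version_info[:2])
    status = "OK" if is_required_python() else "differs from 3.9"
    print("Python", found, status)
```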

Note that pip may print a warning recommending an update. Ignore it, as the installation script downgrades pip to satisfy specific dependencies.

Evaluating a model locally:

By default, the repository provides the best trained model on DefeatZerglingsAndBanelings for evaluation. Note that the visualizations are not rendered in real time and will therefore play back fast. This is due to a limitation of PySC2, which cannot render in real time and remain deterministic. Results are not reproducible with the realtime setting, so it is avoided.

  1. (Optional) Select model in configs/evaluate_config.py using the 'CHECK_LOAD' parameter (make sure it exists in checkpoints/) and adjust the environment with 'MINIGAME_NAME' if necessary.
  2. make evaluate
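
Step 1 asks you to make sure the selected model exists in checkpoints/. A quick check along these lines can catch a mistyped name before launching the evaluation. This is a sketch only: the example values are placeholders, and it assumes 'CHECK_LOAD' holds a file name relative to checkpoints/, which this README does not state.

```python
from pathlib import Path

# Placeholder values; the real settings live in configs/evaluate_config.py.
MINIGAME_NAME = "DefeatZerglingsAndBanelings"
CHECK_LOAD = "example_checkpoint"  # hypothetical name, not a file shipped with the repo


def checkpoint_exists(check_load, checkpoint_dir="checkpoints"):
    """Return True if the named checkpoint file is present in checkpoint_dir."""
    return (Path(checkpoint_dir) / check_load).is_file()


if __name__ == "__main__":
    if not checkpoint_exists(CHECK_LOAD):
        print("Checkpoint not found in checkpoints/:", CHECK_LOAD)
```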

Training on ARCHER2:

  1. (Optional) Modify 'configs/train_archer2_config.py' to adjust hyperparameters, distributed training, policy model, pseudorandom seeds etc.
  2. Adjust 'scripts/ARCHER2.slurm' with your ARCHER2 account id
  3. make train_ARCHER2
  4. Models will be periodically saved in 'checkpoints/'

Training on Cirrus:

  1. (Optional) Modify 'configs/train_cirrus_config.py' to adjust hyperparameters, distributed/GPU training, policy model, pseudorandom seeds etc.
  2. Adjust the Cirrus SLURM job script in 'scripts/' with your Cirrus account id
  3. make train_Cirrus
  4. Models will be periodically saved in 'checkpoints/'

Training locally (not recommended):

  1. (Optional) Modify 'configs/train_local_config.py' to adjust hyperparameters, distributed/GPU training, policy model, pseudorandom seeds etc.
  2. (Optional) To configure the number of parallel agents, modify '--nproc_per_node=' in the Makefile under the 'train_local' target
  3. make train_local
  4. Models will be periodically saved in 'checkpoints/'
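
The '--nproc_per_node=' flag mentioned in step 2 is the option PyTorch's distributed launcher (torchrun) uses to set how many worker processes start on a node. Assuming the 'train_local' target launches training that way, which this README does not state explicitly, each worker can discover its place in the group from environment variables the launcher exports. A minimal sketch, with a hypothetical helper name:

```python
import os


def distributed_context(environ=None):
    """Read the rank variables that torchrun exports to each worker process.

    Falls back to single-process defaults when the variables are absent,
    for example when a script is started without the launcher.
    """
    env = os.environ if environ is None else environ
    return {
        "rank": int(env.get("RANK", 0)),              # global index of this worker
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # index within this node
        "world_size": int(env.get("WORLD_SIZE", 1)),  # total number of workers
    }


if __name__ == "__main__":
    print(distributed_context())
```

With '--nproc_per_node=4' on a single node, the launcher starts four workers and sets WORLD_SIZE to 4 for each of them.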

Running regression tests:

  • make regression_test
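
The regression tests compare the current code against a stored snapshot of the project (test/oracle/). How test_oracle.py performs that comparison is not described here. The sketch below shows one common approach, an element-wise comparison of numeric outputs within a tolerance. The function name and the tolerances are assumptions.

```python
import math


def matches_oracle(current, oracle, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if two sequences of floats agree element-wise within tolerance."""
    if len(current) != len(oracle):
        return False
    return all(
        math.isclose(c, o, rel_tol=rel_tol, abs_tol=abs_tol)
        for c, o in zip(current, oracle)
    )
```

A comparison like this is only meaningful when both runs use the same pseudorandom seeds and deterministic (non-realtime) environment settings.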

Directories overview:

  • Minutes/
    • Formal write-ups of meetings, containing points of discussion, updates and actions to be completed.
  • evaluate.py
    • Evaluates a model checkpoint.
  • train.py
    • Entry point for the training procedure.
  • checkpoints/
    • Location of saved models and of models to be evaluated.
  • scripts/
    • SLURM job scripts for launching work on ARCHER2 and Cirrus.
  • configs/
    • Configuration files for various Makefile target operations.
  • data/
    • Data from experiment/SLURM runs.
  • src/
    • Config.py
      • Central file for project configuration; it should allow any desired setting to be modified.
    • Misc.py
      • Miscellaneous helper functions.
    • Parallel.py
      • Provides parallel wrappers for the agent policy so that it can be trained on multi-core/GPU systems.
    • rl/
      • Approximator.py
        • Atari-net and FullyConv agent policy implementations.
      • Loop
        • Training and evaluation loop implementations, responsible for tying together all the RL components.
    • starcraft/
      • Agent.py
        • Responsible for updating the agent policy by piping feedback from the training environment in the form of scalar rewards.
      • Environment.py
        • Setup for the StarCraft II environments, configuring the mini-game type, rules and feature/action space.
  • test/
    • oracle/
      • Snapshot of the project implementation.
    • test_oracle.py
      • Regression testing framework evaluator.
  • Makefile
    • Configuration file for make containing various helper routines.
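
The overview notes that Agent.py updates the policy from scalar rewards returned by the environment. Policy-gradient methods such as PPO typically turn those per-step rewards into discounted returns (or advantage estimates) before the update. The following is a generic sketch of that step, not code from src/; whether the project uses plain returns or an estimator such as GAE is not stated, and the discount factor of 0.99 is an assumed default.

```python
def discounted_returns(rewards, dones, gamma=0.99):
    """Compute the discounted return for each step, resetting at episode ends.

    rewards: scalar reward received at each step.
    dones:   True where a step ended its episode.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        if dones[t]:
            running = 0.0  # do not carry value across an episode boundary
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```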

Contributors:

  • kuzywoozy