Implementation of the Proximal Policy Optimization (PPO) reinforcement learning algorithm, using DeepMind's StarCraft II Learning Environment for the variety of mini-games it provides. Note that the Makefile targets should be used for all operations on this project, such as installing, training, and evaluating models; running the Python scripts directly will not invoke the required dependency operations.
Installation requires agreement to the terms of the BLIZZARD STARCRAFT II AI AND MACHINE LEARNING LICENSE. By typing the password 'iagreetotheeula' during the installation process, you agree to be bound by these terms.
ARCHER2:
make install_archer2
Cirrus:
make install_cirrus
Locally:
Make sure Python 3.9 is installed and callable
make install_local
Note that pip may emit a warning recommending an upgrade; this should be ignored, as the installation script deliberately downgrades pip to satisfy specific dependencies.
As the default setting, the repository provides the best model trained on DefeatZerglingsAndBanelings for evaluation. Note that visualizations are not rendered in realtime and will therefore play back quickly; this is a limitation of PySC2, which cannot render in realtime and remain deterministic. With the realtime setting enabled, results are not reproducible, so it is avoided.
(Optional) Select model in configs/evaluate_config.py using the 'CHECK_LOAD' parameter (make sure it exists in checkpoints/) and adjust the environment with 'MINIGAME_NAME' if necessary.
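As a rough illustration, the two parameters mentioned above might be set as follows ('CHECK_LOAD' and 'MINIGAME_NAME' come from the project; the checkpoint file name shown is an assumption, not the actual shipped model):

```python
# Illustrative sketch of configs/evaluate_config.py; the checkpoint file
# name below is a placeholder, not the real shipped model name.
CHECK_LOAD = "checkpoints/best_model.pt"       # model file under checkpoints/
MINIGAME_NAME = "DefeatZerglingsAndBanelings"  # mini-game used for evaluation
```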
make evaluate
(Optional) Modify 'configs/train_archer2_config.py' to adjust hyperparameters, distributed training, the policy model, pseudorandom seeds, etc.
Adjust 'scripts/ARCHER2.slurm' with your ARCHER2 account id
make train_ARCHER2
Models will be saved periodically in 'checkpoints/'
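For illustration, the kinds of settings such a training config exposes might look like the following sketch (all names and values here are assumptions, apart from the setting categories listed above; the policy names mirror the Atari-net and FullyConv implementations in src/rl/Approximator.py):

```python
# Hypothetical sketch of a training config; parameter names are illustrative.
POLICY_MODEL = "FullyConv"  # policy architecture, e.g. AtariNet or FullyConv
LEARNING_RATE = 2.5e-4      # PPO optimizer hyperparameter
CLIP_RANGE = 0.2            # PPO clipping epsilon
NUM_NODES = 2               # distributed-training setting
SEED = 0                    # pseudorandom seed for reproducibility
```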
(Optional) Modify 'configs/train_cirrus_config.py' to adjust hyperparameters, distributed/GPU training, the policy model, pseudorandom seeds, etc.
Adjust the Cirrus SLURM script in 'scripts/' with your Cirrus account id
make train_Cirrus
Models will be saved periodically in 'checkpoints/'
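In both SLURM scripts, the account id is typically set through an '#SBATCH' directive; a hedged sketch (the project id shown is a placeholder, and the exact layout of the scripts may differ):

```shell
#SBATCH --account=t01   # replace with your own ARCHER2/Cirrus project id
```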
(Optional) Modify 'configs/train_local_config.py' to adjust hyperparameters, distributed/GPU training, the policy model, pseudorandom seeds, etc.
(Optional) To configure the number of parallel agents, modify '--nproc_per_node=' in the Makefile under the 'train_local' target
make train_local
Models will be saved periodically in 'checkpoints/'
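For illustration, the relevant Makefile target might resemble the following sketch (assuming a torch.distributed-style launcher; the actual recipe may differ):

```make
# Hypothetical sketch of the train_local target; adjust --nproc_per_node
# to set the number of parallel agents.
train_local:
	python -m torch.distributed.run --nproc_per_node=4 train.py
```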
make regression_test
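The regression test compares the current implementation's behaviour against the snapshot in test/oracle/. A self-contained sketch of the core idea (the toy rollout and all names here are illustrative, not taken from test_oracle.py):

```python
import random

def rollout(seed, steps=3):
    # Stand-in for a deterministic policy rollout: with PySC2's realtime
    # rendering disabled, a fixed seed should reproduce results exactly.
    rng = random.Random(seed)
    return [rng.random() for _ in range(steps)]

# In the real framework the snapshot would be loaded from test/oracle/;
# here it is captured in-process purely for illustration.
ORACLE_SNAPSHOT = rollout(seed=42)

def test_matches_oracle():
    # Regression check: the current implementation must reproduce the
    # oracle snapshot exactly under the same seed.
    assert rollout(seed=42) == ORACLE_SNAPSHOT

test_matches_oracle()
```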
Minutes/
- Formal write-ups of meetings, containing points of discussion, updates and actions to be completed.
evaluate.py
- Evaluates a model checkpoint.
train.py
- Entry point for the training procedure.
checkpoints/
- Location for saved models and models to be evaluated.
scripts/
- SLURM job scripts for launching work on ARCHER2 and Cirrus.
configs/
- Configuration files for various Makefile target operations.
data/
- Data from experiment/SLURM runs.
src/
Config.py
- Central project configuration file; any desired settings should be modifiable here.
Misc.py
- Miscellaneous helper functions.
Parallel.py
- Provides parallel-execution wrappers around the agent policy so it can be trained on multi-core/GPU systems.
rl/
Approximator.py
- Atari-net and FullyConv agent policy implementations.
Loop.py
- Training and evaluation loop implementations, responsible for tying together all the RL components.
starcraft/
Agent.py
- Responsible for updating the agent policy by piping feedback from the training environment in the form of scalar rewards.
Environment.py
- Setup for the StarCraft II environments, configuring the mini-game type, rules and feature/action space.
test/
oracle/
- Snapshot of the project implementation.
test_oracle.py
- Regression testing framework evaluator.
Makefile
- Configuration file for 'make', containing various helper routines.