UCBC is a multi-armed bandit algorithm that uses a variance-based upper confidence bound. For a description of the algorithm, see ucbc.pdf. This repo contains the source code for the paper.
- solvers: contains the multi-armed bandit solvers
- bandits: contains the multi-armed bandit problems
- experiments: predefined experiments
- run.py: the script for running the predefined experiments to compare UCBC with UCB1
- results: the results folder (generated automatically)
  - the .dat files can be loaded with Experiment.load() to regenerate the figures
- While run.py is running, it produces a live figure. Be aware that this can take a large amount of memory (on the order of 10 GB) when steps x episodes is large (on the order of 10 million).
- The code and the UCBC paper (ucbc.pdf) are subject to minor changes and error corrections.
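
For readers unfamiliar with the UCB1 baseline that run.py compares UCBC against, here is a minimal, self-contained sketch of the standard UCB1 rule. It is not the code in solvers and it does not implement UCBC (see ucbc.pdf for that). The function name `ucb1_run`, the Bernoulli arms, and the parameter names are assumptions made only for this illustration.

```python
import math
import random


def ucb1_run(arm_means, steps, seed=0):
    """Run standard UCB1 on Bernoulli arms (hypothetical helper, not from this repo).

    Returns the number of pulls per arm and the total collected reward.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms      # number of pulls per arm
    means = [0.0] * n_arms     # running mean reward per arm
    total_reward = 0.0

    for t in range(1, steps + 1):
        if t <= n_arms:
            # Pull every arm once so that all counts are non-zero.
            arm = t - 1
        else:
            # UCB1 index: empirical mean plus sqrt(2 ln t / n_a).
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )

        # Bernoulli reward with success probability arm_means[arm].
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0

        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        total_reward += reward

    return counts, total_reward


if __name__ == "__main__":
    counts, total = ucb1_run([0.2, 0.5, 0.8], steps=5000)
    print("pulls per arm:", counts)
    print("total reward:", total)
```

The exploration bonus in UCB1 depends only on the pull counts. A variance-based bound such as the one UCBC uses would replace that bonus term, and the exact form is given in the paper.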