Coder Social home page Coder Social logo

ixanezis / othello-zero Goto Github PK

View Code? Open in Web Editor NEW

This project forked from 2bear/othello-zero

0.0 0.0 0.0 3.99 MB

An implementation of the AlphaGo Zero and the AlphaZero algorithm for othello playing.

License: MIT License

Python 88.16% Jupyter Notebook 8.16% Dockerfile 2.30% Shell 1.37%

othello-zero's Introduction

othello-zero

About

othello-zero is an implementation of the AlphaGo Zero and the AlphaZero algorithm for othello playing. The main idea comes from these papers by DeepMind:

I also referred some great posts:

Thanks

icodeface: Resolved a issue of compiling Edax, and sponsor this project generously.

Environment

  • python 3.5 64-bit
  • tensorflow or tensorflow-gpu 1.11

Usage

You can use -h option for more help.

python othello.py -h

Learning loop

A learning loop comes in two parts: self_play and train. self_play generates newest games by using the latest neural network model. train generates the next generation of model by solely learning newest games.

python othello.py --learning-loop

I'm using a sigle PC packing a 1080Ti GPU, a 4-Core i7 CPU and 32G memory to run this project. Here are my config in config.py:

# learning config
self_play_wokers_num = 8
self_play_woker_gpu_memory_fraction = 0.04
self_play_batch_size = 128
self_play_echo_max = 2
train_batch_size = 128
train_echo_max = 50
learning_loop_echo_max = 6

On my computer, two self_play echos cost about a hours to generate 2048(2×8×128) games. These wokers occupy ~17.6G(8×2.2G) memory. Then train costs ~12 minutes to take ~50K steps. Whole loop costs about 72 minutes to produce new version of othello-zero.

Play with Edax

Edax is one of the strongest othello programs in the world, written by abulmo. It was trained with about 800K games, both various in the opening and of high quality in the endgame.

Paweł Liskowski wrote a paper called Learning to Play Othello with Deep Neural Networks. He trained several CNN architectures to challenge Edax. This is a good reference on how powerful Edax is.

Edax is not a part of this project. You need go HERE, click Clone or download button, download the source and build it. Don't download release version 4.4. It dosen't work.

cd edax/project/path
mkdir -p bin

cd src
make build ARCH=x64 COMP=your-compiler OS=your-os

Then go HERE, download eval.7z, unzip to the directory where the Edax executable is. Next, copy them all to edax folder in this project.

├── ...
├── log
│   ├── ...
│   └── ...
├── edax
│   ├── data
│   │   └── eval.dat
│   └── Edax
├── othello.py
├── tree.py
├── ...

OK

python othello.py --play-with-edax

Play with human

You can play with othello-zero on the command line.

python othello.py --play-with-human

Then

=================
  A B C D E F G H
1 ┌─┬─┬─┬─┬─┬─┬─┐
2 ├─┼─┼─┼─┼─┼─┼─┤
3 ├─┼─┼─┼─┼─┼─┼─┤
4 ├─┼─┼─○─●─×─┼─┤
5 ├─┼─┼─●─●─●─┼─┤
6 ├─┼─┼─×─┼─×─┼─┤
7 ├─┼─┼─┼─┼─┼─┼─┤
8 └─┴─┴─┴─┴─┴─┴─┘

●black plays F5.
it's ○white turn.
>

Just type your move, such as F4, and press Enter.

Milestone

Date othello-zero version Achievement
2018-11-28 V1 -> V6 None
2018-11-29 V7 -> V12 V10 defeated me :-)
2018-11-30 V13 -> V18 V14 defeated Edax Lv.1
2018-12-01 V19 -> V24 None
2018-12-02 V25 -> V30 V26 defeated Edax Lv.2
2018-12-03 V31 -> V36 Abandon
2018-12-04 V31 -> V36 None
2018-12-05 V37 -> V42 None
2018-12-06 V43 -> V48 V43 defeated Edax Lv.3
2018-12-07 V49 -> V54 None
2018-12-08 V55 -> V57 None
2018-12-09 V58 -> V63 None
2018-12-10 V64 -> V69 V67 defeated Edax Lv.4
2018-12-11 V70 -> V78 None
2018-12-12 V79 -> V87 None
2018-12-13 V88 -> V96 None
2018-12-14 V97 -> V105 None
2018-12-15 V106 -> V114 None
2018-12-16 V115 -> V117 Suspended

2018-12-03: train_echo_max was changed from 50 to 100, but I'm not satisfied with the result, so then undid this change the next day.

2018-12-08: self_play_echo_max was changed from 2 to 4.

2018-12-09: train_echo_max was changed from 50 to 100.

2018-12-16: othello-zero is close to Edax Lv.5, but hard to defeat it. I suspend this project.

Data

On the releases page you can download:

  • All key checkpoints, such as V1, V10, V20 and so on.

  • Whole loss-log file includes policy-loss and value-loss.

Comparison

AlphaGo Zero AlphaZero othello-zero
ResNet blocks 20 or 40 20 10 
History features 8 8 4
Dirichlet noise α 0.03 Chess 0.3
Shogi 0.15
Go 0.03
0.5
Learning rate -->400K 10^-2
-->600K 10^-3
600K--> 10^-4
-->400K 10^-2
-->600K 10^-3
600K--> 10^-4
10^-2
Transform the board position during MCTS Yes No No
Virtual loss Yes Yes No
Model that generates new games The best The latest The latest
The replacement of old games for training Smooth Smooth Complete

Presented by doBell. We are so proud of this.
来自 doBell 团队,我们为此倍感荣光。

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.