Coder Social home page Coder Social logo

masouduut94 / mcts-agent-python Goto Github PK

View Code? Open in Web Editor NEW
58.0 3.0 8.0 708 KB

Monte Carlo Tree Search (MCTS) is a method for finding optimal decisions in a given domain by taking random samples in the decision space and building a search tree accordingly. It has already had a profound impact on Artificial Intelligence (AI) approaches for domains that can be represented as trees of sequential decisions, particularly games and planning problems. In this project I used a board game called "HEX" as a platform to test different simulation strategies in MCTS field.

License: MIT License

Python 100.00%
mcts sequential-decisions decision-space markov-decision-processes monte-carlo-tree-search game-of-hex reinforcement-learning

mcts-agent-python's Introduction

Monte-Carlo-Tree-Search-Agent-for-the-Game-of-HEX

Demo:

Demo of MCTS General Game Player

Description

This code belongs to this paper ๐Ÿ”— IMPROVING MONTE CARLO TREE SEARCH BY COMBINING RAVE AND QUALITY-BASED REWARDS ALGORITHMS.

what is Monte Carlo Tree Search(MCTS)?

MONTE Carlo Tree Search (MCTS) is a method for finding optimal decisions in a given domain by taking random samples in the decision space and building a search tree according to the results. It has already had a profound impact on Artificial Intelligence (AI) approaches for domains that can be represented as trees of sequential decisions, particularly games and planning problems. In this project I used different simulation strategies to enhance the agent policy to explore the environment.

from ๐Ÿ”— A Survey of Monte Carlo Tree Search Methods

About contribution

Before you go through the details, I recommend you to get familiar with the framework reading these medium articles:

So if you are familiar with the whole concept of MCTS and UCT algorithm, you must know that in practice it suffers from sparse rewards. it takes so much time to warm up the tree with simple UCT algorithm. So in this case we first implemented the RAVE algorithm that helps warm up tree faster. then implemented several simulation strategy like Last Good Reply, PoolRAVE, Decisive Move and also UCB1-Tuned.

Then we applied quality based rewards in Quality-based Rewards for Monte-Carlo Tree Search Simulations which basically it asserts that we can apply discounted rewards by knowing the length of simulation and the maximum number of actions allowed to take in environment for each player (In some games, the game ends after limited number of moves. because there is no more movements).

After that we used HRAVE and GRAVE in the paper Comparison of rapid action value estimation variants for general game playing 2018 - Chiara F. Sironi; Mark H. M. Winands which basically states that we can use the global information of the game to guide the simulations. We also tested the leaf threading on UCT.

all of above algorithms are addressed below.

Original repo

  • MopyHex: Authored by Kenny Young here

Contributions to the original repo:

This project has a further optimized version in here which optimized by cython.

Researches have been done in Urmia University of Technology.

Authors:

  • Masoud Masoumi Moghadam (Me ๐Ÿ˜Ž)
  • Prof: Mohammad Pourmahmood Aghababa profile
  • Prof: Jamshid Bagherzadeh profile

What is monte carlo tree search anyway?

Requirements

  • OS: Windows and Ubuntu
  • tkinter
  • Numpy

To run it:

You can ๐Ÿƒ (run) program using this command:

python main.py

Also you can run tests for comparing two mcts-based algorithms against each other using the playtest.py.

๐Ÿ“• To know more about MCTS:

This one is highly recommended:

Algorithms used for boosting MCTS in this framework:

  • Upper Confidence Bounds (UCT)
  • UCB1-Tuned
  • Rapid Action Value Estimation (RAVE)
  • Decisive Move
  • Quality Based Rewards
  • Pool RAVE
  • Last Good Reply

References

  • [1] A Survey of Monte Carlo Tree Search Methods, Cameron B. Browne et al, 2012 Link to paper
  • [2] Generalized Rapid Action Value Estimation, Tristan Cazenave, 2017 Link to paper
  • [3] Comparison of rapid action value estimation variants for general game playing, C. Sironi, 2018 Link to paper
  • [4] Quality-based Rewards for Monte-Carlo Tree Search Simulations, 2014 Link to paper
  • [5] The Last-Good-Reply Policy for Monte-Carlo Go 2009 Link to paper
  • [6] On the Huge Benefit of Decisive Moves in Monte-Carlo Tree Search Algorithms, Fabien Teytaud, Olivier Teytaud, 2010

mcts-agent-python's People

Contributors

masouduut94 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

mcts-agent-python's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.