
This project aims to use reinforcement learning algorithms to play the game 2048.

2048-Gym

Agent playing

This repository uses DQN (Deep Q-Network, a Q-learning algorithm) to play the game 2048 and accelerates the environment using Numba. The algorithm comes from Stable Baselines, and the environment is a custom OpenAI Gym environment. The environment supports two board representations: binary and non-binary (the --no-binary option). The binary representation uses a power-of-two matrix to represent each tile of the board. In contrast, the non-binary representation uses the raw board matrix.
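
To make the difference between the two representations concrete, here is a minimal sketch of one way a raw board could be turned into a binary (one-hot, power-of-two) encoding. The function name `encode_binary`, the 16-channel depth, and the use of channel 0 for empty cells are assumptions for illustration; the environment in this repository may lay out its observation differently. Nested lists are used only to keep the snippet dependency-free.

```python
def encode_binary(board, channels=16):
    """Return a rows x cols x channels one-hot encoding of a raw 2048 board.

    Assumed layout (hypothetical): channel 0 marks an empty cell and
    channel k marks a tile whose value is 2**k.
    """
    encoded = []
    for row in board:
        encoded_row = []
        for value in row:
            one_hot = [0] * channels
            # For an exact power of two, bit_length() - 1 equals log2(value).
            index = value.bit_length() - 1 if value > 0 else 0
            one_hot[index] = 1
            encoded_row.append(one_hot)
        encoded.append(encoded_row)
    return encoded


# Raw (non-binary) board: each cell holds the tile value, 0 means empty.
raw_board = [
    [0, 2, 4, 8],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 2048],
]
binary_board = encode_binary(raw_board)
```

With this layout every cell becomes a vector containing a single 1, so a network sees each tile value as its own category rather than as a magnitude.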

The model uses two different types of neural networks: CNN (Convolutional Neural Network) and MLP (Multi-Layer Perceptron). The agent performed better with a CNN as the feature extractor than with an MLP, probably because a CNN can extract spatial features. As a result, the agent achieved a 2048 tile in 10% of the 1000 games played.

Optuna

Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative, define-by-run style user API. Thanks to this define-by-run API, code written with Optuna is highly modular, and users can dynamically construct the search spaces for the hyperparameters.

There is a guide on how to use this library here.

Numba

Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code.
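
As a rough illustration of the kind of tight loop that Numba can compile, the sketch below slides and merges a single board row to the left. The function name `merge_row_left` and its return values are hypothetical and not taken from this repository; the fallback decorator is there only so the snippet still runs as plain Python when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # JIT-compile the function when Numba is available
except ImportError:
    def njit(func):  # fallback: leave the function as plain Python
        return func


@njit
def merge_row_left(row):
    """Slide the tiles of one row to the left, merging equal neighbours once."""
    out = np.zeros_like(row)
    write = 0          # next free slot in `out`
    can_merge = False  # whether out[write - 1] may still absorb a tile
    score = 0          # sum of the values created by merges
    for value in row:
        if value == 0:
            continue  # skip empty cells
        if can_merge and out[write - 1] == value:
            out[write - 1] = 2 * value
            score += 2 * value
            can_merge = False  # a merged tile cannot merge again this move
        else:
            out[write] = value
            write += 1
            can_merge = True
    return out, score


merged, gained = merge_row_left(np.array([2, 2, 4, 4], dtype=np.int64))
```

The first call includes compilation time when Numba is present; later calls with the same argument types reuse the compiled machine code.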

There is a guide on how to use this library here.

Installation

Install the dependencies with pip install -r [requirements_cpu.txt|requirements-gpu.txt], choosing the appropriate file depending on whether you wish to run the models on a CPU or a GPU.

OR

Using a conda environment

conda env create -f [conda_env_gpu.yml|conda_env_cpu.yml]

To install the environment, execute the following commands:

git clone https://github.com/FelipeMarcelino/2048-Gym/
cd 2048-Gym/gym-game2048/
pip install -e .

Running

usage: model_optimize.py [-h] --agent AGENT
                         [--tensorboard-log TENSORBOARD_LOG]
                         [--study-name STUDY_NAME] [--trials TRIALS]
                         [--n-timesteps N_STEPS] [--save-freq SAVE_FREQ]
                         [--save-dir SAVE_DIR] [--log-interval LOG_INTERVAL]
                         [--no-binary] [--seed SEED]
                         [--eval-episodes EVAL_EPISODES]
                         [--extractor EXTRACTOR] [--layer-normalization]
                         [--num-cpus NUM_CPUS] [--layers LAYERS [LAYERS ...]]
                         [--penalty PENALTY] [--load_path LOAD_PATH]
                         [--num_timesteps_log NUM_TIMESTEPS_LOG]

optional arguments:
  -h, --help            show this help message and exit
  --agent AGENT, -ag AGENT
                        Algorithm to use to train the model - DQN, ACER, PPO2
  --tensorboard-log TENSORBOARD_LOG, -tl TENSORBOARD_LOG
                        Tensorboard log directory
  --study-name STUDY_NAME, -sn STUDY_NAME
                        The name of study used for optuna to create the
                        database.
  --trials TRIALS, -tr TRIALS
                        The number of trials tested for optuna optimize. - 0
                        is the default setting and try until the script is
                        finish
  --n-timesteps N_STEPS, -nt N_STEPS
                        Number of timestems the model going to run.
  --save-freq SAVE_FREQ, -sf SAVE_FREQ
                        The interval between model saves.
  --save-dir SAVE_DIR, -sd SAVE_DIR
                        Save dictory models
  --log-interval LOG_INTERVAL, -li LOG_INTERVAL
                        Log interval
  --no-binary, -bi      Do not use binary observation space
  --seed SEED           Seed
  --eval-episodes EVAL_EPISODES, -ee EVAL_EPISODES
                        The number of episodes to test after training the
                        model
  --extractor EXTRACTOR, -ex EXTRACTOR
                        The extractor used to create the features from
                        observation space - (mlp or cnn)
  --layer-normalization, -ln
                        Use layer normalization - Only for DQN
  --num-cpus NUM_CPUS, -nc NUM_CPUS
                        Number of cpus to use. DQN only accept 1
  --layers LAYERS [LAYERS ...], -l LAYERS [LAYERS ...]
                        List of neurons to use in DQN algorithm. The number of
                        elements inside list going to be the number of layers.
  --penalty PENALTY, -pe PENALTY
                        How much penalize the model when choose a invalid
                        action
  --load_path LOAD_PATH, -lp LOAD_PATH
                        Load model from
  --num_timesteps_log NUM_TIMESTEPS_LOG, -ntl NUM_TIMESTEPS_LOG
                        Continuing timesteps for tensorboard_log
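
For illustration, the sketch below assembles one possible invocation from the options listed above. The helper name `build_train_command` and every value it passes (number of trials, timesteps, save directory) are hypothetical examples, and running the script from the repository root is an assumption; only the flag names come from the usage text.

```python
import shlex
import sys


def build_train_command(agent="DQN", extractor="cnn", trials=10,
                        n_timesteps=100000, save_dir="models"):
    """Assemble an argv list for model_optimize.py from flags in the usage text."""
    return [
        sys.executable, "model_optimize.py",
        "--agent", agent,                    # DQN, ACER or PPO2
        "--extractor", extractor,            # mlp or cnn
        "--trials", str(trials),             # number of Optuna trials
        "--n-timesteps", str(n_timesteps),   # training timesteps
        "--num-cpus", "1",                   # the help text says DQN only accepts 1
        "--save-dir", save_dir,              # directory for saved models
    ]


command = build_train_command()
# Print the command without the interpreter path so it can be copied to a shell.
print(shlex.join(command[1:]))
# To launch it from Python: subprocess.run(command, check=True)
```

Building the argument list in Python avoids shell-quoting problems and makes it easy to loop over agents or extractors when comparing runs.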

Playing

Play the game using a trained agent.

python play_game.py 

Note: You need to change the model path and agent inside play_game.py.

Visualization

See the best model's actions using Tkinter.

python show_played_game.py

Note: You need to change the pickle game data inside show_played_game.py.
