This project forked from thomas-schillaci/simple

Home Page: https://arxiv.org/abs/1903.00374

License: Other

SimPLe PyTorch

PyTorch implementation of SimPLe (Simulated Policy Learning) on the Atari 100k benchmark.

Based on the paper Model-Based Reinforcement Learning for Atari.

Figure: world model predictions on Freeway. SimPLe predicts 50 frames into the future from 4 initial frames.

Installation

This program uses Python 3.7 and, if CUDA is enabled, CUDA 10.2. It was tested on Ubuntu 20.04.1.

Run the following commands to install the dependencies:

pip install torch==1.7.0 gym==0.15.7 gym[atari] opencv-python==4.4.0.42 tqdm==4.49.0 numpy==1.16.4

git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .
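
After installing, it can help to confirm that the pinned versions are the ones actually present. The following is a minimal sketch, not part of this repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and on Python 3.7 it assumes the `importlib_metadata` backport is available, since `importlib.metadata` only joined the standard library in Python 3.8.

```python
try:
    from importlib import metadata  # standard library from Python 3.8
except ImportError:  # Python 3.7: assumes the importlib_metadata backport is installed
    import importlib_metadata as metadata

# Versions pinned in the install command above.
PINNED = {
    "torch": "1.7.0",
    "gym": "0.15.7",
    "opencv-python": "4.4.0.42",
    "tqdm": "4.49.0",
    "numpy": "1.16.4",
}


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        # Drop a local build suffix such as "+cu102" before comparing.
        return metadata.version(name).split("+")[0]
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(installed, pinned=PINNED):
    """Return {package: (expected, found)} for missing or mismatched packages."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pinned.items()
        if installed.get(name) != wanted
    }


if __name__ == "__main__":
    found = {name: installed_version(name) for name in PINNED}
    for name, (wanted, got) in find_mismatches(found).items():
        print(f"{name}: expected {wanted}, found {got}")
```

An empty output means every pinned package matches. Any line printed names a package to reinstall at the pinned version.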

Install wandb (optional)

You can use wandb to track your experiments:

pip install wandb

To use wandb, pass the flag --use-wandb when running the program. See the "How to use" section for more details about flags.
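
As an illustration of how an optional flag can gate experiment tracking, here is a minimal sketch. It is not this repository's code: the `MetricLogger` class and the project name `simple-demo` are hypothetical. It relies only on `wandb.init`, `Run.log` and `Run.finish`, and imports wandb lazily so the package stays optional.

```python
class MetricLogger:
    """Keeps metrics in memory and mirrors them to wandb when enabled."""

    def __init__(self, use_wandb=False, project="simple-demo"):
        self.history = []
        self.run = None
        if use_wandb:
            # Imported only when requested, so wandb remains an optional dependency.
            import wandb

            self.run = wandb.init(project=project)

    def log(self, metrics):
        """Record one dictionary of metrics, e.g. {"reward": 21.0}."""
        self.history.append(dict(metrics))
        if self.run is not None:
            self.run.log(metrics)

    def close(self):
        """Finish the wandb run if one was started."""
        if self.run is not None:
            self.run.finish()


# Without the flag, nothing is sent anywhere and wandb is never imported.
logger = MetricLogger(use_wandb=False)
logger.log({"reward": 21.0})
logger.close()
```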

How to use

CUDA is enabled by default. See "Disable CUDA" below to turn it off.

To run the program, run the following command from the folder containing the simple package:

python -m simple

Disable CUDA

To disable CUDA, pass the flag --device cpu on the command line. See the next section for more information about flags.
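
If you are unsure whether a machine has a working CUDA setup, a small wrapper can choose the value for --device. This is a sketch under assumptions, not behaviour of this program: `resolve_device` is a hypothetical helper, and the only PyTorch call it relies on is `torch.cuda.is_available()`.

```python
def resolve_device(requested="cuda", cuda_available=False):
    """Return the requested device, or "cpu" when CUDA was asked for but is unavailable."""
    if requested.startswith("cuda") and not cuda_available:
        return "cpu"
    return requested


if __name__ == "__main__":
    try:
        import torch

        available = torch.cuda.is_available()
    except ImportError:
        # PyTorch is not installed, so no CUDA device can be used.
        available = False
    # The printed string can be passed as the value of --device.
    print(resolve_device("cuda", available))
```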

Flags

You can pass multiple flags on the command line. A summary is printed at launch time. The most useful flags are described in the following table:

| Flag | Value | Default | Description |
| --- | --- | --- | --- |
| --agents | Any positive integer | 16 | The number of parallel environments to train the PPO agent on |
| --device | Any string accepted by torch.device | cuda | Sets the PyTorch device |
| --env-name | Any game name (without the suffixes) as depicted here | Freeway | Sets the gym environment |

The following boolean flags are set to False if not passed to the command line:

| Flag | Description |
| --- | --- |
| --render-evaluation | Renders the environments during evaluation |
| --render-training | Renders the environments during training |
| --use-wandb | Enables wandb to track the experiment |

For example, to run the program without CUDA and to render the environments during training, run:

python -m simple --device cpu --render-training
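
For readers who want to see how flags with these names, defaults, and boolean semantics could be declared, here is a minimal `argparse` sketch. It mirrors the tables above but is an assumption about structure, not the program's actual parser. The helper names `positive_int` and `build_parser` are hypothetical.

```python
import argparse


def positive_int(text):
    """argparse type that accepts only integers greater than zero."""
    value = int(text)
    if value <= 0:
        raise argparse.ArgumentTypeError(f"expected a positive integer, got {text!r}")
    return value


def build_parser():
    parser = argparse.ArgumentParser(prog="simple")
    parser.add_argument("--agents", type=positive_int, default=16,
                        help="number of parallel environments for the PPO agent")
    parser.add_argument("--device", type=str, default="cuda",
                        help="any string accepted by torch.device")
    parser.add_argument("--env-name", type=str, default="Freeway",
                        help="game name without suffixes")
    # Boolean flags are False unless they appear on the command line.
    for flag in ("--render-evaluation", "--render-training", "--use-wandb"):
        parser.add_argument(flag, action="store_true")
    return parser


# Same arguments as the example command above.
args = build_parser().parse_args(["--device", "cpu", "--render-training"])

# Print a summary of every flag, similar in spirit to a launch-time summary.
for name, value in sorted(vars(args).items()):
    print(f"{name}: {value}")
```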

Per-environment performance

The scores* obtained with this implementation are detailed in the following table:

| Environment | Score | Paper's score | % of reported score in the original paper |
| --- | --- | --- | --- |
| Alien | 558 | 405.2 | 137.7% |
| Freeway | 22.1 | 20.3 | 108.9% |
| Kangaroo | 640 | 323.1 | 198.1% |
| Kangaroo (deterministic) | 466.7 | 481.9 | 96.8% |
| Krull | 3418.2 | 4539.9 | 82.6% |
| MsPacman | 681.3 | 762.8 | 89.3% |

*Each score comes from a single full training run per environment. It is the maximum average cumulative reward obtained in the real environment.
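
The two quantities in the table can be sketched as follows. This is one reading of the footnote, not code from the repository: it assumes "maximum average cumulative reward" means the highest per-evaluation mean of episode returns. The function names and the sample evaluation data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def max_average_return(evaluations):
    """Highest mean episode return over a sequence of evaluation rounds.

    `evaluations` holds one list of cumulative episode rewards per round.
    """
    return max(mean(episode_returns) for episode_returns in evaluations)


def percent_of_paper(score, paper_score):
    """Score expressed as a percentage of the paper's score, to one decimal."""
    return round(100.0 * score / paper_score, 1)


# Hypothetical evaluation rounds, for illustration only.
rounds = [[10.0, 20.0], [30.0, 20.0], [15.0, 15.0]]
best = max_average_return(rounds)

# Freeway row of the table: 22.1 against the paper's 20.3.
freeway_percent = percent_of_paper(22.1, 20.3)
```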

Contributors

thomas-schillaci
