Tensorflow-RL

Join the chat at https://gitter.im/tensorflow-rl/Lobby

TensorFlow-based implementations of A3C, PGQ, TRPO, DQN+CTS, and CEM, originally based on the A3C implementation from https://github.com/traai/async-deep-rl. I extensively refactored most of the code and, beyond the new algorithms, added several options, including the a3c-lstm architecture, a fully-connected architecture to allow training on non-image-based gym environments, and support for continuous action spaces.

The code also includes some experimental ideas I'm toying with and I'm planning on adding the following implementations in the near future:

*currently in progress

Notes

  • You can find a number of my evaluations for the A3C, TRPO, and DQN+CTS algorithms at https://gym.openai.com/users/steveKapturowski. I'm doing a lot of refactoring at the moment, so it's possible I could break things. Please open an issue if you discover any bugs.
  • I'm in the process of swapping out most of the multiprocessing code in favour of distributed TensorFlow, which should simplify a lot of the training code and allow actor-learner processes to be distributed across multiple machines.
  • There's also an implementation of the A3C+ model from "Unifying Count-Based Exploration and Intrinsic Motivation", but I've been focusing on improvements to the DQN variant, so this hasn't gotten much love.
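
For readers unfamiliar with the count-based exploration idea behind A3C+ and DQN+CTS, here is a minimal sketch of the pseudo-count and exploration bonus described in the "Unifying Count-Based Exploration and Intrinsic Motivation" paper. It is not taken from this repository: the function names are hypothetical, and beta=0.05 is only an illustrative default. In practice the two densities would come from a density model such as CTS, queried before and after it is updated on the observed frame.

```python
import math


def pseudo_count(rho, rho_prime):
    """Pseudo-count N = rho * (1 - rho') / (rho' - rho).

    rho       -- density the model assigns to a state before observing it
    rho_prime -- density the model assigns to the same state after one more
                 observation of it (the "recoding probability")
    """
    if rho_prime <= rho:
        # The formula only makes sense when the density increases.
        raise ValueError("rho_prime must be strictly greater than rho")
    return rho * (1.0 - rho_prime) / (rho_prime - rho)


def exploration_bonus(n_hat, beta=0.05):
    """Intrinsic reward beta / sqrt(N + 0.01); beta is an assumed scale."""
    return beta / math.sqrt(n_hat + 0.01)


# Sanity check with a plain empirical-count model: a state seen 3 times in
# 10 observations has density 3/10 before and 4/11 after one more visit,
# so the pseudo-count should recover the true count of 3.
n_hat = pseudo_count(3.0 / 10.0, 4.0 / 11.0)
bonus = exploration_bonus(n_hat)
```

The bonus shrinks as a state's pseudo-count grows, so rarely seen states receive a larger intrinsic reward than familiar ones.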

Running the code

First you'll need to install the Cython extensions needed for the hog updates and CTS density model:

./setup.py install build_ext --inplace

To train an a3c agent on Pong run:

python main.py Pong-v0 --alg_type a3c -n 8

To evaluate a trained agent simply add the --test flag:

python main.py Pong-v0 --alg_type a3c -n 1 --test --restore_checkpoint

DQN+CTS after 80M agent steps using 16 actor-learner threads

Montezuma's Revenge

A3C run on Pong-v0 with default parameters and frameskip sampled uniformly over 3-4
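
As a rough illustration of what "frameskip sampled uniformly over 3-4" means, the sketch below repeats one action for a randomly chosen number of frames and sums the rewards. It assumes the classic gym step signature returning (observation, reward, done, info). The helper name and the stand-in environment are hypothetical, and this is not necessarily how the repository or gym implements the behaviour.

```python
import random


def step_with_frameskip(env, action, rng, min_skip=3, max_skip=4):
    """Repeat `action` for k frames, with k uniform on [min_skip, max_skip]."""
    k = rng.randint(min_skip, max_skip)  # both endpoints are inclusive
    total_reward = 0.0
    obs, done, info = None, False, {}
    frames_used = 0
    for _ in range(k):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        frames_used += 1
        if done:
            # Stop early if the episode ends in the middle of a skip.
            break
    return obs, total_reward, done, info, frames_used


class CountingEnv(object):
    """Hypothetical stand-in env: reward 1 per frame, never terminates."""

    def __init__(self):
        self.frames = 0

    def step(self, action):
        self.frames += 1
        return self.frames, 1.0, False, {}


env = CountingEnv()
rng = random.Random(0)
obs, reward, done, info, frames_used = step_with_frameskip(env, 0, rng)
```

Each agent step then covers either 3 or 4 emulator frames, which adds some stochasticity to an otherwise deterministic game.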

Requirements

  • python 2.7
  • tensorflow 1.2
  • scikit-image
  • Cython
  • pyaml
  • gym
