Side project doing a TensorFlow implementation of A3C.
Started as part of a reproduction of Deep Reinforcement Learning from Human Preferences, but currently on hold in favour of OpenAI's A2C implementation.
In progress; ugly code.
- 19/07/2017: Implemented A3C training operations (see the sketch after this list)
- 28/07/2017: Got Distributed TensorFlow working (see the followup, Distributed TensorFlow: A Gentle Introduction)
- 08/08/2017: Implemented all preprocessing stages
- 18/08/2017: Functioning with a single worker
- 20/08/2017: Functioning with multiple workers
- 23/08/2017: Implemented action entropy bonus
- 30/08/2017: Implemented visualisation of value network output, for a sanity check
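For reference, here's a minimal sketch of what the training operations and entropy bonus might look like; the tensor names, loss coefficients and function signature are illustrative assumptions, not the actual code in this repo.

```python
import tensorflow as tf

def a3c_loss(logits, values, actions, returns, entropy_coef=0.01):
    # Advantage: n-step return minus the value estimate
    # (the value loss is handled separately, so stop gradients here)
    advantages = returns - tf.stop_gradient(values)

    # Policy gradient loss: -log pi(a|s) * advantage
    neg_log_prob = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=actions)
    policy_loss = tf.reduce_mean(neg_log_prob * advantages)

    # Value loss: squared error against the n-step return
    value_loss = tf.reduce_mean(tf.square(returns - values))

    # Entropy bonus: reward high-entropy (exploratory) policies
    probs = tf.nn.softmax(logits)
    log_probs = tf.nn.log_softmax(logits)
    entropy = tf.reduce_mean(-tf.reduce_sum(probs * log_probs, axis=1))

    return policy_loss + 0.5 * value_loss - entropy_coef * entropy
```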
`run.py` starts a single worker. One worker should be started with a `worker_n` of 0; this worker holds the graph. `run.sh` is a wrapper which starts 16 workers.
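As a rough illustration of the worker setup (an assumed sketch, not the actual contents of `run.py`): each worker joins a Distributed TensorFlow cluster, and the shared network variables are pinned to worker 0, which is why that worker has to be running.

```python
import argparse
import tensorflow as tf

parser = argparse.ArgumentParser()
parser.add_argument('--worker_n', type=int, required=True)
args = parser.parse_args()

# 16 workers on consecutive local ports (ports are illustrative)
cluster = tf.train.ClusterSpec({
    'worker': ['localhost:%d' % (2200 + i) for i in range(16)]
})
server = tf.train.Server(cluster, job_name='worker', task_index=args.worker_n)

# Shared (global) parameters live on worker 0
with tf.device('/job:worker/task:0'):
    global_step = tf.train.get_or_create_global_step()
    # ... global network variables go here ...

sess = tf.Session(server.target)
```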
- Memory usage is higher than it seems like it should be.
- Currently gradients are accumulated over an entire episode rather than only 5 time steps as in the paper (see the sketch after this list). With 5 time steps, it doesn't work.
- Based on a cursory comparison, OpenAI's A2C implementation seems to run faster.
- Currently Adam is used, whereas the paper uses RMSProp. If RMSProp is used instead of Adam, it doesn't work.
- Shared optimiser statistics aren't currently implemented.
- It doesn't seem to learn as fast as the results quoted in the paper. With 16 workers, it takes about 6 hours to reach full reward (git c693e72).
- The paper, on the other hand, reaches maximum reward within about 2 hours.
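For comparison, here's roughly what the paper's 5-step update loop looks like, assuming a Gym-style environment; `choose_action`, `estimate_value` and `train` are hypothetical helpers, not functions from this repository.

```python
T_MAX = 5     # update interval from the paper
GAMMA = 0.99  # discount factor

def run_update(sess, env, network, obs, done):
    # Collect at most T_MAX steps of experience
    states, actions, rewards = [], [], []
    for _ in range(T_MAX):
        action = network.choose_action(sess, obs)
        next_obs, reward, done, _ = env.step(action)
        states.append(obs)
        actions.append(action)
        rewards.append(reward)
        obs = next_obs
        if done:
            obs = env.reset()
            break

    # Bootstrap from the value of the last state unless the episode ended
    R = 0.0 if done else network.estimate_value(sess, obs)
    returns = []
    for reward in reversed(rewards):
        R = reward + GAMMA * R
        returns.insert(0, R)

    # One gradient step on this small batch, applied to the shared parameters
    network.train(sess, states, actions, returns)
    return obs, done
```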