breakout-rl's Introduction

Applying RL to Breakout

Applying reinforcement learning (RL) to basic tasks has been a hot topic over the last decade, especially in its second half. A common first step is to implement RL algorithms on simple games. Classic arcade game environments have attracted particular attention as a test bed for these kinds of algorithms. My aim is to implement one or more algorithms that learn to play the game of Breakout.

Model(s) under implementation:

• Asynchronous Advantage Actor-Critic (A3C)

A3C (a basic intuition and guide for running)

It is hard to get a state-of-the-art algorithm working, because any algorithm needs good hyperparameter choices, and I have to run all of these experiments on my laptop.

The A3C algorithm can essentially be described as policy gradients with a function approximator, where the function approximator is a deep neural network and the authors use a clever method to help the agent explore the state space well. I must admit I am in love with the idea. A3C runs many agents, all exploring the state space simultaneously. The hope is that the different agents will be in different parts of the state space and thus contribute uncorrelated gradient updates.
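The quantities behind the "advantage" and "actor critic" parts of the name can be sketched in plain Python. This is a minimal illustration, not the code in this repository: the function names, the discount `gamma`, and the entropy weight `beta` are assumptions chosen for the example, and the entropy bonus is the usual A3C exploration term, which may or may not match what this project implements.

```python
import math


def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """n-step returns R_t = r_t + gamma * R_{t+1}, seeded with the critic's
    value estimate for the state that follows the last reward."""
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage A_t = R_t - V(s_t): how much better the rollout turned out
    than the critic predicted."""
    return [ret - val for ret, val in zip(returns, values)]


def entropy(probs):
    """Shannon entropy of one action distribution (higher = more exploratory)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def actor_loss(action_probs, actions, advs, beta=0.01):
    """Policy-gradient loss with an entropy bonus, averaged over the rollout.
    `beta` is a hypothetical weight for the exploration term."""
    total = 0.0
    for probs, action, adv in zip(action_probs, actions, advs):
        total += -math.log(probs[action]) * adv - beta * entropy(probs)
    return total / len(actions)
```

In a real worker, each agent would compute these values on its own short rollout and push the resulting gradients to the shared network, which is where the uncorrelated updates come from.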

For a better understanding you may refer to the PDF. I drew concepts and help from many places, and linking each of them is not possible; I hope this helps you understand the algorithm.

Library Requirements

tensorflow-gpu (1.14.0), numpy, OpenCV, gym, plus threading, random, and time from the Python standard library

The TensorFlow version is not a hard requirement; you may use another version, but you will need to take care of the dependencies.

Running

First initialize the model_breakout_6.h5 and model_breakout_7.h5 files with small weights for the given network. The file model_breakout_7.h5 is updated every 50 episodes with better weights using the Entropy Policy. Uncomment lines 142 and 144 of the code to watch your agent play, learn, and get better. Line 142 contains the condition that lets you observe only one of the eight agents (workers) play.
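The worker layout described above (eight agents, a save every 50 episodes, and rendering restricted to one worker) can be sketched with the standard `threading` module. This is an illustrative skeleton only: `SharedState`, `run_worker`, `train`, and `WATCHED_WORKER` are hypothetical names, the episode itself is a placeholder, and counting the 50 episodes across all workers rather than per worker is an assumption.

```python
import threading

NUM_WORKERS = 8      # the eight agents (workers) mentioned above
SAVE_EVERY = 50      # weights are refreshed every 50 episodes
WATCHED_WORKER = 0   # hypothetical: the single worker whose game is displayed


class SharedState:
    """State shared by all worker threads, guarded by one lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.episodes = 0
        self.checkpoints = []     # episode counts at which a save would occur
        self.rendered_by = set()  # ids of workers that displayed their game


def run_worker(worker_id, shared, episodes_per_worker):
    for _ in range(episodes_per_worker):
        # Placeholder: play one episode and apply gradients to the shared network.
        with shared.lock:
            if worker_id == WATCHED_WORKER:
                # Stands in for rendering the environment of this worker only.
                shared.rendered_by.add(worker_id)
            shared.episodes += 1
            if shared.episodes % SAVE_EVERY == 0:
                # Stands in for writing the weights to model_breakout_7.h5.
                shared.checkpoints.append(shared.episodes)


def train(episodes_per_worker=25):
    shared = SharedState()
    threads = [
        threading.Thread(target=run_worker, args=(i, shared, episodes_per_worker))
        for i in range(NUM_WORKERS)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared
```

Restricting rendering to a single worker keeps the display readable and avoids slowing down the other seven threads with window updates.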

Sample

alt-text

breakout-rl's Issues

Unable to understand running procedure

"First initialize the model_breakout_6.h5 and model_breakout_7.h5 files with small weights for the network given."

Can you please mention how to do so? Would be grateful to you. If you can provide some source or provide the code to do so.
