

Udacity_multiplayer

Submission for completing the Udacity Project

Implementation

The agent is DDPG (Deep Deterministic Policy Gradients) with the following upgrades.

  • Priority Replay Buffer (With Priority Tree)
  • Soft updates
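The soft update listed above can be sketched as follows. This is a minimal illustration on plain Python lists rather than the repository's own code; the function name `soft_update` and the value of `tau` are assumptions made for the example.

```python
def soft_update(target_params, local_params, tau):
    """Blend local weights into target weights.

    Each target weight moves a fraction `tau` of the way toward the
    corresponding local weight: target = tau * local + (1 - tau) * target.
    """
    return [
        tau * local + (1.0 - tau) * target
        for target, local in zip(target_params, local_params)
    ]


# Hypothetical usage: with tau = 0.1 the target moves 10% toward the local weights.
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
```

A small `tau` keeps the target network changing slowly, which is the usual motivation for soft updates in DDPG.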

The repository contains the weights of the trained agent, along with graphs showing the agent's progress and the point at which it solved the environment.

The DDPG agent solved the environment in 1450 episodes in its fastest run (average reward over the last 100 episodes > 0.5). This took 20 minutes of actual training time, and the maximum reward was 2.8.

For the following graph, I let it train until the mean reward exceeded 0.7.

Graph

There are two Environments:

Tennis

  • State space = 24
  • Action space = 2 (continuous)

In this environment, two agents control rackets to bounce a ball over a net. If an agent hits the ball over the net, it receives a reward of +0.1. If an agent lets a ball hit the ground or hits the ball out of bounds, it receives a reward of -0.01. Thus, the goal of each agent is to keep the ball in play.

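The environment is considered solved when the average score over 100 consecutive episodes is at least +0.5. In the Udacity task, the score for an episode is the maximum of the two agents' undiscounted episode returns.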
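The solve criterion can be checked with a rolling window. The sketch below assumes the per-episode score is the maximum of the two agents' returns (the Udacity Tennis convention); the function name `episodes_to_solve` and its parameters are hypothetical and are not taken from the repository.

```python
from collections import deque


def episodes_to_solve(episode_scores, target=0.5, window=100):
    """Return the 1-based episode at which the rolling mean first reaches
    `target`, or None if it never does.

    `episode_scores` is an iterable of per-episode tuples holding one
    return per agent.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` scores
    for episode, agent_scores in enumerate(episode_scores, start=1):
        # Assumption: the episode score is the better of the two agents.
        recent.append(max(agent_scores))
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None
```

The window must be full before the mean is compared, so the earliest possible answer is episode 100.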

Soccer

Agent Reward Function (dependent):

Striker:

  • +1 when the ball enters the opponent's goal.
  • -0.1 when the ball enters the team's own goal.
  • -0.001 existential penalty.

Goalie:

  • -1 when the ball enters the team's own goal.
  • +0.1 when the ball enters the opponent's goal.
  • +0.001 existential bonus.
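The reward scheme above can be written as a small function. This is only a sketch: the name `step_rewards`, the string labels for goal events, and the choice to apply the existential term on every step (including goal steps) are assumptions, not details confirmed by the environment.

```python
def step_rewards(goal_event=None):
    """Return (striker_reward, goalie_reward) for one step of one team.

    goal_event is "scored" (ball entered the opponent's goal),
    "conceded" (ball entered the team's own goal), or None.
    """
    striker = -0.001  # existential penalty: encourages scoring quickly
    goalie = 0.001    # existential bonus: rewards keeping play alive
    if goal_event == "scored":
        striker += 1.0
        goalie += 0.1
    elif goal_event == "conceded":
        striker -= 0.1
        goalie -= 1.0
    return striker, goalie
```

The asymmetry is the point of the design: the striker is paid mainly for scoring, the goalie mainly for not conceding.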

Brains: Two brains with the following observation/action space:

Vector Observation space: 112, corresponding to 14 local ray casts, each detecting 7 possible object types along with the object's distance (14 × 8 = 112). Perception covers a 180-degree view from the front of the agent.

Vector Action space: (Discrete) One branch.

  • Striker: 6 actions corresponding to forward, backward, and sideways movement, as well as rotation.
  • Goalie: 4 actions corresponding to forward, backward, and sideways movement.

Visual Observations: None.

Reset Parameters: None.

Benchmark Mean Reward (Striker & Goalie Brain): 0 (the two means are inverses of each other and criss-cross during training).


Project Layout

Agents

DDPG (working). MADDPG and PPO are in the process of being implemented.

Buffers

Vanilla ReplayBuffer, Priority Experience Replay
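A priority tree for prioritized replay is commonly built as a sum tree: leaves store priorities, and each internal node stores the sum of its children, so both updating and sampling take O(log n). The class below is a minimal sketch of that idea, not the repository's buffer; the class and method names are hypothetical, and the capacity is assumed to be a power of two.

```python
import random


class SumTree:
    """Binary sum tree over `capacity` priorities (capacity: power of two)."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Index 1 is the root; indices capacity..2*capacity-1 are the leaves.
        self.nodes = [0.0] * (2 * capacity)

    def update(self, index, priority):
        """Set the priority of leaf `index` and refresh the sums above it."""
        pos = index + self.capacity
        self.nodes[pos] = priority
        pos //= 2
        while pos >= 1:
            self.nodes[pos] = self.nodes[2 * pos] + self.nodes[2 * pos + 1]
            pos //= 2

    def total(self):
        """Sum of all priorities (stored at the root)."""
        return self.nodes[1]

    def find(self, value):
        """Return the leaf whose cumulative priority range contains `value`."""
        pos = 1
        while pos < self.capacity:
            left = 2 * pos
            if value < self.nodes[left]:
                pos = left
            else:
                value -= self.nodes[left]
                pos = left + 1
        return pos - self.capacity

    def sample(self):
        """Draw a leaf index with probability proportional to its priority."""
        return self.find(random.random() * self.total())
```

In a replay buffer, the leaf index would point at a stored transition, and `update` would be called with the new TD-error-based priority after each learning step.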

Utils

Contains the noise process for DDPG, plotting helpers, the DDPG agent configuration file, and the unity_env wrapper.
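The README does not say which noise process is used. Ornstein-Uhlenbeck noise is a common choice for DDPG exploration, so the sketch below assumes it; the class name `OUNoise` and the default values of `theta` and `sigma` are illustrative and are not taken from the repository.

```python
import random


class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, seed=None):
        self.mu = [mu] * size          # long-run mean of each dimension
        self.theta = theta             # pull strength toward the mean
        self.sigma = sigma             # scale of the random perturbation
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Reset the internal state to the mean (typically once per episode)."""
        self.state = list(self.mu)

    def sample(self):
        """Advance the process one step and return the new state."""
        self.state = [
            x + self.theta * (m - x) + self.sigma * self.rng.gauss(0.0, 1.0)
            for x, m in zip(self.state, self.mu)
        ]
        return list(self.state)
```

The sampled vector would be added to the actor's output before the action is clipped to the valid range.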

DDPG Agent weights

DDPG/model_weights/actor
DDPG/model_weights/critic

Installation

Clone the repository.

git clone git@github.com:MorGriffiths/Udacity_Navigation.git
cd Udacity_Navigation

Install Anaconda.

Create the conda environment from the conda_requirements.txt file.

conda create --name Tennis --file conda_requirements.txt

Activate the environment. Depending on which version of Anaconda you have, use

conda activate Tennis

or

source activate Tennis

Install Unity ml-agents.

git clone https://github.com/Unity-Technologies/ml-agents.git
git -C ml-agents checkout 0.4.0b
pip install ml-agents/python/.

Install the project requirements.

pip install -r requirements.txt

Download the Tennis Environment which matches your operating system

Download the Soccer Unity Environment

Place the environment into the Environments folder. If necessary, inside main.py, change the path to the unity environment appropriately
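A small check like the following can catch a wrong environment path before the Unity process is launched. It is a sketch only: the helper name `resolve_env_path` and the example file name are hypothetical, and main.py may set the path differently.

```python
import os


def resolve_env_path(env_dir, filename):
    """Join the folder and file name, failing early if the file is missing."""
    path = os.path.join(env_dir, filename)
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"Unity environment not found at {path!r}; "
            "check the path set in main.py"
        )
    return path


# Hypothetical usage (the file name depends on your operating system):
# env_path = resolve_env_path("Environments", "Tennis.app")
```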

Run the project

Make sure the environment path is correctly set in main.py and run

cd DDPG
python main.py

Further details

See Tennis_report.md along with the performance graph and the weights.
