hany606 / dqn-pytorch


This repository contains a DQN implementation using PyTorch. It was originally created to solve a task for the JetBrains Research Lab summer 2021 internship.


DQN-PyTorch

This repository was mainly created to solve a task for the JetBrains Research Lab summer 2021 internship. The task description is: "Implement DQN, policy gradient or actor-critic RL algorithm to solve Mountain-Car gym environment"

Details and Explanation

I have implemented a (simple/vanilla) Deep Q-Network (DQN) algorithm with an experience replay buffer and periodic updates of the target network inside "DQN.py".
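The two ingredients named above can be illustrated without any deep-learning library. The following is a minimal sketch, not the contents of "DQN.py": the names `ReplayBuffer`, `td_target`, and `should_sync_target`, as well as the default values for `gamma` and `sync_every`, are assumptions made for illustration only.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size FIFO store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, so a batch is not made of
        # consecutive (highly correlated) steps from one episode.
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step target: r + gamma * max_a' Q_target(s', a'), or just r at a terminal step."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def should_sync_target(step, sync_every=100):
    """True on the steps where the target network would copy the online network's weights."""
    return step > 0 and step % sync_every == 0
```

In a full agent, `next_q_values` would come from the target network evaluated on the next state, and the online network would be trained to match `td_target` on batches drawn with `sample`.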

A GIF of an agent trained with this DQN implementation:

Trained Agent

Training with the environment's original reward showed no improvement. So I changed the reward function to test different behaviors and look for improvements.

Multiple reward functions were tested to encourage the desired behaviors:

  • Move fast to the right and left -> correlated with the velocity [2nd observation]

  • Move closer to the goal -> correlated with the position [1st observation]

I noticed the following:

  • When the reward contains only the position (or the position dominates), the agent tries to climb by only going right, instead of swinging right and left

  • When the reward contains only the velocity (or the velocity dominates), the agent only moves fast right and left and does not care about the real goal (position)

For that reason, I made a new reward function:

reward = r + abs(velocity)*10 - abs(position - 0.5)

where r is the original reward (0 or -1) from the environment and 0.5 is the desired position for the car. The scalar weight (10) was tuned through experiments: if it is too large, the car is only interested in gaining velocity and does not reach the desired position; if it is too small, the car (agent) is interested in being closer to the goal but not in gaining velocity first to swing in order to reach the top.
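The formula can be written as a small pure function. This is a sketch under stated assumptions: the function name `shaped_reward` and its keyword arguments are hypothetical, and how the repository applies the reward inside its training loop is not shown here.

```python
GOAL_POSITION = 0.5      # desired position of the car, from the formula above
VELOCITY_WEIGHT = 10.0   # tuned scalar weight, from the formula above


def shaped_reward(env_reward, position, velocity,
                  velocity_weight=VELOCITY_WEIGHT, goal_position=GOAL_POSITION):
    """Environment reward plus a speed bonus minus the distance to the goal."""
    return env_reward + abs(velocity) * velocity_weight - abs(position - goal_position)


# At rest in the valley, the car is only penalised for its distance to the goal.
resting = shaped_reward(env_reward=-1.0, position=-0.5, velocity=0.0)

# Same position, but moving: the velocity term raises the reward
# regardless of the direction of motion.
moving = shaped_reward(env_reward=-1.0, position=-0.5, velocity=-0.05)
```

Because the velocity term uses the absolute value, swinging to the left is rewarded as much as swinging to the right, which matches the goal of building up momentum before climbing. Passing a different `velocity_weight` is one way to reproduce the "too large" and "too small" cases described above.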

How to use?

  • Start the training script:

        python3 learn.py

  • Run the trained agent:

        python3 run_agent.py

Plot for rewards during the training

Reward training plot for 500 epochs:

Reward plot

Reward testing plot for 100 epochs:

Reward plot


TODO (Later):

  • Implement Vanilla DQN for value-based RL algorithm

  • Write good README.md with cool gifs

  • Implement REINFORCE for Policy Gradient (Maybe in different repo)

  • Implement simple Actor-Critic algorithm (Maybe in different repo)

  • Add more plots with more experiments with different seeds

  • Perform different experiments

  • Add trained agents and videos directories

  • Add plots for different results with different algorithms

  • Use RLlib to show the difference between this implementation and the library's implementation, and provide plots

  • Create a report with references and papers
