pytorch-dqn's Introduction

pytorch-dqn

This project is a PyTorch implementation of "Human-level control through deep reinforcement learning". I also plan to implement the following ones:

Credit

This project reuses most of the code in https://github.com/berkeleydeeprlcourse/homework/tree/master/hw3

Requirements

  • Python 3.5
  • gym (built from source)
  • PyTorch (built from source)

Usage

To train a model:

$ python main.py

To train the model using RAM rather than raw images (helpful for testing):

$ python ram.py

The model is defined in dqn_model.py

The algorithm is defined in dqn_learn.py

The running script and hyper-parameters are defined in main.py

pytorch-dqn's Issues

Any updates?

Are there any plans to add recent SOTA algorithms?

ImportError: No module named 'openai_benchmark'

Environment: Ubuntu 16.04, Python 3.5, Anaconda.

I installed gym following the instructions at https://github.com/openai/gym, but when I run your code I get:

Traceback (most recent call last):
  File "/home/op/pytorch-dqn-master/main.py", line 3, in <module>
    import openai_benchmark
ImportError: No module named 'openai_benchmark'

Searching online, I found that when someone else runs dir(gym), the result contains 'benchmark_spec' and 'benchmarks', but neither is present on my machine.
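One way to narrow this down is to check which of the modules mentioned above can actually be located by the interpreter that runs main.py. The following is a minimal diagnostic sketch using only the standard library. The helper name module_available is hypothetical, and the module names in the loop are taken from this issue; the sketch does not say where openai_benchmark is supposed to come from, it only reports what is importable.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located on the current import path.

    Hypothetical helper for diagnosis only; it is not part of this repository.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # ImportError: a parent package (e.g. `gym`) is itself missing.
        # ValueError: the module is loaded but has no usable spec.
        return False


if __name__ == "__main__":
    # Names taken from the issue text above.
    for name in ("gym", "gym.benchmarks", "openai_benchmark"):
        status = "found" if module_available(name) else "missing"
        print(f"{name}: {status}")
```

Running this with the same interpreter used for main.py (for example the Anaconda one) shows whether the problem is a missing package or a different Python environment.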

pytorch 0.2

I'm running PyTorch 0.2, and dqn_learn.py fails to work.

The error is as follows:

Traceback (most recent call last):
  File "ram.py", line 57, in <module>
    main(env)
  File "ram.py", line 46, in main
    target_update_freq=TARGER_UPDATE_FREQ,
  File "/auto/master05/ssarcandy/ttt/dqn_learn.py", line 213, in dqn_learing
    current_Q_values.backward(d_error.data.unsqueeze(1))
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/variable.py", line 156, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/function.py", line 91, in apply
    return self._forward_cls.backward(self, *args)
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/_functions/tensor.py", line 566, in backward
    return grad_input.scatter_add_(ctx.dim, index, grad_output), None, None
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/variable.py", line 696, in scatter_add_
    return ScatterAdd.apply(self, dim, index, source, True)
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/_functions/tensor.py", line 605, in forward
    return input.scatter_add_(ctx.dim, index, source)
RuntimeError: invalid argument 3: Index tensor must have same dimensions as input tensor at /pytorch/torch/lib/THC/generic/THCTensorScatterGather.cu:198

Unmatching size and error

Hi, thanks for sharing your wonderful code.
However, I ran into some errors when running it.

  1. In lines 197~205 of dqn_learn.py, the sizes of target_Q_values and current_Q_values do not match. I added next_max_q = next_max_q.unsqueeze(-1) to correct the sizes. I also changed line 203 to use rew_batch[0].

  2. (IMO) After records are stacked in the replay buffer, the action is not queued properly. I changed line 158 to action = select_epilson_greedy_action(Q, recent_observations, t), but a different action value still gets queued.

I am still working on these but having trouble. Could you help fix them?
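The size mismatch described in point 1 can be illustrated in isolation. The sketch below is not the code from dqn_learn.py (those lines are not shown here), and it assumes a current PyTorch release rather than 0.2. The function name td_error and the arguments q_all, next_q_all, act_batch, done_mask and gamma are hypothetical; next_max_q and rew_batch follow the names used in the issue.

```python
import torch


def td_error(q_all, next_q_all, act_batch, rew_batch, done_mask, gamma=0.99):
    """Compute target - current Q with both sides shaped (N, 1).

    Illustrative only: argument names and layout are assumptions.
    """
    # Q(s, a) for the actions actually taken: (N, A) -> (N, 1)
    current_q = q_all.gather(1, act_batch.unsqueeze(1))
    # max over next-state actions: (N, A) -> (N,)
    next_max_q = next_q_all.max(dim=1)[0]
    # Bellman target; terminal transitions (done_mask == 1) do not bootstrap: (N,)
    target_q = rew_batch + gamma * (1.0 - done_mask) * next_max_q
    # Add a trailing dimension so the target matches current_q: (N,) -> (N, 1)
    target_q = target_q.unsqueeze(-1)
    assert target_q.shape == current_q.shape
    return target_q - current_q


if __name__ == "__main__":
    q_all = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    next_q_all = torch.tensor([[0.0, 1.0, 0.0], [2.0, 0.0, 0.0]])
    act_batch = torch.tensor([2, 0])
    rew_batch = torch.tensor([1.0, 1.0])
    done_mask = torch.tensor([0.0, 1.0])

    print(td_error(q_all, next_q_all, act_batch, rew_batch, done_mask, gamma=0.5))

    # Without the unsqueeze, (N,) minus (N, 1) silently broadcasts to (N, N).
    flat_target = rew_batch + 0.5 * (1.0 - done_mask) * next_q_all.max(dim=1)[0]
    current_q = q_all.gather(1, act_batch.unsqueeze(1))
    print((flat_target - current_q).shape)
```

The last two lines show why the mismatch matters: subtracting a (N,) tensor from a (N, 1) tensor does not fail, it broadcasts to (N, N), so the error only surfaces later, for example when a gradient of the wrong shape is passed to backward.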

train time

How long did training take you on games such as Pong and Breakout?

How do I test this ?

How do I test the model? Also, what is the difference between ram.py and train.py?

Takes a long time to converge

I ran the code and found that your DQN implementation takes a long time to converge. I have found that few DQN implementations on GitHub converge at all, and those that do can converge in an afternoon. I am using a single GTX 1080 Ti. I appreciate that your implementation does converge, but it takes a day and a night, and I don't know why.
