hungtuchen / pytorch-dqn

Deep Q-Learning Network in pytorch (not actively maintained)

License: MIT License

Python 100.00%

pytorch deep-q-network deep-reinforcement-learning

pytorch-dqn's Introduction

pytorch-dqn

This project is a PyTorch implementation of Human-level control through deep reinforcement learning. I also plan to implement the following ones:

Credit

This project reuses most of the code in https://github.com/berkeleydeeprlcourse/homework/tree/master/hw3

Requirements

python 3.5
gym (built from source)
pytorch (built from source)

Usage

To train a model:

$ python main.py

# To train the model on RAM state instead of raw images (helpful for testing)

$ python ram.py

The model is defined in dqn_model.py

The algorithm is defined in dqn_learn.py

The running script and hyper-parameters are defined in main.py

pytorch-dqn's Issues

Any updates?

Are there any plans to add recent SOTA algorithms?

typo: line 15 in main.py (TARGER_UPDATE_FREQ = 10000 --> TARGET_UPDATE_FREQ = 10000)

It's trivial though.
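
For context, a constant like this typically controls how often the target network is synchronized with the online network. The sketch below is only an illustration: the value 10000 comes from the issue title, while the helper `should_sync_target` and the counter `num_param_updates` are hypothetical names, not taken from the repository.

```python
# Value taken from the issue title; everything else here is illustrative.
TARGET_UPDATE_FREQ = 10000


def should_sync_target(num_param_updates, freq=TARGET_UPDATE_FREQ):
    """Return True when the target network should be refreshed.

    Hypothetical helper: the counter is assumed to count gradient updates.
    """
    return num_param_updates > 0 and num_param_updates % freq == 0


# Which of the first 30000 updates would trigger a synchronization?
sync_steps = [step for step in range(1, 30001) if should_sync_target(step)]
print(sync_steps)
```

A misspelled constant name does not change behaviour as long as the same spelling is used where the constant is read, which is why the issue calls the typo trivial.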

Does the project support Windows? If so, is there anything I need to be careful about? Thanks for your great work.

ImportError: No module named 'openai_benchmark'

Ubuntu 16.04
Python 3.5
Anaconda
I installed gym following the instructions at https://github.com/openai/gym,
but when I run your code I get:

Traceback (most recent call last):
  File "/home/op/pytorch-dqn-master/main.py", line 3, in <module>
    import openai_benchmark
ImportError: No module named 'openai_benchmark'

Searching online, I found that when someone else runs dir(gym), the result contains 'benchmark_spec' and 'benchmarks',
but neither appears on my machine.
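
One way to narrow down an error like this is to check which modules the current interpreter can actually locate before running the training script. The following is a minimal sketch using only the standard library; the helper name `module_available` is made up for illustration, and whether `gym.benchmarks` exists depends on the installed gym version.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located on the current import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except Exception:
        # For dotted names the parent package is imported first, and that
        # import can fail in arbitrary ways (for example, if it is missing).
        return False


# Names mentioned in the issue above; results depend on the local install.
for name in ("gym", "gym.benchmarks", "openai_benchmark"):
    print("{}: {}".format(name, module_available(name)))
```

If a name is reported as unavailable, the installed package simply does not provide that module, so the import in the script has to be adjusted or a different package version installed.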

pytorch 0.2

I'm running PyTorch 0.2, and the code in dqn_learn.py fails to run.

The error is as follows:

Traceback (most recent call last):
  File "ram.py", line 57, in <module>
    main(env)
  File "ram.py", line 46, in main
    target_update_freq=TARGER_UPDATE_FREQ,
  File "/auto/master05/ssarcandy/ttt/dqn_learn.py", line 213, in dqn_learing
    current_Q_values.backward(d_error.data.unsqueeze(1))
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/variable.py", line 156, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/function.py", line 91, in apply
    return self._forward_cls.backward(self, *args)
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/_functions/tensor.py", line 566, in backward
    return grad_input.scatter_add_(ctx.dim, index, grad_output), None, None
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/variable.py", line 696, in scatter_add_
    return ScatterAdd.apply(self, dim, index, source, True)
  File "/home/master/05/ssarcandy/.local/lib/python2.7/site-packages/torch/autograd/_functions/tensor.py", line 605, in forward
    return input.scatter_add_(ctx.dim, index, source)
RuntimeError: invalid argument 3: Index tensor must have same dimensions as input tensor at /pytorch/torch/lib/THC/generic/THCTensorScatterGather.cu:198

Unmatching size and error

Hi, thanks for sharing your wonderful code.
But I have met some errors when running it.

In lines 197~205 of dqn_learn.py, the sizes of target_Q_values and current_Q_values do not match. I changed the code to next_max_q = next_max_q.unsqueeze(-1) to correct the sizes. I also changed line 203 to use rew_batch[0].
(IMO) After records are stacked in the replay buffer, queuing the action does not work properly. I changed line 158 to action = select_epilson_greedy_action(Q, recent_observations, t), but a different action value gets queued.

I am still working on these but am having trouble. Could you help fix them?
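
The size mismatch described above is consistent with NumPy-style broadcasting, which PyTorch follows: combining a tensor of shape (batch, 1) with one of shape (batch,) silently produces (batch, batch). The pure-Python sketch below only computes result shapes. The shapes assigned to current_q and next_max_q are assumptions made for illustration, not values read from dqn_learn.py.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Result shape of NumPy-style broadcasting between shapes a and b."""
    out = []
    # Align the shapes from the trailing dimension, padding with 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError("shapes {} and {} do not broadcast".format(a, b))
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


batch = 32
current_q = (batch, 1)  # assumed: one Q-value per sample, kept as a column
next_max_q = (batch,)   # assumed: max over actions, which drops a dimension

mismatched = broadcast_shape(current_q, next_max_q)
aligned = broadcast_shape(current_q, next_max_q + (1,))  # like unsqueeze(-1)
print(mismatched)
print(aligned)
```

Under these assumed shapes, adding the trailing dimension keeps the result at one value per sample instead of a batch-by-batch matrix, which matches the unsqueeze(-1) change reported in the issue.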

ImportError: cannot import name wrappers

Hello, I get this error. How can I fix it?

I have installed gym successfully, but I have this error.
-Thank you-

train time

How much time did the training take you on games such as pong and breakout?

How do I test this ?

How do I test the model? Also, what is the difference between ram.py and train.py?

Hi, do you have the results of training with ram and training with image?

Thanks in advance.

I ran the code and found that your DQN implementation takes a long time to converge. I have found that few DQN implementations on GitHub converge at all, and those that do can converge in an afternoon. I am using a GTX 1080 Ti. I appreciate that your implementation does converge, but it takes a day and a night, and I don't know why.