kuz / deepmind-atari-deep-q-learner
The original code from the DeepMind article + my tweaks
Home Page: http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html
Can I train on the GPU and test on the CPU?
Hi.
I've been having some trouble getting this running.
Here's what I've done so far:
I have run the install_dependencies file, but the torch folder is empty:
I then tried running the run_cpu and run_gpu files in the terminal, like so:
I don't understand what "copy the rom file in ROM's" means. If you could clarify this step or identify what's preventing me from running the software, that would be great!
Thanks!
I finally succeeded in running it on a Mac (El Capitan).
network:forward(s2):float():max(2)
Can someone explain this function to me?
The input s2 is the state, but what does the '2' in max(2) mean?
And does network:forward(s2):float():min(2) exist?
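For anyone else wondering: in Torch, the argument to max is the dimension to reduce over, so max(2) takes the maximum along dimension 2, i.e. across actions for each sample in the batch, returning both the values and the (1-based) indices; min(2) exists as well. A rough plain-Python stand-in for the same reduction (illustrative only, not Torch code):

```python
# Plain-Python sketch of what tensor:max(2) does to a 2-D batch of
# Q-values: reduce over dimension 2, giving the per-row (per-sample) maximum.
def max_dim2(batch):
    values = [max(row) for row in batch]
    # Lua/Torch indices are 1-based, hence the +1
    indices = [row.index(max(row)) + 1 for row in batch]
    return values, indices
```

So for a mini-batch of Q-value rows, the result is the best Q-value and the best action per sample.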
I cloned the repository and ran it on my Ubuntu 14.04. I installed all the dependencies, but when I run the run_cpu file, the OS crashes. What could be the problem?
I built from source and ran the program successfully.
But when I download a new ROM file from "http://atariage.com/company_page.html?CompanyID=1&SystemID=2600&SystemFilterID=2600"
and run it, for example the "3d_tic" ROM, it fails.
The error output is below:
./run_cpu 3d_tic
-framework alewrap -game_path /home/zed/projects/google-dqn/DeepMind-Atari-Deep-Q-Learner-show-version/roms/ -name DQN3_0_1_3d_tic_FULL_Y -env 3d_tic -env_params useRGB=true -agent NeuralQLearner -agent_params lr=0.00025,ep=1,ep_end=0.1,ep_endt=replay_memory,discount=0.99,hist_len=4,learn_start=50000,replay_memory=1000000,update_freq=4,n_replay=1,network="convnet_atari3",preproc="net_downsample_2x_full_y",state_dim=7056,minibatch_size=32,rescale_r=1,ncols=1,bufferSize=512,valid_size=500,target_q=10000,clip_delta=1,min_reward=-1,max_reward=1 -steps 50000000 -eval_freq 250000 -eval_steps 125000 -prog_freq 5000 -save_freq 125000 -actrep 4 -gpu -1 -random_starts 30 -pool_frms type="max",size=2 -seed 1 -threads 4
Torch Threads: 4
Using CPU code only. GPU device id: -1
Torch Seed: 1
./run_cpu: line 46: 16364 Segmentation fault ../torch/bin/qlua train_agent.lua $args
=> Torch7 has been installed successfully
Installing nngraph ...
install_dependencies.sh: line 98: /media/envy/data1t/os_prj/github/_deepmind/DeepMind-Atari-Deep-Q-Learner/torch/bin/luarocks: Permission denied
Error. Exiting.
When I run this command:
sudo ./test_gpu breakout DQN3_0_1_breakout_FULL_Y.t7
the output is:
-framework alewrap -game_path /home/omen/Projects/DeepLearning/Test/DeepMind-Atari-Deep-Q-Learner/roms/ -name DQN3_0_1_breakout_FULL_Y -env breakout -env_params useRGB=true -agent NeuralQLearner -agent_params lr=0.00025,ep=1,ep_end=0.1,ep_endt=replay_memory,discount=0.99,hist_len=4,learn_start=50000,replay_memory=1000000,update_freq=4,n_replay=1,network="convnet_atari3",preproc="net_downsample_2x_full_y",state_dim=7056,minibatch_size=32,rescale_r=1,ncols=1,bufferSize=512,valid_size=500,target_q=10000,clip_delta=1,min_reward=-1,max_reward=1 -actrep 4 -gpu 0 -random_starts 30 -pool_frms type="max",size=2 -seed 1 -threads 4 -network DQN3_0_1_breakout_FULL_Y.t7 -gif_file ../gifs/breakout.gif
Torch Threads: 4
Using GPU device id: 0
Torch Seed: 1
CUTorch Seed: 1791095845
Playing: breakout
qlua: ./NeuralQLearner.lua:79: Could not find network file
stack traceback:
[C]: at 0x7f29901f7990
[C]: in function 'error'
./NeuralQLearner.lua:79: in function '__init'
...-Atari-Deep-Q-Learner/torch/share/lua/5.1/torch/init.lua:91: in function <...-Atari-Deep-Q-Learner/torch/share/lua/5.1/torch/init.lua:87>
[C]: at 0x7f2986461710
./initenv.lua:133: in function 'setup'
test_agent.lua:46: in main chunk
I have trained the model and the network file exists in the "dqn" directory. Can somebody tell me what is wrong?
Update:
I think the problem was about the network file. The training process should be completed before using it.
Since it takes a long time of training (30 hours on a GPU) to get a smart model, would you please provide an already-trained model file that I can run without training from the beginning?
I only have a CPU laptop, so I can't afford such a long training run.
Thanks a lot in advance!
if you want it to work with the latest cunn package, you might want to patch this commit:
soumith/deepmind-atari@88fffea
Can someone help me with this error:
../torch/bin/luajit: /home/cwj/torch/install/share/lua/5.1/alewrap/aleffi.lua:45: /home/cwj/RL/atari/torch/lib/libxitari.so: undefined symbol: ale_act
stack traceback:
[C]: in function '__index'
/home/cwj/torch/install/share/lua/5.1/alewrap/aleffi.lua:45: in main chunk
[C]: in function 'dofile'
/home/cwj/torch/install/share/lua/5.1/alewrap/init.lua:21: in main chunk
[C]: in function 'require'
./initenv.lua:115: in function 'setup'
train_agent.lua:52: in main chunk
[C]: at 0x00406260
I am trying to run the Deep Q-Learner, but after about 6 hours of running it doesn't seem to be getting smarter. How long did you run the AI before it became expert at playing the Breakout game?
Hi @kuz ,
I'm using Ubuntu 17.04, and while installing the dependencies, the following error message appears:
`/bin/sh: 1: gdlib-config: not found
/bin/sh: 1: gdlib-config: not found
luagd.c:2171:33: error: ‘LgdImageCreateFromPng’ undeclared here (not in a function)
{ "createFromPng", LgdImageCreateFromPng },
^~~~~~~~~~~~~~~~~~~~~
luagd.c:2172:33: error: ‘LgdImageCreateFromPngPtr’ undeclared here (not in a function)
{ "createFromPngStr", LgdImageCreateFromPngPtr },
^~~~~~~~~~~~~~~~~~~~~~~~
Makefile:104: recipe for target 'gd.lo' failed
make: *** [gd.lo] Error 1
Error: Build error: Failed building.
Error. Exiting.`
I could get rid of the missing gdlib-config by using pkg-config; however, the 'LgdImageCreateFromPng' and 'LgdImageCreateFromPngPtr' errors remain. Do you have any idea what I have missed?
envy@ub1404envy:/os_prj/github/_deepmind/DeepMind-Atari-Deep-Q-Learner$ ./run_gpu roms/breakout.bin
/os_prj/github/_deepmind/DeepMind-Atari-Deep-Q-Learner$ vim install_dependencies.sh
-framework alewrap -game_path /home/envy/os_prj/github/_deepmind/DeepMind-Atari-Deep-Q-Learner/roms/ -name DQN3_0_1_roms/breakout.bin_FULL_Y -env roms/breakout.bin -env_params useRGB=true -agent NeuralQLearner -agent_params lr=0.00025,ep=1,ep_end=0.1,ep_endt=replay_memory,discount=0.99,hist_len=4,learn_start=50000,replay_memory=1000000,update_freq=4,n_replay=1,network="convnet_atari3",preproc="net_downsample_2x_full_y",state_dim=7056,minibatch_size=32,rescale_r=1,ncols=1,bufferSize=512,valid_size=500,target_q=10000,clip_delta=1,min_reward=-1,max_reward=1 -steps 50000000 -eval_freq 250000 -eval_steps 125000 -prog_freq 10000 -save_freq 125000 -actrep 4 -gpu 0 -random_starts 30 -pool_frms type="max",size=2 -seed 1 -threads 4
Torch Threads: 4
Using GPU device id: 0
Torch Seed: 1
CUTorch Seed: 1791095845
qlua: ./initenv.lua:115: module 'alewrap' not found:
no field package.preload['alewrap']
no file '/home/envy/.luarocks/share/lua/5.1/alewrap.lua'
no file '/home/envy/.luarocks/share/lua/5.1/alewrap/init.lua'
no file '/home/envy/torch/install/share/lua/5.1/alewrap.lua'
no file '/home/envy/torch/install/share/lua/5.1/alewrap/init.lua'
no file './alewrap.lua'
no file '/home/envy/torch/install/share/luajit-2.1.0-beta1/alewrap.lua'
no file '/usr/local/share/lua/5.1/alewrap.lua'
no file '/usr/local/share/lua/5.1/alewrap/init.lua'
no file '/home/envy/torch/install/lib/alewrap.so'
no file '/home/envy/.luarocks/lib/lua/5.1/alewrap.so'
no file '/home/envy/torch/install/lib/lua/5.1/alewrap.so'
no file './alewrap.so'
no file '/usr/local/lib/lua/5.1/alewrap.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
[C]: at 0x7f57114a59f0
[C]: in function 'require'
./initenv.lua:115: in function 'setup'
train_agent.lua:52: in main chunk
envy@ub1404envy:
envy@ub1404envy:~/os_prj/github/_deepmind/DeepMind-Atari-Deep-Q-Learner$ which nvcc
/usr/local/cuda-7.5/bin/nvcc
I also tried run_cpu; same issue.
In the DQN loss, the update should only happen for the observed action. Assume a mini-batch of (s, a, r, s2) with m samples in the mini-batch (therefore, s and s2 would be m x n, if n is the number of features, and so on). Then, for sample j in the mini-batch, only a[j] should contribute to the loss, which means that all the output elements of the Q-network corresponding to target[j][:] except target[j][a[j]] should be masked.
Here, the target values (except target[j][a[j]]) are set to zero instead of masking the network's output, which means a target value of zero is used and the Q-network is trained toward a value of zero for all actions other than a[j] when we are at state s[j, :].
Am I missing something?
EDIT:
It is OK (though a bit confusing). What is stored in the targets tensor is actually the TD difference (the error), not the target itself, so the zeros for the other actions contribute no gradient rather than pulling those Q-values toward zero.
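To make the resolution concrete, here is a minimal sketch (plain Python; the function name and 0-based action indices are mine, not the repo's) of the masked-target construction: because the quantity backpropagated is this tensor itself, zero entries mean zero gradient for the unobserved actions, not a target value of zero.

```python
# Hypothetical sketch of the masked-target trick discussed above.
# deltas[j] is the TD error for sample j; actions[j] is the action taken
# (0-based here, unlike Lua's 1-based indexing).
def build_masked_targets(n_actions, actions, deltas):
    targets = []
    for act, delta in zip(actions, deltas):
        row = [0.0] * n_actions  # zero entries => zero gradient for unseen actions
        row[act] = delta         # only a[j] contributes to the update
        targets.append(row)
    return targets
```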
Hi. As far as I know, for DQN to play very well on Pac-Man, it is said that the clipping function should be modified so that the algorithm can tell the difference between a high score and a low score. Does anyone know how to do that properly? I tried rescaling the reward linearly, multiplying by a constant factor so that the result lies between 0 and 1, but that did not work.
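For reference, the stock code clips rewards rather than rescaling them (the min_reward=-1, max_reward=1 options visible in the run commands elsewhere on this page). A plain-Python sketch of the difference between clipping and the linear rescaling tried above (the constants are illustrative, not from the repo):

```python
def clip_reward(r, lo=-1.0, hi=1.0):
    # Stock DQN behaviour (min_reward=-1, max_reward=1): every positive
    # reward looks the same, so a 10-point and a 200-point event are equal.
    return max(lo, min(hi, r))

def scale_reward(r, max_abs_reward=200.0):
    # Linear rescaling keeps relative magnitudes, at the cost of very
    # small gradients for low-value rewards.
    return r / max_abs_reward
```

Clipping keeps gradient magnitudes bounded across games; rescaling preserves the score differences the poster wants, but the scale constant becomes a per-game hyperparameter.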
I have modified the run_gpu script to add a network argument, and I'm sure the network is loaded from the file (I added a print at the loading point), but the result is just like training from scratch.
However, using test_gpu, I'm sure the snapshot does contain a trained network.
Does anyone have any suggestions?
I have spent several hours training a model and now want to continue training it. What should I do? Please give me some tips.
I have studied the source code, but I don't understand how, and in which Lua file, the image files are imported. According to the module architecture for this source code, 4 images are used as input for the CONV1 layer. But where does the source code import the images from? I can successfully run the given source code on GPU/CPU, but I couldn't understand some of the implementation strategy. Kindly help me out. Thank you.
After ./run_cpu, the game starts.
How do I stop training and continue training next time?
How do I find the trained network file?
When I try to run a game on the GPU, I get the following output:
$ ./run_gpu breakout
-framework alewrap -game_path /home/ben/Downloads/DeepMind-Atari-Deep-Q-Learner-master/roms/ -name DQN3_0_1_breakout_FULL_Y -env breakout -env_params useRGB=true -agent NeuralQLearner -agent_params lr=0.00025,ep=1,ep_end=0.1,ep_endt=replay_memory,discount=0.99,hist_len=4,learn_start=50000,replay_memory=1000000,update_freq=4,n_replay=1,network="convnet_atari3",preproc="net_downsample_2x_full_y",state_dim=7056,minibatch_size=32,rescale_r=1,ncols=1,bufferSize=512,valid_size=500,target_q=10000,clip_delta=1,min_reward=-1,max_reward=1 -steps 50000000 -eval_freq 250000 -eval_steps 125000 -prog_freq 10000 -save_freq 125000 -actrep 4 -gpu 0 -random_starts 30 -pool_frms type="max",size=2 -seed 1 -threads 4
Fontconfig error: "/home/ben/.config/font-manager/local.conf", line 2: syntax error
Gtk-Message: Failed to load module "canberra-gtk-module"
Torch Threads: 4
qlua: ./initenv.lua:58: module 'cutorch' not found:
no field package.preload['cutorch']
no file './cutorch.lua'
no file '/home/ben/Downloads/DeepMind-Atari-Deep-Q-Learner-master/torch/share/luajit-2.0.3/cutorch.lua'
no file '/usr/local/share/lua/5.1/cutorch.lua'
no file '/usr/local/share/lua/5.1/cutorch/init.lua'
no file '/home/ben/Downloads/DeepMind-Atari-Deep-Q-Learner-master/torch/share/lua/5.1/cutorch.lua'
no file '/home/ben/Downloads/DeepMind-Atari-Deep-Q-Learner-master/torch/share/lua/5.1/cutorch/init.lua'
no file './cutorch.so'
no file '/usr/local/lib/lua/5.1/cutorch.so'
no file '/home/ben/Downloads/DeepMind-Atari-Deep-Q-Learner-master/torch/lib/lua/5.1/cutorch.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
[C]: at 0x7f0b12f8d7b0
[C]: in function 'require'
./initenv.lua:58: in function 'torchSetup'
./initenv.lua:112: in function 'setup'
train_agent.lua:52: in main chunk
Is there a way to resume training from a snapshot?
Excuse me. I installed it in "Ubuntu 14.04.3 LTS". When I run "./run_cpu breakout", it shows the following error. Would you please help me on this? Thanks a lot in advance!
-framework alewrap -game_path /home/gaoteng/DeepmindAtari/DeepMind-Atari-Deep-Q-Learner-master/roms/ -name DQN3_0_1_breakout_FULL_Y -env breakout -env_params useRGB=true -agent NeuralQLearner -agent_params lr=0.00025,ep=1,ep_end=0.1,ep_endt=replay_memory,discount=0.99,hist_len=4,learn_start=50000,replay_memory=1000000,update_freq=4,n_replay=1,network="convnet_atari3",preproc="net_downsample_2x_full_y",state_dim=7056,minibatch_size=32,rescale_r=1,ncols=1,bufferSize=512,valid_size=500,target_q=10000,clip_delta=1,min_reward=-1,max_reward=1 -steps 50000000 -eval_freq 250000 -eval_steps 125000 -prog_freq 5000 -save_freq 125000 -actrep 4 -gpu -1 -random_starts 30 -pool_frms type="max",size=2 -seed 1 -threads 4
Torch Threads: 4
Using CPU code only. GPU device id: -1
Torch Seed: 1
Playing: breakout
Creating Agent Network from convnet_atari3
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> output]
(1): nn.Reshape(4x84x84)
(2): nn.SpatialConvolution(4 -> 32, 8x8, 4,4, 1,1)
(3): nn.Rectifier
(4): nn.SpatialConvolution(32 -> 64, 4x4, 2,2)
(5): nn.Rectifier
(6): nn.SpatialConvolution(64 -> 64, 3x3)
(7): nn.Rectifier
(8): nn.Reshape(3136)
(9): nn.Linear(3136 -> 512)
(10): nn.Rectifier
(11): nn.Linear(512 -> 4)
}
Convolutional layers flattened output size: 3136
./run_cpu: line 46: 2372 Segmentation fault (core dumped) ../torch/bin/qlua train_agent.lua $args
Is there an install method for Mac?
I have installed Torch, Lua, and LuaRocks on Mac OS X, but I cannot install this.
How can I?
Thank you in advance!
I want to make my own ALE models for DQN reinforcement learning, using my own video file as input.
In the video images, there are class labels in the corner for supervised training.
How can I make my own ALE model?
Thank you in advance.
Hi kuz,
Thanks for the great DQN code.
It's very helpful for studying DQN in depth.
I'm trying to run this code in a Docker environment, but it fails after returning the messages below.
I think this happens because Docker does not support the Qt library.
Is it possible to turn off displaying the process on screen and just save images, such as GIFs?
root@b04d950e403d:~/DeepLearning/DeepMind-Atari-Deep-Q-Learner# ./run_gpu breakout
-framework alewrap -game_path /root/DeepLearning/DeepMind-Atari-Deep-Q-Learner/roms/ -name DQN3_0_1_breakout_FULL_Y -env breakout -env_params useRGB=true -agent NeuralQLearner -agent_params lr=0.00025,ep=1,ep_end=0.1,ep_endt=replay_memory,discount=0.99,hist_len=4,learn_start=50000,replay_memory=1000000,update_freq=4,n_replay=1,network="convnet_atari3",preproc="net_downsample_2x_full_y",state_dim=7056,minibatch_size=32,rescale_r=1,ncols=1,bufferSize=512,valid_size=500,target_q=10000,clip_delta=1,min_reward=-1,max_reward=1 -steps 50000000 -eval_freq 250000 -eval_steps 125000 -prog_freq 10000 -save_freq 125000 -actrep 4 -gpu 0 -random_starts 30 -pool_frms type="max",size=2 -seed 1 -threads 4
Unable to connect X11 server (continuing with -nographics)
Torch Threads: 4
Using GPU device id: 0
Torch Seed: 1
CUTorch Seed: 1791095845
Playing: breakout
Creating Agent Network from convnet_atari3
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> output]
(1): nn.Reshape(4x84x84)
(2): nn.SpatialConvolution(4 -> 32, 8x8, 4,4, 1,1)
(3): nn.Rectifier
(4): nn.SpatialConvolution(32 -> 64, 4x4, 2,2)
(5): nn.Rectifier
(6): nn.SpatialConvolution(64 -> 64, 3x3)
(7): nn.Rectifier
(8): nn.Reshape(3136)
(9): nn.Linear(3136 -> 512)
(10): nn.Rectifier
(11): nn.Linear(512 -> 4)
}
Convolutional layers flattened output size: 3136
Set up Torch using these options:
eval_steps 125000
seed 1
name DQN3_0_1_breakout_FULL_Y
verbose 2
network
pool_frms table: 0x413352f8
saveNetworkParams false
gpu 1
eval_freq 250000
tensorType torch.FloatTensor
env_params table: 0x41335320
steps 50000000
prog_freq 10000
agent_params table: 0x41335230
save_versions 0
framework alewrap
agent NeuralQLearner
threads 4
actrep 4
random_starts 30
game_path /root/DeepLearning/DeepMind-Atari-Deep-Q-Learner/roms/
save_freq 125000
env breakout
Iteration .. 0
qlua: not loading module qtgui (running with -nographics)
qlua: qtwidget window functions will not be usable (running with -nographics)
qtwidget window functions will not be usable (running with -nographics)
qlua: not loading module qtuiloader (running with -nographics)
qlua: /root/torch/install/share/lua/5.1/image/init.lua:1448: attempt to index global 'qtuiloader' (a nil value)
stack traceback:
[C]: in function '__index'
/root/torch/install/share/lua/5.1/image/init.lua:1448: in function 'window'
/root/torch/install/share/lua/5.1/image/init.lua:1399: in function 'display'
train_agent.lua:98: in main chunk
root@b04d950e403d:~/DeepLearning/DeepMind-Atari-Deep-Q-Learner#
After installation, I tried to run the Breakout game in CPU mode:
sudo ./run_cpu breakout, and this error appears:
Updating manifest for /home/ubuntu/DeepMind/torch/lib/luarocks/rocks
luagd 2.0.33r3-1 is now built and installed in /home/ubuntu/DeepMind/torch/ (license: MIT/X11)
Lua-GD installation completed
You can run experiments by executing:
./run_cpu game_name
or
./run_gpu game_name
For this you need to provide the rom files of the respective games (game_name.bin) in the roms/ directory
ubuntu@tegra-ubuntu:~/DeepMind$ sudo ./run_cpu breakout
-framework alewrap -game_path /home/ubuntu/DeepMind/roms/ -name DQN3_0_1_breakout_FULL_Y -env breakout -env_params useRGB=true -agent NeuralQLearner -agent_params lr=0.00025,ep=1,ep_end=0.1,ep_endt=replay_memory,discount=0.99,hist_len=4,learn_start=50000,replay_memory=1000000,update_freq=4,n_replay=1,network="convnet_atari3",preproc="net_downsample_2x_full_y",state_dim=7056,minibatch_size=32,rescale_r=1,ncols=1,bufferSize=512,valid_size=500,target_q=10000,clip_delta=1,min_reward=-1,max_reward=1 -steps 50000000 -eval_freq 250000 -eval_steps 125000 -prog_freq 5000 -save_freq 125000 -actrep 4 -gpu -1 -random_starts 30 -pool_frms type="max",size=2 -seed 1 -threads 4
Torch Threads: 4
Using CPU code only. GPU device id: -1
Torch Seed: 1
Playing: breakout
Creating Agent Network from convnet_atari3
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> output]
(1): nn.Reshape(4x84x84)
(2): nn.SpatialConvolution(4 -> 32, 8x8, 4,4, 1,1)
(3): nn.Rectifier
(4): nn.SpatialConvolution(32 -> 64, 4x4, 2,2)
(5): nn.Rectifier
(6): nn.SpatialConvolution(64 -> 64, 3x3)
(7): nn.Rectifier
(8): nn.Reshape(3136)
(9): nn.Linear(3136 -> 512)
(10): nn.Rectifier
(11): nn.Linear(512 -> 4)
}
Convolutional layers flattened output size: 3136
./run_cpu: line 46: 23755 Segmentation fault ../torch/bin/qlua train_agent.lua $args
Regarding issue #35: you need to change line 127 in install_dependencies.sh.
Change it to:
make GDFEATURES="-DGD_PNG -DGD_GIF -DGD_JPEG -DGD_XPM -DGD_FREETYPE -DGD_FONTCONFIG"
and add:
cp gd.so $PREFIX/lib/lua/5.1/
Then everything will work.
In the Nature paper, DeepMind said they used mean squared error as the loss function, but I didn't find it in this code; there is no criterion definition. Is there a default criterion in Torch7? I'm just confused.
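As far as I can tell, there is indeed no nn criterion: the agent computes the TD error itself and backpropagates it directly, clipping it via the clip_delta=1 option visible in the run commands on this page. Clipping the error this way makes the implicit loss a Huber loss (quadratic near zero, linear in the tails) rather than plain MSE. A hedged plain-Python sketch of the idea (the function name is mine, not from the repo):

```python
def clipped_td_error(q_sa, r, q2_max, terminal, gamma=0.99, clip_delta=1.0):
    # delta = r + gamma * max_a' Q(s2, a') - Q(s, a); no bootstrap term
    # at terminal states.
    target = r if terminal else r + gamma * q2_max
    delta = target - q_sa
    # Clipping the error before backprop bounds the gradient magnitude,
    # which is equivalent to using a Huber loss instead of MSE.
    return max(-clip_delta, min(clip_delta, delta))
```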
I trained the DQN and got a trained model successfully. But when I test the model using test_gpu.lua, I get the following error:
qlua: symbol lookup error: /home/xx/torch/install/bin/../lib/libqlua.so: undefined symbol: _ZN6QMutex12lockInternalEv
@kuz Can you give me some advice? Thank you!
Hi. I've got access to a GPU farm that's running Red Hat. Are there any suggestions for installing this on different distros? I'm just not sure where to start. Thanks!