Comments (24)
I tried without --render. Results are now displayed! However, no matter how long training runs, the reward does not increase ...
from gym-city.
Hi Tsunehiko, thanks for your interest.
Apologies, the README is outdated. Try this:
python3 main.py --experiment test000 --model FullyConv --num-process 24 --map-width 16 --render --overwrite --algo a2c
Please have a look at arguments.py or call python3 main.py --help to see a full list of arguments available.
Thank you for your quick response.
I want to try other algorithms on Micropolis. Is it possible to use gym_micropolis/envs/env.py as a gym environment with other algorithms (e.g., stable_baselines)?
Funny, this repository is actually an adaptation of the following: https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail, which is itself adapted from OpenAI baselines.
I suppose one could use this environment with a different learning algorithm. At least two roadblocks spring to mind:
- The function setMapSize() passes a bunch of micropolis-specific options to the environment, so you'd need to edit env.py to call it in the init() function with predetermined options.
- The action space in Micropolis is very large. It corresponds to a flattened 2D image the size of the map, with one channel per tile-specific action, so you'd need to make sure the model you use is compatible with this. I'm exploring fully convolutional NNs that retain a spatial correspondence between the observation and action spaces (you can find them in model.py); I don't think you'd want a net that bottlenecks too drastically. (Atari has a much smaller action space, for example.)
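As a rough sketch, the numbers logged later in this thread (len of intsToActions: 4864, num tools: 19) are consistent with a 19-channel, 16x16 action volume, since 19 x 16 x 16 = 4864. Assuming a tool-major flattening order (an assumption for illustration; the repo's intsToActions table defines the real mapping), a flat action index unpacks like this:

```python
# Sketch: unpack a flat action index into (tool, y, x) for a 16x16 map
# with 19 tile-tools. The tool-major flattening order is an assumption;
# the repo's intsToActions table defines the actual mapping.
MAP_W = 16
NUM_TOOLS = 19

def int_to_action(a, w=MAP_W):
    tool, cell = divmod(a, w * w)  # which tool channel the index falls in
    y, x = divmod(cell, w)         # which map cell within that channel
    return tool, y, x

print(NUM_TOOLS * MAP_W * MAP_W)  # 4864, matching the logged action count
print(int_to_action(4863))        # (18, 15, 15): last tool, bottom-right cell
```

This is why a fully convolutional net is attractive here: the 4864 logits can simply be the cells of a (19, 16, 16) output volume, rather than one enormous dense layer.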
Indeed it would be nice for this to be incorporated as part of the standard family of gym environments. I am working on it here for now because I'm still experimenting with the environment itself, e.g., playing with different reward functions, designing mini-games, playing with different map sizes.
Which algorithm were you thinking of using in particular (stable_baselines is a collection of algorithms, not one single algorithm)?
Feel free to shoot me an email if you have any more questions, ideas, or want to chat about RL :)
Thank you for your kind explanation.
The reason I mentioned stable_baselines is that I'm new to RL, and it seemed the easiest way to try out algorithms. (I have no particular algorithm in mind.)
In model.py, I get errors that the modules densenet_pytorch and ConvLSTMCell cannot be found. Which modules should I install?
I will email you with any further questions.
Aha, densenet_pytorch is an old dependency, so I got rid of it, and I added ConvLSTMCell.py to the repo. Do a 'git pull' from inside the repo and try again!
Thanks for bringing my attention to these problems.
And indeed, you'd probably be wise to start with RL by playing with stable_baselines and some Atari games, etc. But I selfishly would rather you play with this repo because you're helping me troubleshoot it :)
Thank you for your quick response. I'm glad to be of help.
Can I use the --no-cuda option? I can't use a machine with a GPU right now, so I have to run on CPU only :(
By the way, I'm planning to use this repo at least through July.
Not ideal, but no problem. I just patched up something that was getting in the way of --no-cuda; the above command now works for me with --no-cuda. Do another pull and let me know if you have any luck.
python3 main.py --experiment test000 --model FullyConv --num-process 24 --map-width 16 --render --overwrite --algo a2c --no-cuda
I ran the above command. On the displayed screen, time passes but no actions are taken. How can I see how the agent is playing? Should I use enjoy.py?
Strange, I can't replicate this. What are you getting in the command line?
The command line output is very long, so I don't know which part to share; below is the end of it. The screen at that point looks like the attached image.
PLAYCITY
('PAUSED', False, 'running', True)
PLAYMODE
('PAUSED', False, 'running', True)
len of intsToActions: 4864
num tools: 19
len of intsToActions: 4864
num tools: 19
('WINDOW SIZE', 800, 608)
len of intsToActions: 4864
num tools: 19
('WINDOW SIZE', 800, 608)
{'Road': ['Road', 'Wire', 'Rail', 'Water'], 'Wire': ['Road', 'Wire', 'Rail', 'Water'], 'Rail': ['Rail', 'Wire', 'Road', 'Water'], 'Water': ['Road', 'Water', 'Wire', 'Rail'], 'Net': ['Net', 'Airport'], 'Airport': ['Net', 'Airport']}
PLAYCITY
('PAUSED', False, 'running', True)
PLAYMODE
('PAUSED', False, 'running', True)
len of intsToActions: 4864
num tools: 19
/home/tsunehiko/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png
/home/tsunehiko/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png
This is the most recent command line output? If time is passing on the map, then training must be underway, so you should be getting printouts of avg. reward, number of frames, and the like. Try the same with --log-interval 1 so that this printout occurs as frequently as possible. Depending on your system, training might be very slow. You might also set --num-proc 2 so that only two games run during training (there's a bug with only one at the moment); this will speed up gameplay in those individual games.
I tried both --log-interval 1 and --num-proc 2, but the displayed map still does not change; only time advances. The end of the command line output is shown below.
==== STARTGAME
GENERATECITY
STARTMODE
('PAUSED', True, 'running', False)
('SWITCHPAGE', <micropolisnotebook.MicropolisNotebook object at 0x7fd3f24d3e10 (pyMicropolis+micropolisEngine+micropolisnotebook+MicropolisNotebook at 0x563695516510)>, <micropolisnotebook.MicropolisNotebook object at 0x7fd3f24d3e10 (pyMicropolis+micropolisEngine+micropolisnotebook+MicropolisNotebook at 0x563695516510)>, <micropolisnoticepanel.MicropolisNoticePanel object at 0x7fd3f24d5630 (pyMicropolis+micropolisEngine+micropolisnoticepanel+MicropolisNoticePanel at 0x563695518340)>, 0)
('WINDOW SIZE', 800, 608)
('WINDOW SIZE', 800, 608)
('WINDOW SIZE', 800, 608)
('WINDOW SIZE', 800, 608)
{'Road': ['Road', 'Wire', 'Rail', 'Water'], 'Wire': ['Road', 'Wire', 'Rail', 'Water'], 'Rail': ['Rail', 'Wire', 'Road', 'Water'], 'Water': ['Road', 'Water', 'Wire', 'Rail'], 'Net': ['Net', 'Airport'], 'Airport': ['Net', 'Airport']}
PLAYCITY
('PAUSED', False, 'running', True)
PLAYMODE
('PAUSED', False, 'running', True)
{'Road': ['Road', 'Wire', 'Rail', 'Water'], 'Wire': ['Road', 'Wire', 'Rail', 'Water'], 'Rail': ['Rail', 'Wire', 'Road', 'Water'], 'Water': ['Road', 'Water', 'Wire', 'Rail'], 'Net': ['Net', 'Airport'], 'Airport': ['Net', 'Airport']}
PLAYCITY
('PAUSED', False, 'running', True)
PLAYMODE
('PAUSED', False, 'running', True)
len of intsToActions: 4864
num tools: 19
len of intsToActions: 4864
num tools: 19
BASE NETWORK:
MicropolisBase_FullyConv(
(embed): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1))
(k5): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(k3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(val_shrink): Conv2d(32, 32, kernel_size=(2, 2), stride=(2, 2))
(val): Conv2d(32, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(act): Conv2d(32, 19, kernel_size=(1, 1), stride=(1, 1))
)
/home/tsunehiko/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png
/home/tsunehiko/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png
How fast is time moving? For me, an update only occurs once the in-game year reaches the late 1900s. You could try --map-size 4 and --max-step 16; this should make updates occur by 1950 at the latest.
I tried --map-size 4 and --max-step 16, but time passed extremely fast and the year kept climbing indefinitely past the 1950s, with no update.
I'm at a loss. I can only recommend looking at line 265 in main.py, which should be printing output, and working backward from there with print statements to find out why line 265 is never reached.
Thank you for checking carefully.
I will examine the code and try it myself, and contact you again if I have any questions.
Also, try without --render, and see if you get any printouts.
Also, the error shown in the attached image appeared. How do I fix it?
That's some good news! I suspected that the problem stemmed from the GUI. In particular, there must be a call to gtk.main_iteration(), or something to that effect, hidden somewhere. Such a function runs the GUI indefinitely, waiting for user input, so it would stop our training code dead in its tracks (and simply let the game run very fast). Strange that I don't experience the same issue on my end, and can't find a call to the function in the code. It might be an operating-system-specific issue.
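For what it's worth, the usual non-blocking alternative in PyGTK is to pump only the pending events each training step, i.e. `while gtk.events_pending(): gtk.main_iteration()`, instead of entering a blocking main loop. A toy sketch of why pumping keeps training alive, with a deque standing in for the GTK event queue (the function names mirror PyGTK's, but everything here is a stand-in, not the repo's code):

```python
# Toy model of pumping pending GUI events per training step, mirroring
# the PyGTK idiom `while gtk.events_pending(): gtk.main_iteration()`.
# The deque is a stand-in event queue, not a real GUI.
from collections import deque

pending = deque(["expose", "motion"])  # fake GUI events waiting to be handled

def events_pending():
    return bool(pending)

def main_iteration():
    pending.popleft()  # handle exactly one GUI event, then return

frames = 0
for _ in range(3):             # toy training loop
    while events_pending():    # drain GUI work, then fall through
        main_iteration()
    frames += 1                # training keeps advancing

print(frames)  # 3: the loop never stalls, unlike a blocking main() call
```

A blocking call like gtk.main() never returns control to the training loop, which would produce exactly the observed symptom: the game window runs, but no training printouts appear.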
As for the "Unable to init server" error, I think it might be the result of too many dead python processes hanging around on the CPU (interrupted training is not yet handled gracefully by the code). Try again after pkill python, or simply after restarting your machine.
To see if the bot is doing anything, git pull the repo, then try again with the option --print-map, which prints an array representing the 0th-ranked environment's game map, with different tile types corresponding to different integers. If this array is not changing, then the bot is not building on the map.
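One quick way to use that printed array is to diff two snapshots: if any cell changed, the bot placed something. A minimal sketch with made-up tile integers (the real map is larger and the tile codes are the repo's, not these):

```python
# Sketch: diff two snapshots of the --print-map tile array to see where
# the bot built. Tile integers here are illustrative stand-ins.
before = [[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]]
after  = [[0, 0, 0],
          [0, 3, 0],   # tile 3 placed by the agent at row 1, col 1
          [0, 0, 0]]

changed = [(y, x) for y, row in enumerate(after)
           for x, tile in enumerate(row) if before[y][x] != tile]
print(changed)  # [(1, 1)]
```

An empty list over many steps would mean the bot is taking no building actions, which is the condition the --print-map suggestion is meant to detect.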
The array displayed by --print-map has changed, so it seems learning is possible. How do I watch the trained bot? Should I use enjoy.py?
Also, "Unable to init server" was displayed when I tried a GPU server I have access to, not my own machine. That may be the cause.
Yes, you can try something like python enjoy.py --render --map-size 16 --load-dir a2c_FullyConv_w16/test000 (I wonder if we'll get stuck in the same GUI loop, though!).