smearle / gym-city

An interface with micropolis for city-building agents, packaged as an OpenAI gym environment

License: MIT License

Python 18.17% Makefile 0.89% Tcl 8.44% C++ 26.03% C 36.86% C# 0.75% HTML 2.66% CSS 0.14% Batchfile 0.01% Shell 0.20% Yacc 0.17% sed 0.01% Objective-C 0.21% Roff 0.01% Perl 0.04% Java 3.25% JavaScript 2.09% PHP 0.02% TeX 0.07% Dockerfile 0.02%

gym-city's Introduction

The work in this repo was presented and demoed at the 2019 Experimental A.I. in Games (EXAG) workshop, an AIIDE workshop. You can read the paper here: http://www.exag.org/papers/EXAG_2019_paper_21.pdf

Feel free to join the conversation surrounding this work via my Twitter, and on r/MachineLearning.

gym-city

A Reinforcement Learning interface for variable-scale city-planning-type gym environments, including Micropolis (open-source SimCity 1) and a 1-player version of Conway's Game of Life.

Micropolis (SimCity 1)

The player places urban structures on a 2D map. In certain configurations, these structures invite population and vertical development. Reinforcement Learning agents are rewarded as a function of population or other city-wide metrics.

An agent plays on the 16x16 map on which it was trained. A human player helps satisfy demand.

An agent upscales to a 32x32 map without additional training.

(cont'd from above) A human player incites exploration of city-space via deletion of key features.

A human player works alongside an agent trained to maximize traffic.

1-Player Game of Life

The agent interacts with a randomly-initialized Conway's Game of Life board, placing or deleting one cell per tick, seeking to maximize population.
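The loop described above can be sketched in a few lines of plain Python. This is only an illustration, not the repo's implementation: the names life_tick and agent_step, the (x, y, alive) action encoding, the dead (non-wrapping) borders, and applying the edit before the tick are all assumptions.

```python
from typing import List, Tuple

Board = List[List[int]]  # 1 = live cell, 0 = dead cell


def life_tick(board: Board) -> Board:
    """Advance a Game of Life board by one generation (cells off the edge count as dead)."""
    h, w = len(board), len(board[0])
    nxt = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Count the live cells among the (up to) eight neighbours.
            n = sum(
                board[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if (dy or dx) and 0 <= y + dy < h and 0 <= x + dx < w
            )
            # Birth on exactly 3 neighbours, survival on 2 or 3.
            nxt[y][x] = 1 if n == 3 or (board[y][x] and n == 2) else 0
    return nxt


def agent_step(board: Board, action: Tuple[int, int, int]) -> Tuple[Board, int]:
    """Apply one place/delete action (hypothetical encoding), tick, return (board, population)."""
    x, y, alive = action
    edited = [row[:] for row in board]  # leave the caller's board untouched
    edited[y][x] = alive
    nxt = life_tick(edited)
    return nxt, sum(map(sum, nxt))
```

Here the population after the tick would serve as the reward signal; how the actual environment scales or differences that value is not specified in this README.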

An agent plays on the 16x16 map on which it was trained.

An agent upscales to a 32x32 map without additional training. A human player sabotages the agent by exploiting oscillators.

Power Puzzle

A SimCity map spawns with one power plant, and several residential zones, all randomly placed, and the bot is restricted to building power lines.
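One way to reason about a Power Puzzle layout is to count how many zones are electrically connected to the plant. The sketch below is a toy model, not Micropolis's power scan: it assumes single-tile zones (real zones cover several tiles), 4-connectivity, that zones themselves conduct power, and a hypothetical character encoding ('P' plant, 'R' residential, 'W' power line, '.' empty).

```python
from collections import deque
from typing import List


def powered_zones(grid: List[str]) -> int:
    """Count 'R' tiles reachable from the 'P' tile through conductive tiles."""
    h, w = len(grid), len(grid[0])
    # Locate the single power plant.
    start = next((y, x) for y in range(h) for x in range(w) if grid[y][x] == "P")
    seen = {start}
    queue = deque([start])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            inside = 0 <= ny < h and 0 <= nx < w
            # Power spreads through plants, wires, and (by assumption) zones.
            if inside and (ny, nx) not in seen and grid[ny][nx] in "PWR":
                seen.add((ny, nx))
                queue.append((ny, nx))
    return sum(grid[y][x] == "R" for y, x in seen)
```

For example, powered_zones(["P.R", "W..", "WWR"]) counts only the bottom-right zone; adding a wire at the middle of the right column connects the second one.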

An agent upscales to a 32x32 map without additional training, with mixed results.

Human player gives pointers to the agent.

Installation

Ubuntu

Make sure python >= 3.6 is installed.

For Micropolis, we need the python3 header files, gtk and cairo, and swig:

sudo apt-get install python3-dev libcairo2-dev python3-cairo-dev libgirepository1.0-dev swig
pip install gobject PyGObject

Clone this repository, then build Micropolis:

cd gym-city
make install

We also need pytorch, since we'll be using it to build and train our agents, and tensorflow (since baselines depends on it).

To install the python module:

pip3 install -e .

We might need some additional python packages:

pip install cffi mpi4py gym baselines graphviz torchsummary imutils visdom scikit-learn matplotlib

Basic Use

From this directory, run the following in a Python interpreter:

from gym_micropolis.envs.corecontrol import MicropolisControl
m = MicropolisControl(MAP_W=50, MAP_H=50)
m.layGrid(4, 4)

Training w/ RL

To use micropolis as a gym environment, install gym.

To train an agent using A2C:

python3 main.py --experiment test_0 --algo a2c --model FullyConv --num-process 24 --map-width 16 --render

To visualize reward, in a separate terminal run: python -m visdom.server

To run inference using the agent we have just trained:

python3 enjoy.py --load-dir <directory/containing/MicropolisEnv-v0.tar> --map-width 16

Generally, agents quickly discover the power-plant + residential zone pairing that creates initial populations, then spend a long time at a local optimum consisting of a gameboard tiled by residential zones surrounding a single power plant. On more successful runs, the agent makes the leap toward a smattering of lone road tiles placed next to zones, at least one per zone, to maximize population density.

I'm interested in emergent transport networks, though I'm not sure the simulation is complex enough for population-optimizing builds to require large-scale transport networks.

We can, however, reward traffic density just enough for continuous/traffic-producing roads to be drawn between zones, but not so much that they billow out into swaths of traffic-exploiting asphalt badlands.
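As a rough illustration of that trade-off, the reward can be written as population plus a small weighted traffic term. The function name, the linear form, and the default weight below are assumptions for the sake of the example, not values taken from this repo.

```python
def shaped_reward(population: float, traffic: float, traffic_weight: float = 0.1) -> float:
    """Population reward with a small bonus for traffic (hypothetical weighting).

    A weight near zero ignores traffic entirely; a large weight makes traffic
    dominate, which is the regime where road-only layouts become attractive.
    """
    return population + traffic_weight * traffic
```

Tuning traffic_weight is then the knob that decides whether roads appear as connections between zones or take over the map.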

Interacting w/ the Bot

During training and inference, a Micropolis GUI is rendered by default. During training, it shows the environment of rank 0 when multiple games are run in parallel. The GUI is controlled by the learning algorithm and can therefore be laggy, but it is fully interactive: a player may build or delete zones during training and inference.

Any action the player successfully performs on the subsection of the larger game map that corresponds to the bot's play area is registered by the bot as an action of its own, as if sampled from its own action distribution, so that the player can influence the agent's training data in real time.
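The substitution can be pictured as follows. This is a hedged sketch rather than the repo's code: the (x, y, tool) action tuple, the recorded_action helper, and a play area anchored at the map origin are all hypothetical.

```python
from typing import Optional, Tuple

Action = Tuple[int, int, str]  # (x, y, tool name); a hypothetical encoding


def recorded_action(
    agent_action: Action,
    player_action: Optional[Action],
    play_width: int,
    play_height: int,
) -> Action:
    """Pick the action to store in the agent's training data for this step."""
    if player_action is not None:
        x, y, _tool = player_action
        # Only player edits inside the bot's play area replace the bot's action.
        if 0 <= x < play_width and 0 <= y < play_height:
            return player_action
    return agent_action
```

With a 16x16 play area, a player build at (3, 4) would be stored in place of the agent's sampled action, while a build at (20, 4) would leave the agent's own action in the record.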

gym-city's Issues

No module named 'gym_pcgrl'

I am on the pcgrl_mix branch and getting this "no module" error. I've tried renaming gym_city to gym_pcgrl in setup.py and rerunning make install and pip3 install -e . but I still get the error. Any idea what else needs to change?

which options

Hello. I am not very familiar with reinforcement learning.

python3 main.py --log-dir trained_models/acktr --algo acktr --model squeeze --num-process 24 --map-width 27 --render

With the command above, main.py does not work. Please tell me which options are needed to run 'main.py'.

error: ‘DOMAIN’ undeclared (first use in this function)

Hi Sam,

I am trying to build micropolis and play with the game. I first tried https://github.com/SimHacker/micropolis but can't install pycairo because Python 2.7 has been sunset. When running make in gym_city/envs/micropolis, I get the following errors. Any ideas?

gcc -O3 -DIS_LINUX  -I../../tcl  -DTCL_HAVE_SETLINEBUF -DTCL_32_BIT_RANDOM -DTCL_POSIX_SIG -DTCL_TM_GMTOFF   -c -o tclxfmat.o tclxfmat.c
tclxfmat.c: In function ‘ReturnFPMathError’:
tclxfmat.c:125:13: error: ‘DOMAIN’ undeclared (first use in this function)
  125 |        case DOMAIN:
      |             ^~~~~~
tclxfmat.c:125:13: note: each undeclared identifier is reported only once for each function it appears in
tclxfmat.c:128:13: error: ‘SING’ undeclared (first use in this function)
  128 |        case SING:
      |             ^~~~
tclxfmat.c:131:13: error: ‘OVERFLOW’ undeclared (first use in this function); did you mean ‘EOVERFLOW’?
  131 |        case OVERFLOW:
      |             ^~~~~~~~
      |             EOVERFLOW
tclxfmat.c:134:13: error: ‘UNDERFLOW’ undeclared (first use in this function); did you mean ‘EOVERFLOW’?
  134 |        case UNDERFLOW:
      |             ^~~~~~~~~
      |             EOVERFLOW
tclxfmat.c:137:13: error: ‘TLOSS’ undeclared (first use in this function)
  137 |        case TLOSS:
      |             ^~~~~
tclxfmat.c:138:13: error: ‘PLOSS’ undeclared (first use in this function)
  138 |        case PLOSS:
      |             ^~~~~

A3Clstm undefined

Great work on this project so far, looks really promising.

In trying to replicate model training, I have noticed the following error:

Traceback (most recent call last):
  File "main.py", line 179, in <module>
    shared_model = A3Clstm(env.observation_space.shape[0], env.action_space)
NameError: name 'A3Clstm' is not defined

I am assuming that this class is imported from model.py, as implemented by dgriff777 in https://github.com/dgriff777/rl_a3c_pytorch/blob/master/model.py

However by doing so I've got another error:

  File "***/gym-micropolis/a3c/train.py", line 38, in train
    player = Agent(None, env, args, None)
UnboundLocalError: local variable 'env' referenced before assignment

Maybe I'm just making a mistake somewhere. If you manage to replicate these errors, could you let me know a potential workaround?

Thanks

Makefile still points to gym_micropolis directory

This seems like a one-line fix to change to gym_city

PPO RuntimeError: The size of tensor a (5) must match the size of tensor b (4) at non-singleton dimension 0

Hi Sam,

I have A2C and ACKTR running fine, but when running PPO I get the following error. Any ideas?

Traceback (most recent call last):
  File "train.py", line 569, in <module>
    main()
  File "train.py", line 27, in main
    trainer.main()
  File "train.py", line 317, in main
    self.train()
  File "train.py", line 440, in train
    value_loss, action_loss, dist_entropy = agent.update(rollouts)
  File "/home/lecky/Workspaces/gym-city/algo/ppo.py", line 35, in update
    advantages = rollouts.returns[:-1] - rollouts.value_preds[:-1]
RuntimeError: The size of tensor a (5) must match the size of tensor b (4) at non-singleton dimension 0

ValueError: You are trying to load a weight file containing 5 layers into a model with 7 layers.

Hi,

When I try to run learn.py, in either test or train mode, I get the above error. Can you please help me with it? My Keras version is 2.2.4. Below is the exact error I am getting.

Thanks
Prince

=================================================================
Total params: 262,057
Trainable params: 262,057
Non-trainable params: 0

None
Traceback (most recent call last):
  File "learn.py", line 179, in <module>
    dqn.load_weights(args.weights)
  File "/usr/lib/python2.7/site-packages/rl/agents/dqn.py", line 209, in load_weights
    self.model.load_weights(filepath)
  File "/usr/lib64/python2.7/site-packages/keras/models.py", line 733, in load_weights
    topology.load_weights_from_hdf5_group(f, layers)
  File "/usr/lib64/python2.7/site-packages/keras/engine/topology.py", line 3115, in load_weights_from_hdf5_group
    str(len(filtered_layers)) + ' layers.')
ValueError: You are trying to load a weight file containing 5 layers into a model with 7 layers.

ConnectionResetError: [Errno 104] Connection reset by peer

I am running the latest code and getting the following error. It looks like it's trying to connect to a server, but Visdom is up and running.

Traceback (most recent call last):
  File "train.py", line 574, in <module>
    main()
  File "train.py", line 26, in main
    trainer = Trainer()
  File "train.py", line 113, in __init__
    args=args)
  File "/usr/src/app/envs.py", line 296, in make_vec_envs
    envs = SubprocVecEnv(envs)
  File "/usr/src/app/subproc_vec_env.py", line 83, in __init__
    observation_space, action_space = self.remotes[0].recv()
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer

I also notice multiple lines of the following message, which also seems to be an attempt to connect to a server.

Unable to init server: Could not connect: Connection refused

I also notice multiple lines of the following message:

MicropoligGenericEngine initGamePython: This should be called at the end of the concrete subclass's __init__ method.

Any idea what went wrong?

Micropolis code uses a python2 base while gym_micropolis uses python3, causing some conflicts

I tried the above code on Linux CentOS 7 (64-bit). The gi package (which I replaced with pgi) is not compatible with python3, and I also get a kernel error.