
smac's People

Contributors

benblack769, douglasrizzo, eden1114, goingmyway, gyunt, j3soon, jjshoots, mansicer, mmcaulif, richardliaw, rodrigodelazcano, samvelyan, tabzraz, thomasbbrunner, wwxfromtju, zymrael, zzq-bot


smac's Issues

RuntimeError: SC2_x64'

RuntimeError: Trying to run '/home/tsou/StarCraftII/Versions/Base55958/SC2_x64', but it isn't executable.

How to save/reload the game env's seed?

To Whom It May Concern,

I am working on implementing saving ckpt for my experiments, but I got a problem with getting the random state of the game env and re-setting it when I resume my experiments.

I noticed that the game env is created in the following way:

create = sc_pb.RequestCreateGame(local_map=sc_pb.LocalMap(), realtime=False, random_seed=self._seed)

where the seed is passed in.

I was wondering:

  1. how to get the random state of the created game when saving the ckpt?
  2. how to set the random state for the game env using saved random state type value rather than a seed number?

Thank you very much!
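A minimal sketch of the only seeding hook SMAC exposes, as far as I can tell: the seed passed to the constructor is forwarded into RequestCreateGame, so it can be checkpointed and reused, but the engine's internal random state cannot be dumped mid-run through SMAC.

from smac.env import StarCraft2Env

seed = 42
env = StarCraft2Env(map_name="3m", seed=seed)  # forwarded to RequestCreateGame

ckpt = {"env_seed": seed}  # store alongside the model checkpoint
# ... on resume, recreate the env with the saved seed ...
env = StarCraft2Env(map_name="3m", seed=ckpt["env_seed"])

Note this only reproduces the episode sequence from a fresh launch; it does not restore the RNG state reached partway through training.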

Is it possible to configure any number of agents?

Is it possible to configure any number of agents or enemies?

Refer to #14

In smac_maps.py:

"3m": {
    "n_agents": 3,
    "n_enemies": 3,
    "limit": 60,
    "a_race": "T",
    "b_race": "T",
    "unit_type_bits": 0,
    "map_type": "marines",
}

Say I want 5 agents, so I change n_agents to 5. However, it seems that directly changing this value makes the RL algorithm throw an Exception.

So is the number of agents or enemies bound to the specific map file? Does the smac_maps.py file just repeat the dict entry for other numbers of agents, such as 8m and 25m?
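For illustration, a hedged sketch of what such an entry could look like; the key point is that n_agents/n_enemies must match the units actually placed in the map file, so a new .SC2Map (the name 5m_custom below is made up) has to be created in the editor as well:

"5m_custom": {  # hypothetical map; a matching 5m_custom.SC2Map with 5 Marines per side is required
    "n_agents": 5,
    "n_enemies": 5,
    "limit": 60,
    "a_race": "T",
    "b_race": "T",
    "unit_type_bits": 0,
    "map_type": "marines",
},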

RLlib examples not working

I downloaded smac and ran the random_agents example; everything seems to work correctly. However, when I try to use the RLlib examples I get an error:

Traceback (most recent call last):
File "run_qmix.py", line 18, in <module>
   from smac.examples.rllib.env import RLlibStarCraft2Env
 File "/home/dkoutras/.conda/envs/GEP/lib/python3.8/site-packages/smac/examples/rllib/__init__.py", line 2, in <module>
   from smac.examples.rllib.model import MaskedActionsModel
 File "/home/dkoutras/.conda/envs/GEP/lib/python3.8/site-packages/smac/examples/rllib/model.py", line 7, in <module>
   from ray.rllib.models import Model
ImportError: cannot import name 'Model' from 'ray.rllib.models' (/home/dkoutras/.conda/envs/GEP/lib/python3.8/site-packages/ray/rllib/models/__init__.py)

It seems there is some kind of incompatibility with the latest versions of RLlib, which have moved to ModelV2.

Any ideas?

SMAC state space

Thanks for this resource! Is there any way to access the enemies' last actions as part of the state space? I'm considering a game-theoretic approach to training an agent, but would need this ground truth of player actions for both allies and enemies.

Also, I was able to decipher the current state space only by digging through your code to see how the array gets populated. It would be nice to have more thorough documentation about the SMAC environment's state, action, observation, and reward spaces.
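Until richer documentation exists, the shapes of the flat vectors can at least be inspected programmatically; a small sketch using standard SMAC calls:

from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="3m")
env_info = env.get_env_info()
print(env_info["state_shape"], env_info["obs_shape"], env_info["n_actions"])

env.reset()
state = env.get_state()  # flat numpy vector; its layout is only documented in starcraft2.py
obs = env.get_obs()      # list with one flat vector per agent
env.close()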

The meaning of move_feats, enemy_feats, ally_feats, own_feats in starcraft2.py

Hi, thanks for this repo.
I have been reading the source code of smac these past few days and I have a question. What is the meaning of these four variables, move_feats, enemy_feats, ally_feats and own_feats, in the file smac/smac/env/starcraft2/starcraft2.py? And also, what is their function?
(I could not find any comments about them.)
Thanks a lot!
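For what it's worth, a rough, self-contained sketch of how these blocks end up in an agent's observation (shapes below are placeholders for a 3m-style map; paraphrased from get_obs_agent() in starcraft2.py, not a verbatim copy):

import numpy as np

move_feats = np.zeros(4)        # availability of the four movement actions
enemy_feats = np.zeros((3, 5))  # per enemy: attackable, distance, relative x/y, health
ally_feats = np.zeros((2, 5))   # per visible ally: visible, distance, relative x/y, health
own_feats = np.zeros(1)         # the agent's own health

# The agent's observation is the concatenation of the four flattened blocks.
agent_obs = np.concatenate(
    (move_feats.flatten(), enemy_feats.flatten(), ally_feats.flatten(), own_feats.flatten())
)
print(agent_obs.shape)  # (30,), which matches obs_shape for the 3m map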

Redis cannot be initialized when running the qmix sample

File "G:\anaconda3\envs\SMAC\lib\site-packages\ray\services.py", line 582, in wait_for_redis_to_start
raise RuntimeError("Unable to connect to Redis. If the Redis instance "
RuntimeError: Unable to connect to Redis. If the Redis instance is on a different machine, check that your firewall is configured properly.

Process finished with exit code 1

EGL display not working, though EGL library is successfully loaded

Sorry, I think I'm having a similar problem to issue #3 on Ubuntu 18.04 as well. "python -m pysc2.bin.agent --map Simple64" works fine with a GUI, but "python -m smac.examples.random_agents" generates a similar error message:

python -m smac.examples.random_agents

Version: B75689 (SC2.4.10)
Build: Aug 12 2019 17:16:57
Command Line: '"/home/user1/StarCraftII/Versions/Base75689/SC2_x64" -listen 127.0.0.1 -port 19757 -dataDir /home/user1/StarCraftII/ -tempDir /tmp/sc-pkx0j8ie/ -eglpath libEGL.so'
Starting up...
Startup Phase 1 complete
Startup Phase 2 complete
Attempting to initialize EGL from file libEGL.so ...
Successfully loaded EGL library!
Failed to create and initialize a valid EGL display! Devices tried: 0

CreateInitializedEGLDisplay Failed.
Failed to initialize GL framework
Creating stub renderer...
Listening on: 127.0.0.1:19757
Startup Phase 3 complete. Ready for commands.
ConnectHandler: Request from 127.0.0.1:57120 accepted
ReadyHandler: 127.0.0.1:57120 ready
Requesting to join a single player game
Configuring interface options
Configure: raw interface enabled
Configure: feature layer interface disabled
Configure: score interface disabled
Configure: render interface disabled
Launching next game.
Next launch phase started: 2
Next launch phase started: 3
Next launch phase started: 4
Next launch phase started: 5
Next launch phase started: 6
Next launch phase started: 7
Next launch phase started: 8
Game has started.
Successfully loaded stable ids: /home/user1/StarCraftII/stableid.json
Sending ResponseJoinGame
Total reward in episode 0 = 2.0625
Total reward in episode 1 = 0.9375
Total reward in episode 2 = 1.5
Total reward in episode 3 = 2.25
Total reward in episode 4 = 1.125
Total reward in episode 5 = 1.875
Total reward in episode 6 = 2.4375
Total reward in episode 7 = 1.5
Total reward in episode 8 = 2.25
Total reward in episode 9 = 1.875
RequestQuit command received.
Closing Application...
DataHandler: unable to parse websocket frame.
CloseHandler: 127.0.0.1:57120 disconnected
ResponseThread: No connection, dropping the response.

Thanks again for any possible help in this issue!

showing gui

Thanks for putting together this repo.

I am trying to figure out how to show a game window to see what is happening. When I run python -m smac.examples.random_agents or the sample program from the README, nothing seems to show up. Can you help me figure out what might be going wrong?

I am using StarCraft II v4.7.1 and am on commit 7678571 of this repo. Here is the output when I run python -m smac.examples.random_agents. When I run python -m pysc2.bin.agent --map Simple64 from pysc2, a window pops up fine (the 'render interface disabled' line also shows up there even when I can see a GUI).

Also, I noticed that nothing is showing up in ~/StarCraftII/Replays. Where does a replay show up? Thanks.

Version: B70326 (SC2.2018Season4)
Build: Nov 27 2018 03:26:30
Command Line: '"/home/esquires3/StarCraftII/Versions/Base70154/SC2_x64" -listen 127.0.0.1 -port 20293 -dataDir /home/esquires3/StarCraftII/ -tempDir /tmp/sc-5lbi516w/ -displayMode 0 -windowwidt
h 1920 -windowheight 1200 -windowx 50 -windowy 50 -eglpath libEGL.so'
Starting up...
Startup Phase 1 complete
Startup Phase 2 complete
Attempting to initialize EGL from file libEGL.so ...
Failed to find EGL functions in library file!
Creating stub renderer...
Listening on: 127.0.0.1:20293
Startup Phase 3 complete. Ready for commands.
Requesting to join a single player game
Configuring interface options
Configure: raw interface enabled
Configure: feature layer interface disabled
Configure: score interface disabled
Configure: render interface disabled
Entering load game phase.
Launching next game.
Next launch phase started: 2
Next launch phase started: 3
Next launch phase started: 4
Next launch phase started: 5
Next launch phase started: 6
Next launch phase started: 7
Next launch phase started: 8
Game has started.
Sending ResponseJoinGame
Total reward in episode 0 = 0.375
Total reward in episode 1 = 0.75
Total reward in episode 2 = 0.5625
Total reward in episode 3 = 0.9375
Total reward in episode 4 = 0.75
Total reward in episode 5 = 0.1875
Total reward in episode 6 = 0.9375
Total reward in episode 7 = 0.5625
Total reward in episode 8 = 0.5625
Total reward in episode 9 = 0.5625
RequestQuit command received.
Closing Application...
unable to parse websocket frame.
Terminate action already called.
Entering core terminate.
Core shutdown finished.
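For reference, a hedged note on the empty Replays directory: SMAC only writes a replay when save_replay() is called explicitly, which the random_agents example does not do. A minimal sketch:

from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="3m")
env.reset()
# ... run some episodes ...
env.save_replay()  # writes a .SC2Replay into the game's Replays directory
env.close()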


Can I change the speed of render?

I want to check that my agent is working well by watching the game, but the game speed is so fast that I don't know what's happening.

I know that I can change the in-game speed with the '+' and '-' keys, but even when I slow the in-game speed down, the game is still too fast to watch.

Is there a command or setting to slow down the game speed?

Simplify dense reward calculation if reward_only_positive is true

In this part of the dense reward calculation method, the damage and deaths of ally units are accumulated into the variables delta_ally and delta_deaths and used to compose the reward later. Notice how delta_deaths is only changed if self.reward_only_positive is false:

for al_id, al_unit in self.agents.items():
    if not self.death_tracker_ally[al_id]:
        # did not die so far
        prev_health = (
            self.previous_ally_units[al_id].health
            + self.previous_ally_units[al_id].shield
        )
        if al_unit.health == 0:
            # just died
            self.death_tracker_ally[al_id] = 1
            if not self.reward_only_positive:
                delta_deaths -= self.reward_death_value * neg_scale
            delta_ally += prev_health * neg_scale
        else:
            # still alive
            delta_ally += neg_scale * (
                prev_health - al_unit.health - al_unit.shield
            )

When the reward is calculated using the previous accumulated values, delta_ally is only used if self.reward_only_positive is false. The version of delta_deaths that is altered in the ally loop above is also only used if self.reward_only_positive is false.

if self.reward_only_positive:
    reward = abs(delta_enemy + delta_deaths)  # shield regeneration
else:
    reward = delta_enemy + delta_deaths - delta_ally

This makes me conclude that we only need to process ally units in this method if self.reward_only_positive is false; otherwise we can ignore the first loop. I don't know how much this would affect performance (this is a method that runs on every game step, after all), but I came up with a simplified version (sketched below). I'd just like others to validate whether what I said is true.
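A possible (untested) sketch of the simplification described above, guarding the whole ally loop; note that self.death_tracker_ally would then only be updated when reward_only_positive is false, which is consistent as long as the tracker is only read inside this method:

if not self.reward_only_positive:
    for al_id, al_unit in self.agents.items():
        if not self.death_tracker_ally[al_id]:
            # did not die so far
            prev_health = (
                self.previous_ally_units[al_id].health
                + self.previous_ally_units[al_id].shield
            )
            if al_unit.health == 0:
                # just died
                self.death_tracker_ally[al_id] = 1
                delta_deaths -= self.reward_death_value * neg_scale
                delta_ally += prev_health * neg_scale
            else:
                # still alive
                delta_ally += neg_scale * (
                    prev_health - al_unit.health - al_unit.shield
                )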

More maps?

Hi, I want to evaluate on the 1c_3s_5z map, which is used as a benchmark in the QMIX paper, but I cannot find it under this directory: smac/smac/env/starcraft2/maps/SMAC_Maps/
Can you upload this map? Or are there any equivalent maps I can use? Thanks!

Description of the level of built-in AI

As described in the documentation, the built-in AI is set to level 7 ("very difficult"). I assume the level-7 AI is more intelligent than the lower levels, but I don't know the concrete settings of this level. Maybe level-7 enemies can share their observations while lower-level enemies can't. Could someone describe the specific settings of every level?
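For context, a minimal sketch of where the level is set: the difficulty string is passed straight through to the built-in bot via the SC2 API, and SMAC itself does not document what each level does internally (values above "7" select the cheating bot difficulties):

from smac.env import StarCraft2Env

# "7" (VeryHard) is the default; "8", "9" and "A" map to the SC2 API's cheating levels.
env = StarCraft2Env(map_name="3m", difficulty="7")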

A question about effective shooting

Hello. I am trying to train agents on the 2m_vs_1z scenario, but I found that even when the Zealot is within the shooting range of our agents and is being attacked, its health stays the same as before. I think the Zealot's health should decrease, but it doesn't, so I wonder why the attack isn't as effective as I imagined. Maybe there is an effective attack ratio, or the Zealot can dodge. Could you please explain this? Thanks in advance.

Play the replay failed: Could not open initData for the replay

I am using https://github.com/oxwhirl/pymarl to run my experiments and save some replay files. After getting the replay files, when using

python -m pysc2.bin.play --render --rgb_minimap_size 0 --replay 3m_2020-07-24-05-34-35.SC2Replay

to watch the replay, it failed and returned the following messages.

I0724 18:53:20.919057 173520 remote_controller.py:163] Connecting to: ws://127.0.0.1:23181/sc2api, attempt: 19, running: True
I0724 18:53:23.921703 173520 remote_controller.py:163] Connecting to: ws://127.0.0.1:23181/sc2api, attempt: 20, running: True
I0724 18:53:26.929005 173520 remote_controller.py:163] Connecting to: ws://127.0.0.1:23181/sc2api, attempt: 21, running: True
I0724 18:53:32.214189 173520 sc_process.py:201] Shutdown gracefully.
I0724 18:53:32.214189 173520 sc_process.py:182] Shutdown with return code: 0
Traceback (most recent call last):
  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\pysc2\bin\play.py", line 218, in <module>
    app.run(main)
  File "C:\Users\me\AppData\Roaming\Python\Python36\site-packages\absl\app.py", line 300, in run
    _run_main(main, args)
  File "C:\Users\me\AppData\Roaming\Python\Python36\site-packages\absl\app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\pysc2\bin\play.py", line 156, in main
    info = controller.replay_info(replay_data)
  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\pysc2\lib\remote_controller.py", line 71, in _check_error
    return check_error(func(*args, **kwargs), error_enum)
  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\pysc2\lib\remote_controller.py", line 62, in check_error
    raise RequestError("%s.%s: '%s'" % (enum_name, error_name, details), res)
pysc2.lib.remote_controller.RequestError: SC2APIProtocol.ResponseReplayInfo.Error.ParsingError: 'Could not open initData for the replay: C:\Users\me\AppData\Local\Temp\StarCraft II\TempReplayInfo.SC2Replay'

The versions are SMAC==1.0 and PySC2==3.0, and the OS is Ubuntu. Why does playing the replay generated by the training process fail? Is there something wrong with the Linux version of PySC2?

Model Size

Dear authors,

In the paper, it is not clear what the FCN model size is in the first set of layers before the GRU. Would it be possible to clarify this? Or release the model here on github?

Disabling the GUI

Hi,

Thanks for creating this repo! It's really lovely. I'm wondering if there's a param anywhere to disable the GUI display? It'd be nice to run it in the background, or on AWS, without the GUI.

Linux Sc2 process uses GPU memory

Hello,

First, I'd like to thank you for these repos (along with pymarl)!

My problem is similar to the one here: https://github.com/deepmind/pysc2/issues/235

So, when using smac, each Linux SC2 process consumes GPU memory, but this is not the case when using pysc2. Did I miss something?

When running python -m smac.examples.random_agents, I got this output:

(screenshot: smac_output)

And here is the GPU usage:

(screenshot: gpu_smac)

But when I run python -m pysc2.bin.agent --map Simple64, there is no GPU memory consumption:

(screenshots: pysc2_output, gpu_pysc2)

Send state features as structured data

Hi. I have been working with SMAC for the past week or so. I noticed it always returns states and observations as a single vector. I understand that, for the purposes of RL, this makes sense, as an agent only needs to tell states apart in order to build a Q table. Also, since SMAC is supposed to be used as a benchmark, it is important that every algorithm receive the same information.
However, for my research, I find myself in need of state variables in a more structured way, so I can decide how to use them later. For example, I would like to receive features for each agent and enemy separately in a dictionary and also know what each feature is. This would allow me (and other SMAC users) to build their own state representations.
To that end, I created a couple of methods inside SMAC.

  • One returns states as dictionaries (ally and enemy features in matrices, lists describing what each column of each matrix stores and lists containing unit types). You can see it in commit douglasrizzo/smac@235bea1.
  • Another method returns a visibility matrix, i.e. which agents see which units. This is on commit douglasrizzo/smac@e979e33.

I developed both methods in separate branches, so I could easily create separate pull requests. But first I'd like to know if this would be a contribution you'd be interested in.

Adding own map to SMAC_maps

Hey there,

Thanks so much for open-sourcing this project as a cooperative multi-agent research environment for StarCraft II 👍!

I am currently adding a new map to the SMAC environment. For that, I've already:

  • Added the .SC2Map file to the smac/env/starcraft2/maps/SMAC_Maps and Applications/StarCraftII/Maps/SMAC_Maps folders

  • Added a dictionary entry for the map to map_param_registry in the smac/env/starcraft2/maps/smac_maps.py file, following the structure of the other maps:

    "HallucinIce": {
        "n_agents": 4,
        "n_enemies": 6,
        "limit": 180,
        "a_race": "P",
        "b_race": "T",
        "unit_type_bits": 0,
        "map_type": "sentry",
    },

  • Changed the 7th line of smac/examples/random_agents.py to

env = StarCraft2Env(map_name="HallucinIce")

However, when I run the example Python file, it gives me the following error:

  File "random_agents.py", line 43, in <module>
    main()
  File "random_agents.py", line 10, in main
    env = StarCraft2Env(map_name="HallucinIce")
  File "/Users/g/Desktop/SMAC/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 180, in __init__
    map_params = get_map_params(self.map_name)
  File "/Users/g/Desktop/SMAC/lib/python3.6/site-packages/smac/env/starcraft2/maps/__init__.py", line 10, in get_map_params
    return map_param_registry[map_name]
KeyError: 'HallucinIce'
  • I interpret from the error that map_param_registry cannot find the HallucinIce dictionary entry when get_map_params is called, even though I've added it to the smac_maps.py file. Any idea or guidance on how I could solve this issue? :)

Thanks in advance for the time dedicated to this issue.
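Since the traceback points at the copy of smac installed in site-packages, one likely cause is that the edited smac_maps.py is not the one actually being imported. As an alternative, here is an untested sketch of registering the map at runtime (parameters taken from the issue above; SMACMap is assumed to be the base map class defined in smac_maps.py):

from smac.env.starcraft2.maps import smac_maps

smac_maps.map_param_registry["HallucinIce"] = {
    "n_agents": 4,
    "n_enemies": 6,
    "limit": 180,
    "a_race": "P",
    "b_race": "T",
    "unit_type_bits": 0,
    "map_type": "sentry",
}

# pysc2 discovers maps through subclasses of its Map class, so defining this
# subclass should make the new map name resolvable.
class HallucinIce(smac_maps.SMACMap):
    filename = "HallucinIce"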

graphing smac_run_data.json data - 3s5z data does not match graph from paper

I have not been able to reproduce the graphs based on the .json file available. This is my first time working with json file types so I may be missing something obvious.

In the code, I read in smac_run_data.json and pull out the 5 runs based on a specific map and method. In this case, I use "QMIX" and "3s5z". To get the mean, I use np.mean on the 5 runs for their respective time steps. I then calculate the standard deviation using np.std and multiply by 2 to find my bounds for shading. I am using the 2*std as a ~95% confidence interval.

How did you graph smac_run_data.json?

Here is the code I wrote to graph it.

import pandas as pd
import numpy as np
df = pd.read_json('smac_run_data.json', orient='columns')
# display maps and algorithms available
# print(df.head(10))

# select map and algorithm
row = 'QMIX'
column = '3s5z'
df1 = df.loc[row,column]
df1 = pd.DataFrame(df1)
# select run and mean type
df2 = df1.iloc[[0],[0]]
# other runs can be selected using: 
    # df2 = df1.iloc[[Run_1,Run_2,Run_3,Run_4,Run_5],[test_battle_won_mean, test_return_mean]]
# convert into numpy
df3=pd.DataFrame(df2).to_numpy()
# select the data from the list
df4 = df3[0,0]
# convert the data into numpy workable format
# not sure why this is needed to be done twice
df5=np.asarray(df4)
# select the win ratio for the selected run
df6=df5[:,1]
# pick up the time step indexes
times = df5[:,0]
# create dummy vector to initialize 
length_test=df6.shape
zero_hold = np.zeros(length_test)

for i in range(0,5):
    df2 = df1.iloc[[i],[0]]
    # other runs can be selected using: 
        # df2 = df1.iloc[[Run_1,Run_2,Run_3,Run_4,Run_5],[test_battle_won_mean, test_return_mean]]
    # convert into numpy
    df3=pd.DataFrame(df2).to_numpy()
    # select the data from the list
    df4 = df3[0,0]
    # convert the data into numpy workable format
    # not sure why this is needed to be done twice
    df5=np.asarray(df4)
    # select the win ratio for the selected run
    df6=df5[:,1]

    length_test=df6.shape

    zero_hold = np.vstack((zero_hold,df6))

# delete the zeros place holder array
battle_5_runs = zero_hold[1::]

# calculate the mean and the standard deviation band
battle_std = np.std(battle_5_runs, axis=0) * 2  # 2 standard deviations ~ 95% confidence
battle_mean = np.mean(battle_5_runs, axis=0)
upper_std = battle_mean + battle_std
lower_std = battle_mean - battle_std
np.max(times)


# https://htmlcolorcodes.com/
import matplotlib.pyplot as plt

plt.plot(times, battle_mean, color='#f5b041')
# plt.plot(times, battle_5_runs[0], color='blue') # sanity checking that runs have data
plt.axis([0, np.max(times), 0, 1])
plt.fill_between(times, lower_std, upper_std, facecolor='#f5b041', alpha=0.3)
plt.ticklabel_format(style='sci', axis='x', scilimits=(0,0))
plt.xlabel('T')
plt.ylabel('Test Win Rate')

plt.show()

edit: adding graphs for 3s5z using QMIX
this is the one I generated:
(graph: 3s5z)

this is the one listed in the QMIX publication (Figure 6):
(figure: qmix_fig_6)

SC2 screen message during training

Hi,
Can you please let me know how to get rid of or delete the message (CHEAT local player KillUnitTagBy..) that appears on the game screen during training?
With many thanks
(screenshot: SC2 message)

Not able to run random_agents with PySC2 3.0.0

Hi,
I cannot run the basic random_agents example after following the exact guidelines with PySC2 3.0.0. With version 2.0.2, however, it works fine. The following error is thrown with 3.0.0.

Traceback (most recent call last):
  File "/home/patrick/Projects/sc2-project-thesis/main_smac.py", line 39, in <module>
    main()
  File "/home/patrick/Projects/sc2-project-thesis/main_smac.py", line 15, in main
    env.reset()
  File "/home/patrick/miniconda3/envs/sc2-project-thesis/lib/python3.7/site-packages/smac/env/starcraft2/starcraft2.py", line 338, in reset
    self._launch()
  File "/home/patrick/miniconda3/envs/sc2-project-thesis/lib/python3.7/site-packages/smac/env/starcraft2/starcraft2.py", line 288, in _launch
    window_size=self.window_size)
  File "/home/patrick/miniconda3/envs/sc2-project-thesis/lib/python3.7/site-packages/pysc2/run_configs/platforms.py", line 205, in start
    want_rgb=want_rgb, extra_args=extra_args, **kwargs)
  File "/home/patrick/miniconda3/envs/sc2-project-thesis/lib/python3.7/site-packages/pysc2/run_configs/platforms.py", line 88, in start
    self, exec_path=exec_path, version=self.version, **kwargs)
TypeError: type object got multiple values for keyword argument 'version'

Question on difficulties

Even when I try the maximum difficulty ("7": sc_pb.VeryHard) and the number of agents is larger than the number of enemies ("27m_vs_30m"), the agents win every scenario without training (random actions). Is there a way to adjust the rules of the enemies or to customise maps?

Regarding Relative Representation of Health, Shields, Energy and Cool-down

Hey folks. First things first, great work on the library! We found it easy to adapt to our needs.

This is regarding one of the things we observed and deem worth sharing: namely the normalisation of health, shields, energy and cool-down. It appears that these are normalised by their maximum value, and while I see why it's a reasonable design decision, it distorts the game-specified notion of hit points.

For instance, a Colossus has 200 health points, while a Stalker has 80. When normalised by max health, a delta of 0.06 in the (relative) health corresponds to 5 health points for a Stalker, but 12 for a Colossus. This might be a significant loss of information for an agent, given that unit attacks are measured in damage per second (i.e. delta in hit points per second, ignoring splash damage).

One solution is to simply scale health-points by a fixed number (e.g. 1/45, such that a healthy Marine has health 1). This is appealing because it allows a (hypothetical) agent to compare the health of two or more different unit types simply by considering the difference between the corresponding inputs (same applies for shields, energy and cool-down). Further, SC2 has rich pairings between unit types, in the sense that some unit types are particularly effective against (or vulnerable to) other unit types. Having absolute measures might help the agent better exploit these.

While this issue is just a FYI, it might be worth considering to give the user the option to specify the normalisation (defaulting to normalisation by max, if need be).

Thanks. :)
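For anyone who needs absolute values today, a small sketch of recovering them outside SMAC (get_unit_by_id returns the raw SC2 unit data; dividing by a fixed 45.0 is the constant suggested above, not something SMAC itself provides):

from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="2s3z")
env.reset()

unit = env.get_unit_by_id(0)     # raw unit data for agent 0
abs_health = unit.health / 45.0  # on this scale, a full-health Marine would read as 1.0
abs_shield = unit.shield / 45.0
env.close()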

Have problem creating new maps

Hi,

When using the StarCraft II map editor to load your maps (e.g. 3m.SC2Map), I only get an empty map with a TEAM1 and a TEAM2 flag on each side. I wonder if I am missing something.

Best.
C

Add new units into a map

I want to add some new RL units to a map with the editor. The second-to-last paragraph of the documentation mentions the difference between "RL" and regular units. I tried to modify the corresponding options in the editor, but it doesn't work: smac gets stuck forever at 'Sending ResponseJoinGame'... Any suggestions?

How to set visualize the game?

When I used "python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z",I want to visualize the game to see how agents work.
But i don't find any parameters can do this.

Run-time spawn support

With the help of SC2 Editor, we can create units during one episode of the game. It seems smac only supports static spawn currently. I mean we must specify the number of agents and the number of enemies in the map configuration like this:

"3m": {
        "n_agents": 3,
        "n_enemies": 3,
        "limit": 60,
        "a_race": "T",
        "b_race": "T",
        "unit_type_bits": 0,
        "map_type": "marines",
    }

And we have to spawn all the units for an episode at the very beginning.

Is it possible to create some interfaces to handle dynamically spawned units during the game?

MultiAgentEnv interface documentation

I'm trying to inherit MultiAgentEnv to test a custom environment with QMIX (using pymarl); however the documentation for the methods does not specify the types returned by the functions, so I am unsure of what exactly they should return or whether they are all used.

It's also unclear whether all functions need to be implemented; in particular, whether functions that have both a "global" version and an _agent version (for example get_obs and get_obs_agent) must all be implemented, but also render etc.

Would it be possible to add some more info on how to extend MultiAgentEnv? A clear example can also work, but in general I guess it's hard to understand from Python code the exact types so I'm not sure whether it would be enough.
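In the meantime, a hedged skeleton of the return types pymarl's runners appear to rely on (method names follow smac/env/multiagentenv.py, but double-check against the installed version; the sizes below are made up):

import numpy as np
from smac.env import MultiAgentEnv


class MyEnv(MultiAgentEnv):
    def __init__(self):
        self.n_agents = 2
        self.episode_limit = 50

    def reset(self):
        # called at the start of every episode
        return self.get_obs(), self.get_state()

    def step(self, actions):
        # actions: one int per agent -> (team_reward: float, terminated: bool, info: dict)
        return 0.0, False, {}

    def get_obs(self):
        return [self.get_obs_agent(i) for i in range(self.n_agents)]

    def get_obs_agent(self, agent_id):
        return np.zeros(self.get_obs_size(), dtype=np.float32)

    def get_obs_size(self):
        return 4

    def get_state(self):
        return np.zeros(self.get_state_size(), dtype=np.float32)

    def get_state_size(self):
        return 8

    def get_avail_actions(self):
        return [self.get_avail_agent_actions(i) for i in range(self.n_agents)]

    def get_avail_agent_actions(self, agent_id):
        return [1] * self.get_total_actions()  # 1 = available, 0 = masked out

    def get_total_actions(self):
        return 3

    def render(self):
        pass

    def close(self):
        pass

    def seed(self):
        pass

    def save_replay(self):
        pass

The base class appears to implement get_env_info() on top of these, so with the methods above the batch construction should at least have the shapes it needs.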

Watching a replay

Is the only way to watch a replay from SMAC to play it in the Windows game? Is there no way to watch it with simplified graphics as in pysc2?
What is the meaning of the "watch a replay" section in the README then?
I tried to naively watch a replay as you do in pysc2 and got the following error:

File "/usr/local/lib/python3.7/site-packages/pysc2/lib/stopwatch.py", line 212, in _stopwatch
    return func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/pysc2/lib/features.py", line 305, in color
    return self.palette[plane]
IndexError: index 1970 is out of bounds for axis 0 with size 1962

Reward range

Hi, may I ask what the reward range of the 3m map environment is?

Why are unit type IDs handled differently between allied and enemy units?

I have a question concerning the get_unit_type_id() method. I noticed that there is a different procedure to generate unit types for allied and enemy units. This results in units of the same type (inside SC2) having different type IDs returned by get_unit_type_id(), depending on whether they are allies or enemies. However, this only happens in some maps, while other maps keep consistent unit type IDs for allies and enemies (I believe by coincidence rather than on purpose). For example:

  • in maps 2s3z and 3s5z_vs_3s6z, allied stalkers are represented (in one-hot encoding) as 10 and zealots as 01, while enemy stalkers are represented as 01 and zealots as 10.
  • In MMM2, Marauders, Marines and Medivacs are labeled as 001, 010 and 100 for both allied and enemy units.

Is there a reason why allied and enemy unit type IDs are treated differently? Would there be any side effects if they were to be handled in the same way?

How to judge win or lose

Hi, I'm wondering how to judge a win or a loss.

I judge it by comparing the cumulative reward with 19.9; that is, if the cumulative reward > 19.9, I consider it a win. Is that right?

I don't use 20 because I find that the cumulative reward is sometimes something like 19.9999999.

Looking forward to your reply.
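For what it's worth, the info dict returned by step() carries a battle_won flag at the end of an episode, which is a more direct check than thresholding the cumulative reward; a small sketch with random actions:

import numpy as np
from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="3m")
env.reset()

terminated, info = False, {}
while not terminated:
    actions = []
    for agent_id in range(env.n_agents):
        avail = np.nonzero(env.get_avail_agent_actions(agent_id))[0]
        actions.append(np.random.choice(avail))
    reward, terminated, info = env.step(actions)

print("battle won:", info.get("battle_won", False))
env.close()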

How to stop an episode after a number of steps and still receive negative reward

Hi there. I was wondering if there is a way to end an episode after a certain number of steps, ensuring the agents receive the proper negative reward for failing. I was doing it manually by counting the number of steps in my own loop and calling env.reset() after a set number of steps, but this way I don't get a properly subtracted/scaled reward from the environment.

pathing_grid issue

The new version of SC2 is causing problems for several APIs (smac included). The first problem caused by the update seems to be related to the pathing grid's shape (data values have been turned into bits).

bits_per_pixel: 1
size {
  x: 32
  y: 32
}
data: "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374?\377\377\374\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-bbe461e77b17> in <module>
      5 
      6 for e in range(n_episodes):
----> 7     env.reset()
      8     terminated = False
      9     episode_reward = 0

~\.conda\envs\rl\lib\site-packages\smac\env\starcraft2\starcraft2.py in reset(self)
    320         if self._episode_count == 0:
    321             # Launch StarCraft II
--> 322             self._launch()
    323         else:
    324             self._restart()

~\.conda\envs\rl\lib\site-packages\smac\env\starcraft2\starcraft2.py in _launch(self)
    311         self.pathing_grid = np.flip(
    312             np.transpose(np.array(list(map_info.pathing_grid.data))
--> 313                 .reshape(self.map_x, self.map_y)), 1) / 255
    314 
    315     def reset(self):

ValueError: cannot reshape array of size 128 into shape (32,32)

Other users are encountering similar problems when using python-sc2: Dentosal/python-sc2#283

Things were working well until StarCraft2 version 4.8.4.
In 4.8.5, the pixelmap changed and self.state.upgrades stopped working. Thus, self.already_pending_upgrade(UPGRADEID) stopped working if the upgrade was completely researched (should return 1, but returns 0 instead).
self.state.upgrades was fixed in SC2 version 4.9.0.

The master branch of this library is working for SC2 version 4.8.4, while the develop branch fixed the issues with pixelmap introduced in 4.8.5 (and is the same in 4.8.6). Pixelmaps were starting in the top left but are now starting in the bottom left, so they were flipped in y-axis. Also pixelmap values changed from bytes to bits (like pathing grid, placement grid, and in 4.9.0 also creep map).
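For what it's worth, a self-contained sketch of the decoding idea for the 1-bit pathing grid (variable names are made up; the actual fix in smac may differ): 32 x 32 pixels at 1 bit per pixel is exactly the 128 bytes reported in the error, so the bytes have to be unpacked into bits before reshaping:

import numpy as np

map_x, map_y = 32, 32
raw = bytes([0b00111100] * 128)  # stand-in for map_info.pathing_grid.data

bits = np.unpackbits(np.frombuffer(raw, dtype=np.uint8))
pathing_grid = np.flip(np.transpose(bits.reshape(map_x, map_y)), 1)
print(pathing_grid.shape)  # (32, 32)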

RLLIB QMIX example does not work

There appears to be a problem when using a masked action space with the QMIX algorithm. I think the qmix_policy_graph expects there to be at least one valid action at all times.

Full traceback is below:

  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 446, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 316, in fetch_result
    result = ray.get(trial_future[0])
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/worker.py", line 2197, in get
    raise value
ray.exceptions.RayTaskError: ^[[36mray_QMixTrainer:train()^[[39m (pid=25398, host=cassini)
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 354, in train
    raise e
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 340, in train
    result = Trainable.train(self)
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/tune/trainable.py", line 151, in train
    result = self._train()
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/rllib/agents/dqn/dqn.py", line 242, in _train
    self.optimizer.step()
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/rllib/optimizers/sync_batch_replay_optimizer.py", line 84, in step
    return self._optimize()
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/rllib/optimizers/sync_batch_replay_optimizer.py", line 108, in _optimize
    info_dict = self.local_evaluator.learn_on_batch(samples)
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/rllib/evaluation/policy_evaluator.py", line 581, in learn_on_batch
    info_out[pid] = policy.learn_on_batch(batch)
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/rllib/agents/qmix/qmix_policy_graph.py", line 296, in learn_on_batch
    next_obs, action_mask, next_action_mask)
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/johnson/miniconda3/envs/rlenv/lib/python3.6/site-packages/ray/rllib/agents/qmix/qmix_policy_graph.py", line 108, in forward
    there may be a state with no valid actions."
AssertionError: target_max_qvals contains a masked action; there may be a state with no valid actions.

"Failed to connect to the SC2 websocket. Is it up?"

[user@localhost ~]$ python -m smac.examples.random_agents
/home/user/SMAC/StarCraftII/Versions/Base75689/SC2_x64: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by /home/user/SMAC/StarCraftII/Libs/libstdc++.so.6)
WARNING:absl:SC2 isn't running, so bailing early on the websocket connection.
Traceback (most recent call last):
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/site-packages/smac/examples/random_agents.py", line 43, in <module>
    main()
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/site-packages/smac/examples/random_agents.py", line 19, in main
    env.reset()
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 347, in reset
    self._launch()
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 297, in _launch
    self._sc2_proc = self._run_config.start(window_size=self.window_size, want_rgb=False)
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/site-packages/pysc2/run_configs/platforms.py", line 205, in start
    want_rgb=want_rgb, extra_args=extra_args, **kwargs)
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/site-packages/pysc2/run_configs/platforms.py", line 88, in start
    self, exec_path=exec_path, version=self.version, **kwargs)
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/site-packages/pysc2/lib/sc_process.py", line 143, in __init__
    self._host, self._port, self, timeout_seconds=timeout_seconds)
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/site-packages/pysc2/lib/remote_controller.py", line 146, in __init__
    sock = self._connect(host, port, proc, timeout_seconds)
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/site-packages/pysc2/lib/stopwatch.py", line 212, in _stopwatch
    return func(*args, **kwargs)
  File "/home/user/.conda/envs/hjy_torch/lib/python3.6/site-packages/pysc2/lib/remote_controller.py", line 180, in _connect
    raise ConnectError("Failed to connect to the SC2 websocket. Is it up?")
pysc2.lib.remote_controller.ConnectError: Failed to connect to the SC2 websocket. Is it up?

Can someone tell me how to run SC2?

ubuntu failed to find EGL functions in library file!

First, I'd like to thank you for this release, which helps me a lot in my research.

However, when I run the command "python -m smac.examples.random_agents", the output shows "Attempting to initialize EGL from file libEGL.so.1 ..." followed by "Failed to find EGL functions in library file!"

competitive scenario

Does SMAC support competitive MARL? That is, I want to implement two MARL algorithms that compete with each other, rather than having the user's agents compete against the built-in AI. If so, is there any guide on how to modify the code? Thanks!

Disable the text messages

Hi, I want to disable the text messages (e.g. "KillUnitBtTag 46") because I need to show the performance of my algorithm in a video. Can you tell me how to do this?
