anujmahajanoxf / maven

Submission for MAVEN: Multi-Agent Variational Exploration

Dockerfile 0.79% Shell 0.96% Python 98.26%

maven's Issues

TypeError: type object got multiple values for keyword argument 'version'

I pip-installed the dependencies from PyMARL's requirements.txt on Python 3.6, and I don't think the dependencies are the problem, but I keep getting the following error when I run the experiment python src/main.py --config=noisemix_episode --env-config=sc2 with env_args.map_name=3s5z

Traceback (most recent call last):
  File "/usr/lib64/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mop216/MAVEN/maven_code/src/runners/parallel_runner.py", line 311, in env_worker
    env = env_fn.x()
  File "/home/mop216/MAVEN/maven_code/src/envs/__init__.py", line 7, in env_fn
    return env(**kwargs)
  File "/home/mop216/MAVEN/maven_code/src/envs/starcraft2/starcraft2.py", line 142, in __init__
    self._launch()
  File "/home/mop216/MAVEN/maven_code/src/envs/starcraft2/starcraft2.py", line 206, in _launch
    self._sc2_proc = self._run_config.start(version=self.game_version, window_size=self.window_size)
  File "/home/mop216/MAVENN/lib64/python3.6/site-packages/pysc2/run_configs/platforms.py", line 205, in start
    want_rgb=want_rgb, extra_args=extra_args, **kwargs)
  File "/home/mop216/MAVENN/lib64/python3.6/site-packages/pysc2/run_configs/platforms.py", line 88, in start
    self, exec_path=exec_path, version=self.version, **kwargs)
TypeError: type object got multiple values for keyword argument 'version'

Please note that the --config=noise_qmix_parallel --env-config=nmatrix configuration runs properly and uses the nmatrix environment. I think something is going wrong in starcraft2.py, but I cannot figure out what.
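
For readers hitting the same traceback: the message is Python's generic complaint when one keyword reaches a function twice, once explicitly and once through **kwargs. The last frame above passes version=self.version together with **kwargs, while the call in starcraft2.py also passes version=..., which would be consistent with a mismatch between the installed pysc2 and the one this code was written against (an assumption; nothing in this thread confirms it). A minimal sketch of the mechanism, using hypothetical stand-in names (launch, FakeRunConfig) rather than the real pysc2 API:

```python
def launch(exec_path, version=None, **kwargs):
    """Hypothetical stand-in for the innermost start() call in the traceback."""
    return {"exec_path": exec_path, "version": version, **kwargs}


class FakeRunConfig:
    """Hypothetical stand-in for a run config that already knows its version."""

    def __init__(self, version):
        self.version = version

    def start(self, **kwargs):
        # 'version' is passed explicitly here AND may also arrive in kwargs.
        return launch(exec_path="/hypothetical/SC2", version=self.version, **kwargs)


config = FakeRunConfig(version="4.6.2")

# Mirrors the failing call: the caller supplies 'version' a second time.
try:
    config.start(version="4.6.2", window_size=(640, 480))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Leaving 'version' out of the call avoids the clash in this sketch.
result = config.start(window_size=(640, 480))
```

Whether dropping the argument is the right change for MAVEN depends on which pysc2 release the repository expects; installing the release the authors used is the other option.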

Help for the configuration of nstep_matrix game.

Hello,

Can you share MAVEN's .yaml file for the n-step matrix game in the paper?

Some parameters are missing in plot_keys/keys_matrix_games, and I'm not sure my current settings are the same as yours.

Thank you very much!

How can I reproduce the results of Corridor

Hi, which config and command should I use to reproduce the Corridor results?

Could you please give a baseline model so that we could see the result intuitively?

Hi,
I just set up the environment and everything worked fine! Thanks for the great work~

But I wonder if you could provide the baseline model so we could use it without retraining, because training on those super hard maps takes a really long time.

Thanks again, and I would be glad to get your reply~

The running configurations in the paper

Hello,

Thanks for your code. I'm running it with noisemix_smac.yaml in algs and sc2.yaml in envs.

However, it is hard to reproduce MAVEN's results on corridor and 6h_vs_8z.

Could you please provide the running configurations used for MAVEN in your paper?

Thanks!

Gradient error on noisemix

I am trying to reproduce your results and I have an issue running the code (not running in Docker). The backpropagation fails with the following error:

[ERROR 13:59:02] pymarl Failed after 0:00:23!
Traceback (most recent calls WITHOUT Sacred internals):
  File "src/main.py", line 34, in my_main
    run(_run, _config, _log)
  File "/home/USER/.tmp/MAVEN/maven_code/src/run.py", line 48, in run
    run_sequential(args=args, logger=logger)
  File "/home/USER/.tmp/MAVEN/maven_code/src/run.py", line 181, in run_sequential
    learner.train(episode_sample, runner.t_env, episode)
  File "/home/USER/.tmp/MAVEN/maven_code/src/learners/noise_q_learner.py", line 168, in train
    loss.backward()
  File "/home/USER/.general_env/lib/python3.8/site-packages/torch/tensor.py", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/USER/.general_env/lib/python3.8/site-packages/torch/autograd/__init__.py", line 97, in backward
    Variable._execution_engine.run_backward(
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [32, 56, 3, 9]], which is output 0 of SliceBackward, is at version 2; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

During handling of the above exception, another exception occurred:

Traceback (most recent calls WITHOUT Sacred internals):
  File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/lib/python3.8/subprocess.py", line 1079, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.8/subprocess.py", line 1796, in _wait
    raise TimeoutExpired(self.args, timeout)
subprocess.TimeoutExpired: Command '['tee', '-a', '/tmp/tmp8fr28bt_']' timed out after 1 seconds

I have spent some time searching for the error and came up with the following:

In the noise QLearner, the latest version of PyMARL shows that the target_max_qvals should be computed from a detached version of mac_out in the train function, section "Max over target Q-Values"
Here I am not sure at all, but computing q_softmax_actions with the detached mac_out removes the backprop error. However, I don't know if the MI loss is correctly backpropagated afterwards.

This is what I changed to get the code running, in the train function of src/learners/noise_q_learner.py:

        # Max over target Q-Values
        if self.args.double_q:
            mac_out_detach = mac_out.clone().detach()
            mac_out_detach[avail_actions == 0] = -9999999
            cur_max_actions = mac_out_detach[:, 1:].max(dim=3, keepdim=True)[1]
            target_max_qvals = torch.gather(
                target_mac_out, 3, cur_max_actions
            ).squeeze(3)
            # Get actions that maximise live Q (for double q-learning)
            #mac_out[avail_actions == 0] = -9999999
            #cur_max_actions = mac_out[:, 1:].max(dim=3, keepdim=True)[1]
            #target_max_qvals = th.gather(target_mac_out, 3, cur_max_actions).squeeze(3)
        else:
            target_max_qvals = target_mac_out.max(dim=3)[0]

        # Discriminator
        mac_out_detach = mac_out.clone().detach()
        mac_out_detach[avail_actions == 0] = -9999999
        q_softmax_actions = torch.nn.functional.softmax(
            mac_out_detach[:, :-1], dim=3
        )
        #mac_out[avail_actions == 0] = -9999999
        #q_softmax_actions = th.nn.functional.softmax(mac_out[:, :-1], dim=3)

Can you tell me if these changes are ok and if not how the gradient propagation should be fixed? I assume the proper way to fix the discriminator backprop problem would be to remove the part of the target that is in line with unavailable actions. I will keep looking for that in the code but I would really like to have your input.
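
A note on the mechanism, as a sketch rather than the authors' fix: autograd raises this error when a tensor it saved for the backward pass is later written in place, which is what a masked assignment such as mac_out[avail_actions == 0] = -9999999 does. Masking a clone avoids the in-place write on the saved tensor. The difference between clone() and clone().detach() matters for the question above: a detached copy carries no gradient, so any loss computed from it cannot update the network through that path. The toy tensors and function names below are assumptions for illustration, not code from the repository:

```python
import torch

MASK_VALUE = -9999999.0


def backward_with_inplace_mask():
    """In-place masking after the tensor was saved for backward: raises."""
    logits = torch.randn(2, 3, requires_grad=True)
    q = logits * 2.0                      # non-leaf tensor, playing the role of mac_out
    avail = torch.tensor([[1, 0, 1], [1, 1, 0]])
    chosen = (q ** 2).sum()               # pow saves q for its backward pass
    q[avail == 0] = MASK_VALUE            # in-place write bumps q's version counter
    chosen.backward()                     # RuntimeError is raised here


def backward_with_clone_mask():
    """Mask a clone instead: no error, and gradients still flow."""
    logits = torch.randn(2, 3, requires_grad=True)
    q = logits * 2.0
    avail = torch.tensor([[1, 0, 1], [1, 1, 0]])
    chosen = (q ** 2).sum()
    q_masked = q.clone()                  # clone() stays in the autograd graph
    q_masked[avail == 0] = MASK_VALUE     # in-place write touches only the clone
    probs = torch.softmax(q_masked, dim=1)
    (chosen + probs[:, 0].sum()).backward()
    # A detached copy, by contrast, is cut off from the graph.
    detached = q.clone().detach()
    return logits.grad, q_masked.requires_grad, detached.requires_grad
```

Calling torch.autograd.set_detect_anomaly(True) before training, as the error hint suggests, reports which forward operation saved the tensor that was later modified. Whether the discriminator term in MAVEN should receive gradients through the masked Q-values is a question for the authors.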

Docker setup

Thanks for the very inspiring paper and code. Following your suggestion, I tried to run your code in the provided Docker image. However, when I run
bash run.sh $GPU python3 src/main.py --config=noisemix_episode --env-config=sc2 with env_args.map_name=3s5z,
I get the following error:
Launching container named 'iliu3_python3_1204202011_0_SlUA' on GPU '0'
standard_init_linux.go:207: exec user process caused "exec format error".

I am able to run the code from the pymarl repo (https://github.com/oxwhirl/pymarl) with the Docker image they provide.

Thanks.

Training 6h_vs_8z failed

Hi, I tried to run the code on 6h_vs_8z and it failed. The error points to starcraft2.py. I also found that 6h_vs_8z is not a registered game in your code. Could you please help me fix it?

PySC2==3.0

Traceback (most recent call last):
  File "/home/me/anaconda3/envs/sc_ray/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/me/anaconda3/envs/sc_ray/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/me/maven/maven_code/src/runners/parallel_runner.py", line 333, in env_worker
    env.reset()
  File "/home/me/maven/maven_code/src/envs/starcraft2/starcraft2.py", line 270, in reset
    return self.get_obs(), self.get_state()
  File "/home/me/maven/maven_code/src/envs/starcraft2/starcraft2.py", line 878, in get_state
    ally_state[al_id, 1] = al_unit.weapon_cooldown / max_cd # cooldown
TypeError: unsupported operand type(s) for /: 'float' and 'NoneType'
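
The last frame divides by max_cd, which is None in this run. A minimal sketch of a guard that turns the failure into a readable error; the helper name normalised_cooldown and the idea that max_cd comes from a per-map lookup are assumptions for illustration, not the repository's code:

```python
def normalised_cooldown(weapon_cooldown, max_cd):
    """Hypothetical helper: scale a unit's cooldown by the map's maximum."""
    if max_cd is None:
        # Fail early with a message that points at the likely cause.
        raise ValueError(
            "max_cd is None; the map may be missing a cooldown entry "
            "(for example, an unregistered map)"
        )
    return weapon_cooldown / max_cd


ratio = normalised_cooldown(7.5, 15.0)

try:
    normalised_cooldown(7.5, None)
    guard_message = None
except ValueError as exc:
    guard_message = str(exc)
```

A guard like this does not fix the missing map entry; it only makes the cause easier to spot than the bare TypeError.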

run error

Hi, could you please help me solve this problem? It happens when I test the code with 'python3 src/main.py --config=noisemix_episode --env-config=sc2 with env_args.map_name=3s5z':

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [32, 132, 8, 14]], which is output 0 of SliceBackward, is at version 2; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

the code of 4step and 3step map

May I have the code for the "3step" and "4step" environments? I can only find the map files here, and the reward and environment configuration of the two maps is given in neither the paper nor the code.

Error when loading environment

Hi, I saw the following error when the code runs:

"Process Process-1:
Traceback (most recent call last):
  File "/home/jovyan/miniconda3/envs/pytorch/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/jovyan/miniconda3/envs/pytorch/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/jovyan/MAVEN/maven_code/src/runners/parallel_runner.py", line 311, in env_worker
    env = env_fn.x()
  File "/home/jovyan/MAVEN/maven_code/src/envs/__init__.py", line 7, in env_fn
    return env(**kwargs)
  File "/home/jovyan/MAVEN/maven_code/src/envs/starcraft2/starcraft2.py", line 142, in __init__
    self._launch()
  File "/home/jovyan/MAVEN/maven_code/src/envs/starcraft2/starcraft2.py", line 206, in _launch
    self._sc2_proc = self._run_config.start(version=self.game_version, window_size=self.window_size)
  File "/home/jovyan/miniconda3/envs/pytorch/lib/python3.6/site-packages/pysc2/run_configs/platforms.py", line 205, in start
    want_rgb=want_rgb, extra_args=extra_args, **kwargs)
  File "/home/jovyan/miniconda3/envs/pytorch/lib/python3.6/site-packages/pysc2/run_configs/platforms.py", line 88, in start
    self, exec_path=exec_path, version=self.version, **kwargs)
TypeError: type object got multiple values for keyword argument 'version'"

Do you have any idea what causes this? Also, can you share your requirements.txt so I know which versions of the tools I should use? Many thanks!
