light_mappo's Introduction

light_mappo

Lightweight version of MAPPO to help you quickly migrate to your local environment.

Background

The original MAPPO code wraps its environments in a way that is too complex, so this project extracts the environment and wraps it directly. This makes it easier to port the MAPPO code into your own project.

Installation

Download the code, create a Conda environment, and run the code, installing packages as they turn out to be needed. A specific package list will be added later.

Usage

import numpy as np


class EnvCore(object):
    """
    Core environment shared by all agents.
    """

    def __init__(self):
        self.agent_num = 2  # number of agents (aircraft); two here
        self.obs_dim = 14  # observation dimension of each agent
        self.action_dim = 5  # action dimension of each agent; five here

    def reset(self):
        """
        Return a list with one observation per agent (self.agent_num entries),
        each of shape (self.obs_dim,).
        """
        sub_agent_obs = []
        for i in range(self.agent_num):
            # Placeholder observation; replace with your environment's real state.
            sub_obs = np.random.random(size=(self.obs_dim,))
            sub_agent_obs.append(sub_obs)
        return sub_agent_obs

    def step(self, actions):
        """
        actions is a list with one entry per agent (self.agent_num entries),
        each of shape (self.action_dim,). With the defaults above, that is
        two entries of shape (5,).
        """
        sub_agent_obs = []
        sub_agent_reward = []
        sub_agent_done = []
        sub_agent_info = []
        for i in range(self.agent_num):
            # Placeholder transition; replace with your environment's dynamics.
            sub_agent_obs.append(np.random.random(size=(self.obs_dim,)))
            sub_agent_reward.append([np.random.rand()])
            sub_agent_done.append(False)
            sub_agent_info.append({})

        return [sub_agent_obs, sub_agent_reward, sub_agent_done, sub_agent_info]

Implement just this part of the code and you can connect seamlessly with MAPPO. On top of env_core.py, two separate files, env_discrete.py and env_continuous.py, wrap the discrete and continuous action spaces respectively. The elif self.continuous_action: branch in algorithms/utils/act.py handles continuous action spaces, as does the code marked # TODO in runner/shared/env_runner.py.
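
Before plugging a custom EnvCore into training, it can help to check that it returns the shapes described above. The following is a minimal sketch, not part of the repository: random_rollout is a hypothetical helper, and the EnvCore here is a compact stand-in with the same interface and random placeholder dynamics.

```python
import numpy as np


class EnvCore(object):
    """Compact stand-in with the same interface as the EnvCore shown above."""

    def __init__(self):
        self.agent_num = 2
        self.obs_dim = 14
        self.action_dim = 5

    def reset(self):
        return [np.random.random(size=(self.obs_dim,)) for _ in range(self.agent_num)]

    def step(self, actions):
        obs = [np.random.random(size=(self.obs_dim,)) for _ in range(self.agent_num)]
        rewards = [[np.random.rand()] for _ in range(self.agent_num)]
        dones = [False for _ in range(self.agent_num)]
        infos = [{} for _ in range(self.agent_num)]
        return [obs, rewards, dones, infos]


def random_rollout(env, num_steps=3):
    """Hypothetical helper: step env with random actions, summing rewards per agent."""
    obs = env.reset()
    totals = np.zeros(env.agent_num)
    for _ in range(num_steps):
        # One action vector of shape (action_dim,) per agent.
        actions = [np.random.random(size=(env.action_dim,)) for _ in range(env.agent_num)]
        obs, rewards, dones, infos = env.step(actions)
        totals += np.array(rewards).reshape(-1)
    return obs, totals


env = EnvCore()
final_obs, total_rewards = random_rollout(env)
```

If final_obs has one entry per agent with shape (obs_dim,) and total_rewards has one entry per agent, the environment follows the list-per-agent convention that the example above uses.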

In train.py, comment out either the continuous environment or the discrete environment to switch the demo environment.
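
To make the discrete versus continuous distinction concrete, here is a rough sketch of how the action passed to step might look in each case. It is an illustration under assumptions, not the repository's code: one_hot_actions and clipped_actions are hypothetical helpers, and both the one-hot encoding and the [-1, 1] bounds are assumed for the example.

```python
import numpy as np


def one_hot_actions(indices, action_dim):
    """Discrete case (assumed encoding): one integer index per agent -> one-hot vectors."""
    return [np.eye(action_dim)[i] for i in indices]


def clipped_actions(raw, low=-1.0, high=1.0):
    """Continuous case (assumed bounds): clip real-valued vectors to [low, high]."""
    return [np.clip(np.asarray(a, dtype=float), low, high) for a in raw]


# Two agents, action_dim = 5, as in the EnvCore example.
discrete = one_hot_actions([3, 0], action_dim=5)
continuous = clipped_actions([[0.2, -1.7, 0.0, 0.9, 2.5], [0.0] * 5])
```

In both cases each agent's action has shape (action_dim,), so the same step signature can serve either kind of action space.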

Related Efforts

  • on-policy - 💌 The original authors' implementation of MAPPO.

Maintainers

@tinyzqh.

Translator

@tianyu-z

License

MIT © tinyzqh

light_mappo's Issues

How to set continuous action

I want to use continuous actions, but an error is reported after setting the self.discrete_action_space in the environment to false.

Using VecEnvWrapper

There is a mistake in env_wrappers.py: VecEnvWrapper is an unresolved reference.

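Viewing training results

After training finishes, how do I view and use the resulting logs and models? Are the logs viewed with TensorBoard, and how do I load a model and test it to see how it performs?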

MAPPO-L

Thanks very much for your code.

Have you considered extending it to other variants of MAPPO, such as MAPPO-L?

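Resetting the environment at the end of an episode changes obs

In env_wrappers.py, step_wait() checks whether the episode has ended and then runs "obs[i] = self.envs[i].reset()". This passes the post-reset observation to obs[i], so the observation from the moment the episode ended is overwritten. Is this assignment appropriate? Since the observation after reset can be regarded as random, it should not be assigned to obs[i]; shouldn't "self.envs[i].reset()" simply be called directly?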
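
One common way to keep both observations is to stash the final observation in info before resetting. The sketch below illustrates that pattern only; it is not the repository's step_wait(). CountingEnv, step_with_auto_reset, and the terminal_observation key are all hypothetical names chosen for the example.

```python
import numpy as np


class CountingEnv:
    """Hypothetical toy env: the episode ends after 3 steps; obs is the step count."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return np.array([0.0])

    def step(self, action):
        self.t += 1
        done = self.t >= 3
        return np.array([float(self.t)]), 1.0, done, {}


def step_with_auto_reset(env, action):
    """Step once; when the episode ends, keep the final obs in info and return the reset obs."""
    obs, reward, done, info = env.step(action)
    if done:
        # Assumed key name; the final observation stays available to the caller.
        info = dict(info, terminal_observation=obs)
        obs = env.reset()
    return obs, reward, done, info


env = CountingEnv()
env.reset()
for _ in range(3):
    obs, reward, done, info = step_with_auto_reset(env, action=None)
```

After the third step, obs holds the first observation of the new episode while info still carries the last observation of the finished one, so neither is lost.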

Action mask?

Hello! If the agents' action dimensions differ, how does light-mappo apply an action mask?

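Problem encountered when adding my own environment and using env_continuous

After modifying the code myself, I chose the continuous env, with agents using separated policies to update actions. However, the collect function in env_runner.py only has the MultiDiscrete and Discrete options, with no Box option. How should this case be handled? Thanks!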

env

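The example provided only has sub_agent_obs. Does this mean that observation information and global state information are not specifically distinguished? Is sub_agent_obs simply the list of the agents' partial observations? If so, how is the global information handled: is the fusion of the partial observations used as the global information?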
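
The fusion the question describes could look like the sketch below, which concatenates the per-agent observations into one joint vector and hands the same vector to every agent. This only illustrates that idea and is an assumption, not a statement of how the repository builds its global state; build_share_obs is a hypothetical helper.

```python
import numpy as np


def build_share_obs(sub_agent_obs):
    """Concatenate per-agent observations; every agent receives the same joint vector."""
    joint = np.concatenate(sub_agent_obs)
    return [joint.copy() for _ in sub_agent_obs]


# Two agents with obs_dim = 14, as in the EnvCore example.
sub_agent_obs = [np.zeros(14), np.ones(14)]
share_obs = build_share_obs(sub_agent_obs)
```

With two agents and obs_dim = 14, each entry of share_obs has shape (28,).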

NotImplementedError when running with use_eval selected

Is it that environments with continuous action spaces cannot use eval?

Traceback (most recent call last):
  File "G:\lcz\mappo\train\train.py", line 149, in <module>
    main(sys.argv[1:])
  File "G:\lcz\mappo\train\train.py", line 137, in main
    runner.run()
  File "G:\lcz\mappo\runner\shared\env_runner.py", line 88, in run
    self.eval(total_num_steps)
  File "C:\Users\ljh99\anaconda3\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "G:\lcz\mappo\runner\shared\env_runner.py", line 183, in eval
    raise NotImplementedError
NotImplementedError

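Error when share_policy is set to False

The error is located in the collect function in runner/separated/env_runner.py, at
actions = np.array(actions).transpose(1, 0, 2)
Initial investigation found that, in the loop above this line, the shape of the action generated when agent_id is 1 differs from the shape generated when agent_id is 0.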
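
The transpose in that line only works when every agent's action array has the same shape. The following sketch, with a hypothetical stack_agent_actions helper and assumed shapes of (n_threads, act_dim) per agent, shows the same stacking with an explicit shape check, so a mismatch is reported clearly instead of failing inside np.array or transpose.

```python
import numpy as np


def stack_agent_actions(actions):
    """Stack per-agent arrays (n_threads, act_dim) into (n_threads, n_agents, act_dim)."""
    shapes = [np.asarray(a).shape for a in actions]
    if len(set(shapes)) != 1:
        raise ValueError(f"per-agent action shapes differ: {shapes}")
    return np.stack([np.asarray(a) for a in actions]).transpose(1, 0, 2)


# Matching shapes: 2 agents, 4 threads, 5-dimensional actions.
ok = stack_agent_actions([np.zeros((4, 5)), np.zeros((4, 5))])

# Mismatched shapes: the second agent produced 3-dimensional actions.
try:
    stack_agent_actions([np.zeros((4, 5)), np.zeros((4, 3))])
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)
```

A check like this can help narrow down which agent_id produced the odd shape before the arrays are combined.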

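Environment is not reset after an episode ends

As the title says: during training, if the environment returns done, the runner does not reset it, which may prevent convergence in some environments.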
