kajune / kodoku

Multi-agent Self-Play Reinforcement Learning Library

License: MIT License

Python 100.00%
reinforcement-learning rllib self-play pytorch tensorflow multi-agent-reinforcement-learning

KODOKU

Multi-agent Reinforcement Learning Library

KODOKU is a framework-style wrapper library around RLlib (https://github.com/ray-project/ray) that makes it easier to implement complicated multi-agent training schemes.

Simple Multi-agent Training

Multi-agent training is handled by KODOKUTrainer.

import json


def config_fn():
	return \
		'default', \
		{
			"depth": 2.0,
			"width": 1.0,
			"atk_spawn_line": 1.5,
			"def_spawn_line": 0.5,
			"atk_num" : 3,
			"def_num" : 3,
			"unit_hp" : 1.0,
			"unit_power": 0.1,
			"unit_range": 0.1,
			"unit_speed": 0.05,
			"timelimit": 500,
		}


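if __name__ == '__main__':
	# KODOKUTrainer, SimpleBattlefieldEnv_Sym and callback are assumed to be
	# imported or defined as in sample/main.py.
	with open('train_config.json') as f:
		train_config = json.load(f)

	trainer = KODOKUTrainer(
		log_dir='./log_dir',
		env_class=SimpleBattlefieldEnv_Sym,
		train_config=train_config,
		env_config_fn=config_fn,
	)

	# Train, then evaluate the trained policies.
	trainer.train(10, epoch_callback=callback)
	trainer.evaluate()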

An example is provided in sample/main.py.

Self-Play Training

Simple self-play is applied if policy_mapping_fn returns the same policy for all agents.

def policy_mapping_fn(agent_id, episode, **kwargs):
	return "common"
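
To see why this amounts to self-play, the short sketch below applies the same mapping function to a handful of agent IDs. The agent names are made up for illustration, and `episode` is passed as `None` only because this particular function ignores it.

```python
def policy_mapping_fn(agent_id, episode, **kwargs):
    # Every agent, on either side, is controlled by the single policy "common".
    return "common"


# Hypothetical agent IDs, used only for this illustration.
agent_ids = ["atk_0", "atk_1", "def_0", "def_1"]

assignment = {
    agent_id: policy_mapping_fn(agent_id, episode=None) for agent_id in agent_ids
}
distinct_policies = set(assignment.values())

print(assignment)
print(distinct_policies)
```

Because attackers and defenders share one set of weights, the policy is always training against its own current behaviour.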

Fictitious Self-Play Training (FSP)

Fictitious self-play can be implemented easily via PolicyMappingManager; the example below uses SelfPlayManager.

trainer = KODOKUTrainer(
	log_dir='./log_dir',
	env_class=SimpleBattlefieldEnv_Sym,
	train_config=json.load(open('train_config.json')),
	env_config_fn=config_fn,
	# Three subpolicies for each policy
	policy_mapping_manager=SelfPlayManager(lambda agent: "blufor" if agent.startswith("atk") else "redfor", 3),
)

An example is provided in sample/main_sym.py.
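
As a rough illustration of what such a manager has to do, the sketch below maps each agent to its force and then picks one of that force's subpolicies uniformly at random for an episode. It does not use KODOKU's API: `force_of`, `select_subpolicies`, and the `blufor_0`-style subpolicy names are assumptions made for this example, and the real SelfPlayManager may name and choose subpolicies differently.

```python
import random


def force_of(agent):
    # Same rule as the lambda above: attackers are "blufor", everyone else "redfor".
    return "blufor" if agent.startswith("atk") else "redfor"


def select_subpolicies(agents, num_subpolicies, rng):
    """Pick one subpolicy index per force, then map every agent to it.

    The "<force>_<index>" naming scheme is hypothetical.
    """
    forces = sorted({force_of(agent) for agent in agents})
    chosen = {force: rng.randrange(num_subpolicies) for force in forces}
    return {agent: f"{force_of(agent)}_{chosen[force_of(agent)]}" for agent in agents}


rng = random.Random(0)  # seeded so the sketch is reproducible
agents = ["atk_0", "atk_1", "def_0", "def_1"]  # made-up agent IDs
mapping = select_subpolicies(agents, num_subpolicies=3, rng=rng)
print(mapping)
```

Agents of the same force always share the subpolicy drawn for that episode, while the two forces draw independently.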

You can extend FSP to PFSP or other variants by subclassing SelfPlayManager and overriding policy_selection.
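
For instance, prioritized fictitious self-play (PFSP) samples opponents with a probability that grows as the learner's win rate against them falls. The sketch below shows only that weighting, using one common choice, (1 - win_rate)^p. The function name `pfsp_probabilities`, its arguments, and the win-rate bookkeeping are assumptions for illustration; the signature of policy_selection is not shown in this README, so the override itself is left out.

```python
def pfsp_probabilities(win_rates, p=2.0):
    """Turn per-opponent win rates into sampling probabilities.

    win_rates[i] is the learner's win rate against opponent i, in [0, 1].
    Opponents the learner beats less often receive more probability mass.
    """
    weights = [(1.0 - w) ** p for w in win_rates]
    total = sum(weights)
    if total == 0.0:
        # The learner beats every opponent all the time: fall back to uniform.
        return [1.0 / len(win_rates)] * len(win_rates)
    return [w / total for w in weights]


# Hypothetical win rates against three stored subpolicies.
probs = pfsp_probabilities([0.9, 0.5, 0.1])
print(probs)
```

A subclass could use probabilities like these inside its selection logic, for example with `random.choices(candidates, weights=probs)`.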

Asymmetric Fictitious Self-play Training

FSP can also be applied when the environment is asymmetric.

An example is provided in sample/main_asym.py.

Win or Learn Fast (WoLF)

WoLF is a technique that stabilizes asymmetric competitive multi-agent training by scaling the learning rate based on payoff. In this framework, WoLF is realized via lr_schedule. You can still use a scheduler as usual, because any existing scheduler is wrapped by ScheduleScaler.

trainer = KODOKUTrainer(
	log_dir='./log_dir',
	env_class=SimpleBattlefieldEnv_Asym,
	# Note: train_config may have lr_schedule as usual.
	train_config=json.load(open('train_config.json')),
	env_config_fn=config_fn,
	policy_mapping_manager=SelfPlayManager(
		agent_force_mapping_fn=lambda agent: "blufor" if agent.startswith("atk") else "redfor",
		wolf_fn=lambda reward: 0.25 if reward > 0 else 1.0)
)

An example is provided in sample/main_wolf.py.
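
The effect of wolf_fn on the learning rate can be sketched without RLlib. Below, a hypothetical helper `scaled_lr` multiplies the value of a base schedule by the factor that wolf_fn returns for the current reward, so a side that is currently winning (reward > 0) learns at a quarter of the scheduled rate. The linear `base_lr` schedule and both helper names are assumptions for illustration; ScheduleScaler's actual interface is not shown in this README.

```python
def wolf_fn(reward):
    # Same rule as in the example above: slow down while winning.
    return 0.25 if reward > 0 else 1.0


def base_lr(timestep):
    """Hypothetical base schedule: linear decay from 1e-3 to 1e-4 over 1000 steps."""
    start, end, horizon = 1e-3, 1e-4, 1000
    frac = min(timestep / horizon, 1.0)
    return start + frac * (end - start)


def scaled_lr(timestep, reward):
    # The scheduled learning rate is kept and only multiplied by the WoLF factor.
    return base_lr(timestep) * wolf_fn(reward)


print(scaled_lr(0, reward=1.0))   # winning side: scheduled rate times 0.25
print(scaled_lr(0, reward=-1.0))  # losing side: scheduled rate unchanged
```

Scaling the schedule rather than replacing it is what lets an lr_schedule in train_config keep working alongside WoLF.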
