This project forked from bladetransformerllc/overcookedgpt

License: MIT License

OvercookedGPT (WIP)

An OpenAI gym environment, based on gym-cooking [1], for evaluating the ability of large language models (LLMs; e.g., GPT-4) to perform long-horizon reasoning and task planning in dynamic multi-agent settings.

Watch the video
https://www.youtube.com/watch?v=4LmcpkS53Wg

Introduction

An emerging area of AI research uses foundation models such as LLMs for decision making in complex environments that involve long-horizon reasoning, control, and planning [2]. For instance, Text2Motion [3] enables robots to solve sequential manipulation tasks by using LLMs. OpenAI's GPT-4 also performs well on theory-of-mind (ToM) tasks [6], which require understanding other agents' beliefs, goals, and mental states.

OvercookedGPT is an interactive 2D game environment in which OpenAI's GPT-4/3.5-Turbo generates intertemporal, sequential tasks in a centralized fashion to control multiple agents toward a goal in a simulation (i.e., cooking food in a kitchen). It is based on gym-cooking [1] and was also inspired by overcooked_ai [4] (which is used in [5]). The purpose of this simulator is to evaluate the ability of LLMs to perform long-horizon reasoning and task planning in dynamic multi-agent environments. To this end, in-context learning (i.e., few-shot learning with the prompt-engineering methods chain-of-thought (CoT) and PAL [7]) is used to guide the LLM to generate a task queue in Python, which the simulator executes on the fly.
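To make the idea of an LLM-generated task queue more concrete, here is a minimal sketch of how such a queue might be represented and consumed. It is an illustration only, not the project's actual implementation: the `Task` class, the action names (`pickup`, `chop`, `deliver`), and the `run_queue` helper are all hypothetical and do not come from the OvercookedGPT code base.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One queued subtask (hypothetical structure, not the project's own)."""
    agent_id: int
    action: str
    target: str


def describe(task):
    """Stand-in for a simulator call: return a description of the step."""
    return f"agent {task.agent_id}: {task.action} {task.target}"


def run_queue(tasks, handlers):
    """Pop tasks in FIFO order and dispatch each one to its action handler."""
    queue = deque(tasks)
    log = []
    while queue:
        task = queue.popleft()
        handler = handlers[task.action]  # KeyError signals an unknown action
        log.append(handler(task))
    return log


# Hypothetical action vocabulary; a real simulator would move agents instead.
HANDLERS = {"pickup": describe, "chop": describe, "deliver": describe}

# The kind of plan an LLM could be prompted to emit for a two-agent salad task.
plan = [
    Task(1, "pickup", "tomato"),
    Task(1, "chop", "tomato"),
    Task(2, "pickup", "lettuce"),
    Task(2, "chop", "lettuce"),
    Task(1, "deliver", "salad"),
]

log = run_queue(plan, HANDLERS)
for line in log:
    print(line)
```

In this sketch the queue is executed strictly one task at a time, which matches the centralized, sequential control described above.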

Installation

python3 -m pip install -U pygame --user
git clone https://github.com/BladeTransformerLLC/OvercookedGPT.git
cd OvercookedGPT
pip3 install -r requirements.txt

Set the OPENAI_API_KEY environment variable (alternatively, put the key string in utils/chatgpt/openai.json).
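The lookup order described above (environment variable first, JSON file as a fallback) could be sketched as follows. This is an assumption-laden illustration rather than the project's own loader: the `load_api_key` function is hypothetical, and because the layout of openai.json is not documented here, the sketch accepts either a bare JSON string or an object with a hypothetical `api_key` field.

```python
import json
import os
from pathlib import Path


def load_api_key(json_path="utils/chatgpt/openai.json", env=None):
    """Return the API key from the environment or, failing that, a JSON file.

    Hypothetical helper: the real project may load the key differently.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if key:
        return key

    path = Path(json_path)
    if path.is_file():
        data = json.loads(path.read_text(encoding="utf-8"))
        # ASSUMPTION: the file holds either a bare string or {"api_key": "..."}.
        if isinstance(data, str):
            return data
        if isinstance(data, dict):
            return data.get("api_key")
    return None


if __name__ == "__main__":
    print("key found" if load_api_key() else "no key configured")
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.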

Usage

Start a single-agent simulation (enter a task, e.g., "Make a tomato and lettuce salad and deliver it."):

python3 main.py --num-agents 1 --level partial-divider_salad --gpt

Start a multi-agent simulation:

python3 main.py --num-agents 2 --level partial-divider_salad --gpt

Manually control agents with the arrow keys (switch between agents by pressing 1 or 2):

python3 main.py --num-agents 2 --level partial-divider_salad --gpt --manual

ToDo

  • Allow simultaneous/parallel subtask execution by multiple agents within the same timestep
  • Prevent agents from moving through other agents (make them avoid or wait for others)
  • Evaluate with 3 or more agents

References

  1. Wu et al., "Too many cooks: Bayesian inference for coordinating multi-agent collaboration," 2020.
  2. Yang et al., "Foundation Models for Decision Making: Problems, Methods, and Opportunities," 2023.
  3. Lin et al., "Text2Motion: From Natural Language Instructions to Feasible Plans," 2023.
  4. Carroll et al., "On the Utility of Learning about Humans for Human-AI Coordination," 2020.
  5. Hong et al., "Learning to Influence Human Behavior with Offline Reinforcement Learning," 2023.
  6. Moghaddam & Honey, "Boosting Theory-of-Mind Performance in Large Language Models via Prompting," 2023.
  7. Gao et al., "PAL: Program-aided Language Models," 2022.
