
GenSim: Generating Robotic Simulation Tasks via Large Language Models

Lirui Wang, Yiyang Ling, Zhecheng Yuan, Mohit Shridhar, Chen Bao, Yuzhe Qin, Bailin Wang, Huazhe Xu, Xiaolong Wang

Project Page | Arxiv | Gradio Demo | Huggingface Dataset

This repo explores the use of an LLM code-generation pipeline to write simulation environments and expert goals, augmenting a diverse set of simulation tasks.

⚙️ Installation

  1. pip install -r requirements.txt
  2. python setup.py develop
  3. export GENSIM_ROOT=$(pwd)
  4. export OPENAI_KEY=YOUR_KEY. We use OpenAI's GPT-4 as the language model, so you need an OpenAI API key to run task generation with GenSim; you can create one in your OpenAI account (a sketch of how these variables are consumed follows this list).
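
The snippet below is not part of the repo; it is a minimal sketch of how the two exported variables would typically be consumed, assuming GenSim reads GENSIM_ROOT and OPENAI_KEY straight from the environment (the names come from steps 3-4) and uses the 2023-era openai Python SDK.

```python
# Minimal sketch (assumptions noted above), not GenSim's actual code.
import os

import openai  # pre-1.0 openai SDK, where the key is set module-wide

GENSIM_ROOT = os.environ["GENSIM_ROOT"]    # repo root exported in step 3
openai.api_key = os.environ["OPENAI_KEY"]  # API key exported in step 4

print(f"GenSim root: {GENSIM_ROOT}")
```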

🚶 Getting Started

After the installation process, you can run the commands below (a small Python wrapper for them is sketched afterwards):

# basic bottom-up prompt
python gensim/run_simulation.py disp=True prompt_folder=vanilla_task_generation_prompt_simple

# bottom-up template generation
python gensim/run_simulation.py disp=True prompt_folder=bottomup_task_generation_prompt save_memory=True load_memory=True task_description_candidate_num=10 use_template=True

# top-down task generation
python gensim/run_simulation.py disp=True prompt_folder=topdown_task_generation_prompt save_memory=True load_memory=True task_description_candidate_num=10 use_template=True target_task_name="build-house"

# task-conditioned chain-of-thought generation
python gensim/run_simulation.py disp=True prompt_folder=topdown_chain_of_thought_prompt save_memory=True load_memory=True task_description_candidate_num=10 use_template=True target_task_name="build-car"
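
If you prefer to drive these runs from Python, the helper below is a hypothetical wrapper (not part of the repo); the override names are taken verbatim from the commands above and are simply forwarded as Hydra-style key=value arguments.

```python
# Hypothetical wrapper around gensim/run_simulation.py (not in the repo).
import subprocess

def run_gensim(prompt_folder: str, **overrides) -> None:
    """Launch run_simulation.py with key=value overrides, as in the commands above."""
    cmd = ["python", "gensim/run_simulation.py", f"prompt_folder={prompt_folder}"]
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    subprocess.run(cmd, check=True)

# Top-down generation targeting a specific task, mirroring the third command above.
run_gensim(
    "topdown_task_generation_prompt",
    disp=True,
    save_memory=True,
    load_memory=True,
    task_description_candidate_num=10,
    use_template=True,
    target_task_name="build-house",
)
```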

💾 Add and remove tasks

  1. To remove a task (delete its code and remove it from the task and task-code buffers), use python misc/purge_task.py -f color-sequenced-block-insertion
  2. To add a task (extract its task description and add it to the buffer), use python misc/add_task_from_code.py -f ball_on_box_on_container (a round-trip example follows this list)
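
For example, a possible round trip after adding a task back: the add command is copied from step 2, while the dash-separated task name passed to demos.py is an assumption about how the file name maps to a registered task name.

```python
# Hypothetical round-trip check (not in the repo).
import subprocess

# Re-add the example task from its code file (command from step 2 above).
subprocess.run(
    ["python", "misc/add_task_from_code.py", "-f", "ball_on_box_on_container"],
    check=True,
)

# Sanity check: generate a handful of demos for it (task name is an assumption).
subprocess.run(
    ["python", "cliport/demos.py", "n=5", "task=ball-on-box-on-container", "disp=True"],
    check=True,
)
```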

🤖 LLM Generated Task Usage

  1. All generated tasks in cliport/generated_tasks should be imported automatically.
  2. Set the task name and then use demos.py for visualization. For instance, python cliport/demos.py n=200 task=build-car mode=test disp=True.
  3. The following is a guide for training everything from scratch (more details in cliport). All tasks follow a 4-phase workflow (a driver sketch follows this list):
    1. Generate train, val, and test datasets with demos.py
    2. Train agents with train.py
    3. Run validation with eval.py to find the best checkpoint on val tasks and save *val-results.json
    4. Evaluate the best checkpoint in *val-results.json on test tasks with eval.py
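
The end-to-end driver below is hypothetical and only sketches the four phases for one task; the demos.py flags match those used elsewhere in this README, while the train.py and eval.py flag names are assumptions, so consult the cliport documentation for the exact ones.

```python
# Hypothetical 4-phase driver for a single generated task (not in the repo).
import subprocess

TASK = "build-car"  # any generated task name from cliport/generated_tasks

def run(cmd: str) -> None:
    subprocess.run(cmd.split(), check=True)

# Phase 1: generate train/val/test datasets (demos.py flags as used in this README).
for mode in ("train", "val", "test"):
    run(f"python cliport/demos.py n=200 task={TASK} mode={mode}")

# Phase 2: train an agent (flag names assumed; see the cliport docs).
run(f"python cliport/train.py train.task={TASK}")

# Phase 3: validate to pick the best checkpoint, recorded in *val-results.json (flags assumed).
run(f"python cliport/eval.py eval_task={TASK} mode=val")

# Phase 4: evaluate that best checkpoint on the test split (flags assumed).
run(f"python cliport/eval.py eval_task={TASK} mode=test")
```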

🎛️ LLM Finetune

  1. Prepare data using python gensim/prepare_finetune_gpt.py (a sketch of the data format follows this list). The released dataset is available via the Huggingface Dataset link above.

  2. Finetune using the OpenAI CLI: openai api fine_tunes.create --training_file output/finetune_data_prepared.jsonl --model davinci --suffix 'GenSim'

  3. Evaluate the finetuned model with python gensim/evaluate_finetune_model.py +target_task=build-car +target_model=davinci:ft-mit-cal:gensim-2023-08-06-16-00-56

  4. Compare with python gensim/run_simulation.py disp=True prompt_folder=topdown_task_generation_prompt_simple load_memory=True task_description_candidate_num=10 use_template=True target_task_name="build-house" gpt_model=gpt-3.5-turbo-16k trials=3

  5. Compare with python gensim/run_simulation.py disp=True prompt_folder=topdown_task_generation_prompt_simple_singleprompt load_memory=True task_description_candidate_num=10 target_task_name="build-house" gpt_model=gpt-3.5-turbo-16k

  6. Evaluate finetuned GPT-3.5-turbo models with python gensim/evaluate_finetune_model.py +target_task=build-car +target_model=ft:gpt-3.5-turbo-0613: trials=3 disp=True

  7. Code Llama can also be finetuned with the Hugging Face Transformers library.

  8. Offline evaluation: python -m gensim.evaluate_finetune_model_offline model_output_dir=after_finetune_CodeLlama-13b-Instruct-hf_fewshot_False_epoch_10_0
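
The legacy fine_tunes.create endpoint used in step 2 expects a JSONL file of prompt/completion pairs; the record below only illustrates that format, and the concrete prompts and completions that gensim/prepare_finetune_gpt.py actually writes into output/finetune_data_prepared.jsonl will differ.

```python
# Illustrative JSONL record for the legacy OpenAI fine-tuning format
# (contents are placeholders, not the repo's actual data).
import json

examples = [
    {
        "prompt": "Task description: build a car from blocks.\nWrite the task code.\n\n###\n\n",
        "completion": " class BuildCar(Task):\n    ...\n END",
    },
]

with open("output/finetune_data_prepared.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```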

✅ Notes

  1. Temperature 0.5-0.8 is a good range for diversity; 0.0-0.2 gives more stable results.
  2. The generation pipeline prints statistics on compilation, runtime, task design, and diversity scores. Note that these metrics depend on the complexity of the tasks the LLM tries to generate.
  3. Core prompting and code-generation scripts are in gensim; training and task scripts are in cliport.
  4. The prompts/ folder stores different kinds of prompts for obtaining the desired environments. Each folder contains a sequence of prompts as well as a meta_data file. prompts/data stores the base task library and the generated task library.
  5. The GPT-generated tasks are stored in generated_tasks/. Use demos.py to play with them. cliport/demos_gpt4.py is an all-in-one prompt script that can be converted into an IPython notebook.
  6. Raw text outputs are saved in output/output_stats, figure results in output/output_figures, and policy evaluation results in output/cliport_output.
  7. To debug generated code, manually copy-paste generated_task.py, then run python cliport/demos.py n=50 task=gen-task disp=True (a copy-and-run sketch follows this list).
  8. This version of cliport should support batch size > 1 and can run with more recent versions of PyTorch and PyTorch Lightning.
  9. Please use the GitHub issue tracker to report bugs. For other questions, please contact Lirui Wang.
  10. Blender rendering: python cliport/demos.py n=310 task=align-box-corner mode=test disp=True +record.blender_render=True record.save_video=True
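
As a concrete version of note 7, the snippet below is hypothetical: the source path of the generated file is a placeholder, the destination directory comes from the "LLM Generated Task Usage" section, and the demos.py command is copied from note 7.

```python
# Hypothetical debug loop for a generated task (not in the repo).
import shutil
import subprocess

# Source path is a placeholder for wherever the generated code was saved.
shutil.copy("generated_task.py", "cliport/generated_tasks/generated_task.py")

# Exercise the task with the command from note 7.
subprocess.run(
    ["python", "cliport/demos.py", "n=50", "task=gen-task", "disp=True"],
    check=True,
)
```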

Citation

If you find GenSim useful in your research, please consider citing:

@inproceedings{wang2023gen,
author    = {Lirui Wang and Yiyang Ling and Zhecheng Yuan and Mohit Shridhar and Chen Bao and Yuzhe Qin and Bailin Wang and Huazhe Xu and Xiaolong Wang},
title     = {GenSim: Generating Robotic Simulation Tasks via Large Language Models},
booktitle = {arXiv},
year      = {2023}
}
