uni-o4's Introduction

Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization

ICLR 2024

Kun Lei · Zhengmao He* · Chenhao Lu* · Kaizhe Hu · Yang Gao · Huazhe Xu

Code Overview

We evaluate Uni-O4 on standard D4RL benchmarks during offline and online fine-tuning phases. In addition, we utilize Uni-O4 to enable rapid adaptation of our quadrupedal robot dog to new and challenging environments. This repo contains five branches:

  • master (default) -> Uni-O4
  • go1_sdk -> SDK set-up for the go1 robot
  • data_collecting_deployment -> Deploying go1 in the real world for data collection
  • unio4-offline-robot -> Run Uni-O4 on the dataset collected by the real-world robot dog
  • go1-online-finetuning -> Fine-tuning the robot online in the real world

Clone each branch: git clone -b [Branch Name] https://github.com/Lei-Kun/Uni-O4.git
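
As a convenience, the clone command above can be repeated for every branch with a small script. The following is a minimal sketch, not part of the repo: it assumes each branch is cloned into a directory named after the branch (which matches paths such as ./go1_sdk and ./unio4-offline-robot used later), and the helper names `clone_command` and `clone_all` are hypothetical.

```python
import subprocess
from typing import List

REPO_URL = "https://github.com/Lei-Kun/Uni-O4.git"

# Branch names as listed in the overview above.
BRANCHES = [
    "master",
    "go1_sdk",
    "data_collecting_deployment",
    "unio4-offline-robot",
    "go1-online-finetuning",
]


def clone_command(branch: str) -> List[str]:
    """Build `git clone -b <branch> <url> <dir>`.

    Assumption: the target directory is named after the branch.
    """
    return ["git", "clone", "-b", branch, REPO_URL, branch]


def clone_all(dry_run: bool = True) -> None:
    """Print the clone commands, or run them when dry_run is False."""
    for branch in BRANCHES:
        cmd = clone_command(branch)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    clone_all(dry_run=True)
```

Running it with `dry_run=True` only prints the commands, so they can be reviewed before anything is downloaded.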

For D4RL benchmarks

Requirements

  • torch 1.12.0
  • mujoco 2.2.1
  • mujoco-py 2.1.2.14
  • d4rl 1.1

To install all the required dependencies:

  1. Install MuJoCo from here.
  2. Install the Python packages listed in requirements.txt using pip install -r requirements.txt. Set the mujoco-py version in requirements.txt to match the version of the MuJoCo engine you have installed.
  3. Manually download and install d4rl package from here.
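
After installation, it can help to compare the installed package versions against the Requirements list. The sketch below is one way to do that with the standard library; it assumes the pip distribution names are exactly those in the list (torch, mujoco, mujoco-py, d4rl), and the helpers `find_mismatches` and `installed_versions` are hypothetical. A reported mujoco-py mismatch is not necessarily an error, since that version depends on the MuJoCo engine you installed.

```python
from importlib import metadata

# Pinned versions from the Requirements list above.
REQUIRED = {
    "torch": "1.12.0",
    "mujoco": "2.2.1",
    "mujoco-py": "2.1.2.14",
    "d4rl": "1.1",
}


def find_mismatches(required, installed):
    """Return {package: (wanted, found)} for missing or differing packages."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        # Treat local build suffixes such as "1.12.0+cu113" as matching "1.12.0".
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Look up installed versions; None means the package was not found."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = find_mismatches(REQUIRED, installed_versions(REQUIRED))
    for name, (wanted, found) in report.items():
        print(f"{name}: expected {wanted}, found {found}")
```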

Running the code

  • main.py: trains the network, storing checkpoints along the way. Set-up for other domains is coming soon.
  • Example - for offline pre-training:
./scripts/mujoco_loco/hm.sh
  • Example - for online fine-tuning:
./ppo_finetune/scripts/mujoco_loco/hm.sh

Real-world tasks set-up

See INSTALL.md for installation instructions.

For real-world adaptation tasks involving quadrupedal robots, our approach has three steps. First, we pre-train a policy in a simulator, which takes several minutes. Then we fine-tune the policy in the real-world environment, both offline and online, using the Uni-O4 algorithm.

  1. Pretraining in Isaac Gym:
cd ./unio4-offline-robot
pip install -e .
cd ./scripts
python train.py
  2. Fine-tuning by Uni-O4 offline - collecting data (build the SDK following INSTALL.md):

1) Start up the go1 SDK:

cd ./go1_sdk/build
./lcm_position

2) Run:

cd ./data_collecting_deployment
pip install -e .
cd ./data_collecting_deployment/go1_gym_deploy/scripts
python deploy_policy --deploy_policy 'sim'
'sim' -> pretrained policy in simulator
'offline' -> offline fine-tuned policy in real-world
'online' -> online fine-tuned policy in real-world
  3. Fine-tuning by Uni-O4 offline - run Uni-O4 on the collected dataset:
Copy the dataset to unio4-offline-robot, then run:
cd ./unio4-offline-robot
./run.sh
  4. Fine-tuning by PPO online:
cd ./go1_sdk/build
./lcm_position
cd ./go1-online-finetuning
python off2on.py
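
The three values accepted by --deploy_policy in the data-collection step select which policy runs on the robot. The sketch below illustrates that mapping with argparse; it is a hypothetical stand-in and not the repo's deployment script, and the default value and the helper names `build_parser` and `describe` are assumptions.

```python
import argparse

# Meaning of each mode, as described in the deployment step above.
POLICY_MODES = {
    "sim": "pretrained policy in simulator",
    "offline": "offline fine-tuned policy in real-world",
    "online": "online fine-tuned policy in real-world",
}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented --deploy_policy flag."""
    parser = argparse.ArgumentParser(
        description="Sketch of the deploy_policy options (not the real script)."
    )
    parser.add_argument(
        "--deploy_policy",
        choices=sorted(POLICY_MODES),
        default="sim",  # assumption: the real script's default is not documented
        help="which policy to deploy on the robot",
    )
    return parser


def describe(argv) -> str:
    """Parse an explicit argument list and describe the selected policy."""
    args = build_parser().parse_args(argv)
    return f"{args.deploy_policy}: {POLICY_MODES[args.deploy_policy]}"


if __name__ == "__main__":
    print(describe(["--deploy_policy", "sim"]))
```

Restricting the flag to a fixed set of choices means a mistyped mode fails immediately instead of silently deploying the wrong policy on hardware.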

Citation

If you use Uni-O4, please cite our paper as follows:

@inproceedings{
lei2024unio,
title={Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization},
author={Kun LEI and Zhengmao He and Chenhao Lu and Kaizhe Hu and Yang Gao and Huazhe Xu},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=tbFBh3LMKi}
}

uni-o4's Issues

Config file for the Antmaze transition model

Hello, thank you for your great work and for providing the code!

I wanted to try running the code myself, especially in Antmaze. I found the config files for the transition models for Gym and Adroit, but it seems like the AntMaze config files are missing. Could you possibly provide the config files for the Antmaze environment as well?

Thank you in advance.
