xugw-kevin / drm

DrM, a visual RL algorithm, minimizes the dormant ratio to guide exploration-exploitation trade-offs, achieving significant improvements in sample efficiency and asymptotic performance across diverse domains.

License: MIT License

Python 78.42% Makefile 0.11% Dockerfile 0.02% Shell 0.05% Jupyter Notebook 21.39%

drm's Introduction

DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization

[Paper][Project Website]

This repository is the official PyTorch implementation of DrM. DrM is a visual reinforcement learning algorithm that minimizes the dormant ratio to guide the exploration-exploitation trade-off, achieving significant gains in sample efficiency and asymptotic performance on the hardest locomotion and manipulation tasks.
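
For readers unfamiliar with the term, the following is a minimal, framework-free sketch of how a dormant ratio can be computed. It assumes the definition commonly used in the dormant-neuron literature: a neuron is τ-dormant when its mean absolute activation, normalized by the average over its layer, is at most τ, and the dormant ratio is the fraction of such neurons. The function name `dormant_ratio`, the nested-list input layout, and the threshold value are illustrative assumptions; this is not the repository's implementation.

```python
from typing import Sequence


def dormant_ratio(
    layer_activations: Sequence[Sequence[Sequence[float]]],
    tau: float = 0.0,
) -> float:
    """Fraction of tau-dormant neurons across all given layers.

    layer_activations[l][b][i] is the activation of neuron i in layer l
    for batch sample b (hypothetical layout chosen for this sketch).
    """
    dormant = 0
    total = 0
    for layer in layer_activations:
        batch_size = len(layer)
        num_neurons = len(layer[0])
        # Mean absolute activation of each neuron over the batch.
        mean_abs = [
            sum(abs(sample[i]) for sample in layer) / batch_size
            for i in range(num_neurons)
        ]
        # Average of those values over the layer, used for normalization.
        layer_mean = sum(mean_abs) / num_neurons
        for value in mean_abs:
            # Normalized score; an all-zero layer counts as fully dormant.
            score = value / layer_mean if layer_mean > 0 else 0.0
            if score <= tau:
                dormant += 1
        total += num_neurons
    return dormant / total


# One layer, batch of two samples, four neurons; neuron 0 never fires.
example_layer = [[0.0, 1.0, 2.0, 1.0], [0.0, 3.0, 2.0, 1.0]]
print(dormant_ratio([example_layer], tau=0.1))
```

In a PyTorch model the same per-layer statistics would typically be collected with forward hooks rather than passed in as lists; that wiring is omitted here.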



🛠️ Installation Instructions

First, create a virtual environment and install all required packages.

sudo apt update
sudo apt install libosmesa6-dev libegl1-mesa libgl1-mesa-glx libglfw3 
conda env create -f conda_env.yml 
conda activate drm
pip3 install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116

Next, install the additional dependencies required for MetaWorld and Adroit.

cd metaworld
pip install -e .
cd ..
cd rrl-dependencies
pip install -e .
cd mj_envs
pip install -e .
cd ..
cd mjrl
pip install -e .
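
After the steps above, a quick check that the packages are visible to the active interpreter can save a failed training launch. The sketch below only tests whether each module can be located; it does not import or validate them. The module names are assumptions taken from the directory and package names in the install commands and may differ from the actual import names.

```python
import importlib.util
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be located."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat any lookup failure as "not installed".
            status[name] = False
    return status


if __name__ == "__main__":
    # Assumed import names; adjust if the installed packages differ.
    expected = ["torch", "torchvision", "metaworld", "mj_envs", "mjrl"]
    for name, found in check_modules(expected).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```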

💻 Code Usage

To run DrM on the DeepMind Control Suite, use train_dmc.py and select the task and agent with command-line overrides:

python train_dmc.py task=dog_walk agent=drm

To run DrM on MetaWorld, use train_mw.py or train_mw_sparse.py:

python train_mw.py task=sweep-into agent=drm
python train_mw_sparse.py task=soccer agent=drm

To run DrM on Adroit, use train_adroit.py with the drm_adroit agent:

python train_adroit.py task=pen agent=drm_adroit
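
The four commands above share one pattern: a training script followed by `task=` and `agent=` overrides. As a hedged convenience sketch, the helper below builds those command lines from a table so they can be printed or launched in sequence. The helper name `build_command` and the `RUNS` table are illustrative and not part of the repository; only the scripts, tasks, and agents shown above are used.

```python
import shlex
from typing import List


def build_command(script: str, task: str, agent: str) -> List[str]:
    """Return the argument list for one training run."""
    return ["python", script, f"task={task}", f"agent={agent}"]


# (script, task, agent) triples copied from the usage commands above.
RUNS = [
    ("train_dmc.py", "dog_walk", "drm"),
    ("train_mw.py", "sweep-into", "drm"),
    ("train_mw_sparse.py", "soccer", "drm"),
    ("train_adroit.py", "pen", "drm_adroit"),
]

for script, task, agent in RUNS:
    # Print only; nothing is executed by this sketch.
    print(shlex.join(build_command(script, task, agent)))
```

To actually launch a run, each argument list could be passed to `subprocess.run(cmd, check=True)` from the repository root.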

📝 Citation

If you use our method or code in your research, please consider citing the paper as follows:

@inproceedings{drm,
title={DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization},
author={Guowei Xu and Ruijie Zheng and Yongyuan Liang and Xiyao Wang and Zhecheng Yuan and Tianying Ji and Yu Luo and Xiaoyu Liu and Jiaxin Yuan and Pu Hua and Shuzhen Li and Yanjie Ze and Hal Daumé III and Furong Huang and Huazhe Xu},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=MSe8YFbhUE}
}

🙏 Acknowledgement

DrM is licensed under the MIT license. MuJoCo and DeepMind Control Suite are licensed under the Apache 2.0 license. We would like to thank the DrQ-v2 authors for open-sourcing the DrQ-v2 codebase; our implementation builds on top of their repository.

drm's People

Contributors

cheryyunl, frankzheng2022, xugw-kevin

drm's Issues

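DrM performance issues in Metaworld environments

Dear authors,
While reproducing DrM, we observed its impressive performance on the DMC-hard environments. On many Metaworld environments, however, performance appears poor: after 2M training steps the training success rate is still only about 2/10. This occurs on stick-pull-v2, pick-place-wall-v2, and hammer-v2. Performance is especially poor on disassemble-v2, where the success rate is still 0/10 after 2M training steps.
I ran 5 seeds for each of the cases above to make sure the problem is reproducible, and all runs used the default parameters you provided. Did you encounter this kind of poor performance in your own runs?
Best regards!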

Atari Montezuma's Revenge?

Has this been tested on Montezuma's Revenge or Pitfall!, which are visually hard sparse-reward problems with discrete action spaces?