Bias Resilient Multi-Step Off-Policy Goal-Conditioned Reinforcement Learning

This repository is the official implementation of BR-MHER.

Requirements

To install the requirements:

  • Install the required packages, such as tensorflow, mpi4py and mujoco_py, using pip. In addition, install multiworld from the open-source multi-task benchmark environment repository https://github.com/vitchyr/multiworld.

  • Clone this repository and cd into it.

  • Install the baselines package:

    pip install -e .

  • Install the remaining dependencies:

    pip install -r requirements.txt

Training

To train the models from the paper, set TASK, NSTEP and SEED, then run the command for the corresponding method:

  • Train MHER

    python -m baselines.run --env=${TASK} --num_epoch 50 --num_env 6 --n_step ${NSTEP} --mode nstep --log_path=logs/${TASK}/mher_n${NSTEP}/${SEED} --save_path=models/${TASK}/mher_n${NSTEP}/${SEED} --seed ${SEED} --n_test_rollouts 20 --Q_lr 0.001 --pi_lr 0.001 --tau 0.5 --delta 10 --use_huber True --truncate False --policy_delay 2 --noise_std 0

  • Train MHER($\lambda$)

    python -m baselines.run --env=${TASK} --num_epoch 50 --num_env 6 --n_step ${NSTEP} --mode lambda --lamb 0.7 --log_path=logs/${TASK}/mher_n${NSTEP}/${SEED} --save_path=models/${TASK}/mher_n${NSTEP}/${SEED} --seed ${SEED} --n_test_rollouts 20 --Q_lr 0.001 --pi_lr 0.001 --tau 0.5 --delta 10 --use_huber True --truncate False --policy_delay 2 --noise_std 0

  • Train MMHER

    python -m baselines.run --env=${TASK} --num_epoch 50 --num_env 6 --n_step ${NSTEP} --mode dynamic --alpha 0.5 --log_path=logs/${TASK}/mher_n${NSTEP}/${SEED} --save_path=models/${TASK}/mher_n${NSTEP}/${SEED} --seed ${SEED} --n_test_rollouts 20 --Q_lr 0.001 --pi_lr 0.001 --tau 0.5 --delta 10 --use_huber True --truncate False --policy_delay 2 --noise_std 0

  • Train TMHER

    python -m baselines.run --env=${TASK} --num_epoch 50 --num_env 6 --n_step ${NSTEP} --mode nstep --log_path=logs/${TASK}/mher_n${NSTEP}/${SEED} --save_path=models/${TASK}/mher_n${NSTEP}/${SEED} --seed ${SEED} --n_test_rollouts 20 --Q_lr 0.001 --pi_lr 0.001 --tau 0.5 --delta 10 --use_huber True --truncate True --policy_delay 2 --noise_std 0

  • Train TMHER($\lambda$)

    python -m baselines.run --env=${TASK} --num_epoch 50 --num_env 6 --n_step ${NSTEP} --mode lambda --lamb 0.7 --log_path=logs/${TASK}/mher_n${NSTEP}/${SEED} --save_path=models/${TASK}/mher_n${NSTEP}/${SEED} --seed ${SEED} --n_test_rollouts 20 --Q_lr 0.001 --pi_lr 0.001 --tau 0.5 --delta 10 --use_huber True --truncate True --policy_delay 2 --noise_std 0

  • Train QR-MHER

    python -m baselines.run --env=${TASK} --num_epoch 50 --num_env 6 --n_step ${NSTEP} --mode lambda --lamb 0.7 --log_path=logs/${TASK}/mher_n${NSTEP}/${SEED} --save_path=models/${TASK}/mher_n${NSTEP}/${SEED} --seed ${SEED} --n_test_rollouts 20 --Q_lr 0.001 --pi_lr 0.001 --tau 0.75 --delta 10 --use_huber True --truncate False --policy_delay 2 --noise_std 0

  • Train BR-MHER

    python -m baselines.run --env=${TASK} --num_epoch 50 --num_env 6 --n_step ${NSTEP} --mode lambda --lamb 0.7 --log_path=logs/${TASK}/mher_n${NSTEP}/${SEED} --save_path=models/${TASK}/mher_n${NSTEP}/${SEED} --seed ${SEED} --n_test_rollouts 20 --Q_lr 0.001 --pi_lr 0.001 --tau 0.75 --delta 10 --use_huber True --truncate True --policy_delay 2 --noise_std 0
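The commands above depend on the TASK, NSTEP and SEED variables being set. As a convenience, the BR-MHER command could be assembled per seed with a small helper like the one below. This is only a sketch: the function name br_mher_command and the task name "SomeTask-v0" are hypothetical placeholders (use a name from tasks.txt), and the script prints the commands rather than running them. The flags are copied from the BR-MHER command above.

```python
import shlex


def br_mher_command(task, n_step, seed):
    """Build the BR-MHER training command from this README as an argv list.

    Hypothetical helper, not part of the repository.
    """
    run_dir = f"{task}/mher_n{n_step}/{seed}"
    return [
        "python", "-m", "baselines.run",
        f"--env={task}",
        "--num_epoch", "50", "--num_env", "6",
        "--n_step", str(n_step),
        "--mode", "lambda", "--lamb", "0.7",
        f"--log_path=logs/{run_dir}",
        f"--save_path=models/{run_dir}",
        "--seed", str(seed),
        "--n_test_rollouts", "20",
        "--Q_lr", "0.001", "--pi_lr", "0.001",
        "--tau", "0.75", "--delta", "10",
        "--use_huber", "True", "--truncate", "True",
        "--policy_delay", "2", "--noise_std", "0",
    ]


if __name__ == "__main__":
    # "SomeTask-v0" is a placeholder; replace it with a task from tasks.txt.
    for seed in range(3):
        print(shlex.join(br_mher_command("SomeTask-v0", 3, seed)))
```

To launch the runs instead of printing them, each argv list can be passed to subprocess.run(cmd, check=True).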

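NOTE: --tau sets the quantile level used in the quantile regression. The commands above use 0.5 for MHER, MHER($\lambda$), MMHER, TMHER and TMHER($\lambda$), and 0.75 for QR-MHER and BR-MHER.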
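To illustrate what a quantile level does, the following is a minimal sketch of the standard quantile (asymmetric) Huber loss in plain Python. It is not taken from this repository and the actual implementation may differ. It assumes that --delta is the Huber threshold and that the error is defined as target minus prediction; the function name quantile_huber_loss is hypothetical.

```python
def quantile_huber_loss(td_errors, tau=0.75, delta=10.0):
    """Mean quantile Huber loss over errors u = target - prediction.

    Assumption: delta is the Huber threshold (quadratic below, linear above).
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for u in td_errors:
        abs_u = abs(u)
        # Huber term: quadratic near zero, linear beyond delta.
        if abs_u <= delta:
            huber = 0.5 * u * u
        else:
            huber = delta * (abs_u - 0.5 * delta)
        # Under-estimates (u > 0) are weighted by tau,
        # over-estimates (u <= 0) by 1 - tau.
        weight = tau if u > 0 else 1.0 - tau
        total += weight * huber
    return total / len(td_errors)


if __name__ == "__main__":
    # With tau = 0.5 the loss is symmetric in the sign of the error.
    print(quantile_huber_loss([2.0], tau=0.5), quantile_huber_loss([-2.0], tau=0.5))
    # With tau = 0.75 an under-estimate costs three times an over-estimate.
    print(quantile_huber_loss([2.0], tau=0.75), quantile_huber_loss([-2.0], tau=0.75))
```

Under these assumptions, tau = 0.5 reduces to half the ordinary Huber loss, while tau = 0.75 penalizes under-estimation more heavily and so pulls the fitted value towards an upper quantile of the targets.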

Tasks

All tasks used in the experiments are listed in the file tasks.txt.
