safety_rl's Introduction

Safety and Liveness Guarantees through Reach-Avoid Reinforcement Learning

License Python 3.8

This repository implements model-free reach-avoid reinforcement learning (RARL) to guarantee safety and liveness. It also contains example uses and benchmark evaluations of the algorithm on a range of nonlinear systems. RARL is primarily developed by Kai-Chieh Hsu, a PhD student in the Safe Robotics Lab, and Vicenç Rubies-Royo, a postdoc in the Hybrid Systems Lab.

The repository also serves as the companion code to our RSS 2021 paper, which presents the theoretical properties of the algorithm and the implementation details. All experiments in the paper are included as examples in this repository, and you can replicate the results with the commands listed in Section II below. With minor modifications, you can also replicate the results of the preceding ICRA 2019 paper, which considers the special case of reachability/safety only.

This tool is designed to work with arbitrary reinforcement learning environments. Instead of a single scalar reward signal, it uses two scalar signals: a target margin and a safety margin. To use your own environment, add it under gym_reachability and register it through the standard method in gym. You can refer to some examples provided here. The tool learns the reach-avoid set through trial-and-error interaction with the environment, so it is not in itself a safe learning algorithm. However, it can be combined with an existing safe learning scheme, such as "shielding", to enable learning with safety guarantees (see Script 4 below as well as Section IV.B in the RSS 2021 paper for an example).
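As a rough illustration of the two-signal idea, the sketch below defines a target margin and a safety margin for a hypothetical 2-D point system and scores a trajectory with them. The geometry, the function names, and the sign convention (target margin at most zero inside the target, safety margin above zero inside the obstacle) are assumptions made for this example and are not taken from gym_reachability.

```python
import math

# Hypothetical 2-D point system; all names and geometry here are illustrative
# assumptions, not taken from gym_reachability.
TARGET_CENTER, TARGET_RADIUS = (2.0, 0.0), 0.5
OBSTACLE_CENTER, OBSTACLE_RADIUS = (1.0, 1.0), 0.5


def target_margin(state):
    """Signed distance to the target disc: <= 0 inside the target (assumed convention)."""
    return math.dist(state, TARGET_CENTER) - TARGET_RADIUS


def safety_margin(state):
    """Penetration into the obstacle disc: > 0 inside the obstacle (assumed convention)."""
    return OBSTACLE_RADIUS - math.dist(state, OBSTACLE_CENTER)


def reach_avoid_outcome(trajectory):
    """Return min over t of max(l(x_t), max over s <= t of g(x_s)).

    The value is <= 0 exactly when some state lies in the target and no state
    up to and including that one violates the safety constraint.
    """
    worst_safety = -math.inf
    outcome = math.inf
    for state in trajectory:
        worst_safety = max(worst_safety, safety_margin(state))
        outcome = min(outcome, max(target_margin(state), worst_safety))
    return outcome


safe_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]    # passes below the obstacle
unsafe_path = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]  # passes through the obstacle
print(reach_avoid_outcome(safe_path))
print(reach_avoid_outcome(unsafe_path))
```

Under this convention, a non-positive outcome means the trajectory reached the target without ever entering the obstacle. A single scalar reward would have to fold both conditions into one number, which is what keeping the two margins separate avoids.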

The implementation of tabular Q-learning is adapted from Denny Britz's implementation and the implementation of double deep Q-network and replay memory is adapted from PyTorch's tutorial (by Adam Paszke).

I. Dependencies

If you use Anaconda to manage packages, you can create an identical environment from the specification file with one of the following commands:

conda create --name <myenv> --file doc/spec-mac.txt
conda create --name <myenv> --file doc/spec-linux.txt

Otherwise, you can install the following packages manually:

  1. numpy=1.21.1
  2. pytorch=1.9.0
  3. gym=0.18.0
  4. scipy=1.7.0
  5. matplotlib=3.4.2
  6. box2d-py=2.3.8
  7. shapely=1.7.1
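After a manual install, a quick check like the following sketch can confirm that the installed versions match the pinned ones. It uses only the standard library. The distribution names in the dictionary are assumptions about how pip names these packages (for example "torch" for the conda package "pytorch"), so adjust them to your setup.

```python
from importlib import metadata

# Pinned versions copied from the list above. The keys are the distribution
# names we assume pip uses; they may differ from the conda package names.
PINNED = {
    "numpy": "1.21.1",
    "torch": "1.9.0",
    "gym": "0.18.0",
    "scipy": "1.7.0",
    "matplotlib": "3.4.2",
    "box2d-py": "2.3.8",
    "shapely": "1.7.1",
}


def check_versions(pinned):
    """Map each package to (installed version or None, True if it equals the pin)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions(PINNED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:12s} wanted={PINNED[name]:8s} installed={installed} {status}")
```

The comparison is an exact string match, so a build with a local suffix (such as a CUDA tag appended to the version) is reported as a mismatch even if the base version agrees.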

II. Replicating the results in the RSS 2021 paper

Each script automatically generates a folder under experiments/ containing visualizations of the training process and the weights of the trained model. In addition, each script generates a train.pkl file, which contains the following:

  • training loss
  • training accuracy
  • trajectory rollout outcome starting from a grid of states
  • action taken from a grid of states

  1. Lunar lander in Figure 1
    python3 sim_lunar_lander.py -sf
  2. Point object in Figure 2
    python3 sim_naive.py -w -sf -a -g 0.9 -mu 12000000 -cp 600000 -ut 20 -n anneal
  3. Point object in Figure 3
    python3 sim_naive.py -sf -g 0.9999 -n 9999
  4. Point object in Figure 4
    python3 sim_show.py -sf -g 0.9999 -n 9999
  5. Dubins car in Figure 5
    python3 sim_car_one.py -sf -w -wi 5000 -g 0.9999 -n 9999
  6. Dubins car (attack-defense game) in Figure 7 (Section IV.D)
    python3 sim_car_pe.py -sf -w -wi 30000 -g 0.9999 -n 9999

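Once a script has finished, the train.pkl file it writes can be inspected from Python. The sketch below assumes only that the file is a standard pickle. It does not assume any particular layout, so it simply reports the type of each top-level entry. The path experiments/my_run/train.pkl and the helper name summarize_pickle are hypothetical.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and describe its top-level structure without assuming a schema."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    if isinstance(data, dict):
        return {str(key): type(value).__name__ for key, value in data.items()}
    if isinstance(data, (list, tuple)):
        return {f"[{i}]": type(item).__name__ for i, item in enumerate(data)}
    return {"<object>": type(data).__name__}


if __name__ == "__main__":
    # Hypothetical location; substitute the folder your run created under experiments/.
    result_file = Path("experiments") / "my_run" / "train.pkl"
    if result_file.exists():
        for name, kind in summarize_pickle(result_file).items():
            print(f"{name}: {kind}")
```

Unpickling executes code stored in the file, so only load files produced by your own runs. It may also need the same packages (for example numpy or pytorch) to be importable as when the file was written.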
Paper Citation

If you use this code or find it helpful, please consider citing the companion RSS 2021 paper as:

@INPROCEEDINGS{hsu2021safety,
    AUTHOR    = {Kai-Chieh Hsu$^*$ and Vicenç Rubies-Royo$^*$ and Claire J. Tomlin and Jaime F. Fisac},
    TITLE     = {Safety and Liveness Guarantees through Reach-Avoid Reinforcement Learning},
    BOOKTITLE = {Proceedings of Robotics: Science and Systems},
    YEAR      = {2021},
    ADDRESS   = {Virtual},
    MONTH     = {July},
    DOI       = {10.15607/RSS.2021.XVII.077}
}
