opendilab / awesome-exploration-rl


A curated list of awesome exploration RL resources (continually updated)

License: Apache License 2.0

exploration-exploitation reinforcement-learning awesome-list hard-exploration awesome delayed-rewards exploration exploratory reinforcement-learning-algorithms sparse-reward-algorithms


Awesome Exploration Methods in Reinforcement Learning

Updated on 2024.06.12

  • Here is a collection of research papers on exploration methods in Reinforcement Learning (ERL). The repository will be continuously updated to track the frontier of ERL. Welcome to follow and star!

  • The balance between exploration and exploitation is one of the most central problems in reinforcement learning. To give readers an intuitive feel for exploration, we show a typical hard-exploration environment from MiniGrid below. In this task, reaching the goal often requires dozens or even hundreds of steps, and the agent must sufficiently explore different parts of the state-action space to learn the skills the goal demands.

A typical hard-exploration environment: MiniGrid-ObstructedMaze-Full-v0.


A Taxonomy of Exploration RL Methods


In general, we can divide the reinforcement learning process into two phases: the collect phase and the train phase. In the collect phase, the agent chooses actions according to its current policy and interacts with the environment to gather useful experience. In the train phase, the agent uses the collected experience to update the policy and obtain a better-performing one.
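The two-phase loop above can be sketched in a few lines. This is a minimal, hypothetical illustration (a tabular Q-learning agent on a toy five-state chain, not code from any repository listed here):

```python
import random

# Hypothetical toy chain: states 0..4, action 1 moves right, action 0 moves left.
# Reward 1.0 is given for arriving at the rightmost state.
N_STATES, N_ACTIONS = 5, 2

def env_step(state, action):
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def collect_phase(policy, num_steps):
    """Collect phase: act with the current policy and store transitions."""
    buffer, state = [], 0
    for _ in range(num_steps):
        action = policy(state)
        next_state, reward = env_step(state, action)
        buffer.append((state, action, reward, next_state))
        state = next_state
    return buffer

def train_phase(q_table, buffer, lr=0.1, gamma=0.99):
    """Train phase: one-step Q-learning update from the collected experience."""
    for state, action, reward, next_state in buffer:
        target = reward + gamma * max(q_table[next_state])
        q_table[state][action] += lr * (target - q_table[state][action])

random.seed(0)
q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
for _ in range(200):  # alternate the collect and train phases
    batch = collect_phase(lambda s: random.randrange(N_ACTIONS), num_steps=20)
    train_phase(q, batch)
```

A purely random collection policy still stumbles onto the reward in this tiny chain; the hard-exploration settings this list focuses on are precisely those where such naive collection fails.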

According to the phase in which the exploration component is explicitly applied, we divide Exploration RL methods into two main categories: Augmented Collecting Strategy and Augmented Training Strategy:

  • Augmented Collecting Strategy covers the exploration strategies commonly applied in the collect phase, which we further divide into four categories:

    • Action Selection Perturbation
    • Action Selection Guidance
    • State Selection Guidance
    • Parameter Space Perturbation
  • Augmented Training Strategy covers the exploration strategies commonly applied in the train phase, which we further divide into seven categories:

    • Count Based
    • Prediction Based
    • Information Theory Based
    • Entropy Augmented
    • Bayesian Posterior Based
    • Goal Based
    • (Expert) Demo Data

Note that these categories may overlap, and an algorithm may belong to several of them. For more detailed surveys on exploration methods in RL, you can refer to Tianpei Yang et al. and Susan Amin et al.
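As a concrete instance of the simplest Action Selection Perturbation strategy, here is an ε-greedy sketch (a generic textbook technique, not taken from any single paper above): with probability ε the agent acts randomly, otherwise it acts greedily with respect to its value estimates.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """Action Selection Perturbation: random action with prob. epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniform random action
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

random.seed(1)
# With epsilon=0.3, the greedy action (index 1) is still picked about 80% of the time.
actions = [epsilon_greedy([0.2, 0.9, 0.1], epsilon=0.3) for _ in range(1000)]
```

Even a small ε keeps every action being sampled forever, which is the baseline behavior many of the methods in this list improve upon in hard-exploration tasks.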


A non-exhaustive but useful taxonomy of methods in Exploration RL. We provide example methods for each category, shown in the blue areas above.

Here are the links to the papers that appeared in the taxonomy:

[1] Go-Explore: Adrien Ecoffet et al., 2021
[2] NoisyNet: Meire Fortunato et al., 2018
[3] DQN-PixelCNN: Marc G. Bellemare et al., 2016
[4] #Exploration: Haoran Tang et al., 2017
[5] EX2: Justin Fu et al., 2017
[6] ICM: Deepak Pathak et al., 2018
[7] RND: Yuri Burda et al., 2018
[8] NGU: Adrià Puigdomènech Badia et al., 2020
[9] Agent57: Adrià Puigdomènech Badia et al., 2020
[10] VIME: Rein Houthooft et al., 2016
[11] EMI: Wang et al., 2019
[12] DIAYN: Benjamin Eysenbach et al., 2019
[13] SAC: Tuomas Haarnoja et al., 2018
[14] BootstrappedDQN: Ian Osband et al., 2016
[15] PSRL: Ian Osband et al., 2013
[16] HER: Marcin Andrychowicz et al., 2017
[17] DQfD: Todd Hester et al., 2018
[18] R2D3: Caglar Gulcehre et al., 2019
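To make the Count Based branch of the taxonomy concrete, here is a simplified tabular count bonus in the spirit of #Exploration [4] (the real method hashes continuous states before counting; the hashing is omitted here): the intrinsic reward β/√N(s) is added to the environment reward and decays as a state becomes familiar.

```python
import math
from collections import defaultdict

class CountBonus:
    """Count-based intrinsic reward: beta / sqrt(N(s)), shrinking with each visit."""

    def __init__(self, beta=0.5):
        self.beta = beta
        self.counts = defaultdict(int)  # visit counter N(s) per state

    def reward(self, state):
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])

bonus = CountBonus(beta=0.5)
first = bonus.reward("s0")       # novel state -> large bonus (0.5 / sqrt(1))
for _ in range(98):
    bonus.reward("s0")
hundredth = bonus.reward("s0")   # familiar state -> small bonus (0.5 / sqrt(100))
```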

Papers

format:
- [title](paper link) (presentation type, openreview score [if the score is public])
  - author1, author2, author3, ...
  - Key: key problems and insights
  - ExpEnv: experiment environments

ICLR 2024


NeurIPS 2023


ICML 2023


ICLR 2023


NeurIPS 2022


ICML 2022


ICLR 2022


NeurIPS 2021


Classic Exploration RL Papers


Contributing

Our purpose is to provide a starting paper guide for those who are interested in exploration methods in RL. If you are interested in contributing, please refer to HERE for contribution instructions.

License

Awesome Exploration RL is released under the Apache 2.0 license.

