whitemech / temprl

Reinforcement Learning framework for Temporal Goals

Home Page: https://whitemech.github.io/temprl

License: GNU Lesser General Public License v3.0

Makefile 6.54% Python 93.46%
reinforcement-learning temporal-goals temporal-logic temporal-constraints automata

temprl

Framework for Reinforcement Learning with Temporal Goals defined by LTLf/LDLf formulas.

Status: development.

Install

Install the package:

  • from PyPI:

      pip3 install temprl
    
  • with pip from GitHub:

      pip3 install git+https://github.com/whitemech/temprl.git
    
  • or, clone the repository and install:

      git clone https://github.com/whitemech/temprl.git
      cd temprl
      pip install .
    

Tests

To run tests: tox

To run only the code tests: tox -e py3.7

To run only the linters:

  • tox -e flake8
  • tox -e mypy
  • tox -e black-check
  • tox -e isort-check

Please look at the tox.ini file for the full list of supported commands.

Docs

To build the docs: mkdocs build

To view documentation in a browser: mkdocs serve and then go to http://localhost:8000

License

temprl is released under the GNU Lesser General Public License v3.0 or later (LGPLv3+).

Copyright 2020-2022 Marco Favorito

Authors

Contributors: cipollone, gallorob, marcofavorito

temprl's Issues

Generate model from discrete env

Is your feature request related to a problem? Please describe.

In case the wrapped environment is DiscreteEnv, the wrapper forgets the model.

Describe the solution you'd like
Make TemporalGoalWrapper able to detect if the wrapped environment is an instance of DiscreteEnv, and in that case, extend the model such that the automata transitions are included.
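One way to picture the requested extension is a product construction between the environment model and the automaton. The sketch below is not temprl code: it assumes the model has the shape `P[s][a] = [(prob, next_state, reward, done), ...]`, that the automaton is a plain dict `delta[q][symbol]`, and that a hypothetical `label` function maps a transition to a symbol. Rewards coming from the automaton are left out.

```python
def product_model(P, delta, label):
    """Combine an environment model with DFA transitions.

    P[s][a]      -> list of (prob, next_state, reward, done)   (assumed layout)
    delta[q][x]  -> next automaton state for symbol x          (assumed layout)
    label(s, a, s2) -> symbol read by the automaton            (hypothetical)
    """
    product = {}
    for s in P:
        for q in delta:
            product[(s, q)] = {}
            for a, outcomes in P[s].items():
                # Each outcome moves the environment and the automaton together.
                product[(s, q)][a] = [
                    (p, (s2, delta[q][label(s, a, s2)]), r, done)
                    for (p, s2, r, done) in outcomes
                ]
    return product


# Tiny example: two environment states, two automaton states.
P = {0: {0: [(1.0, 1, 0.0, False)]}, 1: {0: [(1.0, 1, 0.0, True)]}}
delta = {0: {"goal": 1, "none": 0}, 1: {"goal": 1, "none": 1}}
model = product_model(P, delta, lambda s, a, s2: "goal" if s2 == 1 else "none")
```

Every product state is a pair (environment state, automaton state), so the extended model has `len(P) * len(delta)` states.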

Describe alternatives you've considered
n/a

Additional context
n/a

Let the `feature_extractor` and `extract_fluents` callables be resettable.

Is your feature request related to a problem? Please describe.

TemporalGoalWrapper.feature_extractor and TemporalGoal.extract_fluents are of type callable.

It is possible to create an extractor that depends on more than one state by keeping a memory of past states:

class my_feature_extractor:

    def __init__(self, *args, **kwargs):
        ...

    # this method makes instances of the class callable
    def __call__(self, obs, action):
        ...

wrapper = TemporalGoalWrapper(env, feature_extractor=my_feature_extractor(), ...)

However, the state of the extractor is then kept across episodes.

Describe the solution you'd like

TemporalGoalWrapper should call the reset() method of feature_extractor and of every extract_fluents when an episode starts. Callables that do not have a reset() method should simply be skipped.
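A minimal sketch of the "reset if available" behaviour, independent of temprl itself. The helper names `reset_if_possible` and `reset_all` and the `LastObsExtractor` class are hypothetical and only illustrate the idea.

```python
from typing import Any, Callable, Iterable


def reset_if_possible(fn: Callable[..., Any]) -> bool:
    """Call fn.reset() if it exists and is callable; report whether it ran."""
    reset = getattr(fn, "reset", None)
    if callable(reset):
        reset()
        return True
    return False


def reset_all(callables: Iterable[Callable[..., Any]]) -> int:
    """Reset every callable that supports it; return how many were reset."""
    return sum(reset_if_possible(fn) for fn in callables)


class LastObsExtractor:
    """Hypothetical stateful extractor that remembers the previous observation."""

    def __init__(self) -> None:
        self.previous = None

    def __call__(self, obs, action):
        features = (self.previous, obs)
        self.previous = obs
        return features

    def reset(self) -> None:
        self.previous = None
```

A wrapper could call `reset_all([feature_extractor, *fluent_extractors])` from its own `reset()`, so plain functions and lambdas keep working unchanged.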

Describe alternatives you've considered

Additional context

Purpose of sink state

Subject of the issue

The RewardDFA adds a sink state. Is this intended behaviour? Some formulae can never fail. Also, if a sink state exists, it is already included in the automaton.

If we want the ability to distinguish such a state, we could detect a sink by traversing the graph.
What do you think?
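As a rough illustration of detecting a sink by traversal, the sketch below treats the DFA as a plain dict of transitions (an assumption, not the RewardDFA data structure) and reports the states from which no accepting state can be reached.

```python
from typing import Dict, Hashable, Set

Transitions = Dict[Hashable, Dict[str, Hashable]]


def find_sinks(transitions: Transitions, accepting: Set[Hashable]) -> Set[Hashable]:
    """Return the states from which no accepting state is reachable."""
    # Build the reverse graph: for each state, the states that lead into it.
    predecessors: Dict[Hashable, Set[Hashable]] = {s: set() for s in transitions}
    for source, edges in transitions.items():
        for target in edges.values():
            predecessors.setdefault(target, set()).add(source)

    # Backward search starting from the accepting states.
    can_accept = set(accepting)
    frontier = list(accepting)
    while frontier:
        state = frontier.pop()
        for pred in predecessors.get(state, ()):
            if pred not in can_accept:
                can_accept.add(pred)
                frontier.append(pred)

    return set(predecessors) - can_accept


# State 2 loops on itself forever and is not accepting, so it is a sink.
dfa = {0: {"a": 1, "b": 2}, 1: {"a": 1, "b": 1}, 2: {"a": 2, "b": 2}}
sinks = find_sinks(dfa, accepting={1})
```

With this check, a formula that can never fail simply yields an empty set, and no extra state has to be added.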

Implement deterministic serialization of temporal goals.

Is your feature request related to a problem? Please describe.

Across different runs, the DFA generated by the formula is not the same. That changes the observation space the agent learns on.

Describe the solution you'd like

A way to recover the previous DFA by implementing a proper serialization method.
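One possible approach, sketched under the assumption that the DFA can be exposed as an initial state, a dict of transitions, and a set of accepting states: renumber the states in breadth-first order over sorted symbols, then dump the result as JSON. The function name `canonical_json` is hypothetical, and only states reachable from the initial state are serialized.

```python
import json
from collections import deque


def canonical_json(initial, transitions, accepting) -> str:
    """Renumber states in BFS order over sorted symbols, then dump as JSON."""
    index = {initial: 0}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for symbol in sorted(transitions.get(state, {})):
            target = transitions[state][symbol]
            if target not in index:
                index[target] = len(index)
                queue.append(target)

    canonical = {
        "initial": 0,
        "accepting": sorted(index[s] for s in accepting if s in index),
        "transitions": {
            str(index[s]): {
                sym: index[t] for sym, t in transitions.get(s, {}).items()
            }
            for s in index
        },
    }
    return json.dumps(canonical, sort_keys=True)


# Two DFAs that differ only in state names and insertion order.
first = canonical_json("x", {"x": {"a": "y", "b": "x"}, "y": {"a": "y", "b": "x"}}, {"y"})
second = canonical_json(7, {7: {"b": 7, "a": 3}, 3: {"b": 7, "a": 3}}, {3})
```

Because the numbering depends only on the structure of the automaton, two runs that produce the same DFA up to state renaming serialize to the same string, which can be stored and reloaded.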

Describe alternatives you've considered

Additional context

Use the strategy pattern for reward shaping in `TemporalGoal` class.

Is your feature request related to a problem? Please describe.

The class TemporalGoal hard-codes the reward shaping behaviour, which can only be switched on or off through the reward_shaping flag passed to the constructor.

Describe the solution you'd like

Make the approach more modular and customizable by introducing the RewardShaper class, such that it makes it easier for a developer to change the default behaviour.
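A minimal sketch of what such a strategy interface might look like. Only the name `RewardShaper` comes from the request above; the `shape` method, the two concrete shapers, and the `Goal` stand-in are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class RewardShaper(ABC):
    """Strategy interface: turn an automaton transition into a reward."""

    @abstractmethod
    def shape(self, prev_state, next_state, base_reward: float) -> float:
        ...


class NoShaping(RewardShaper):
    """Default strategy: return the reward unchanged."""

    def shape(self, prev_state, next_state, base_reward):
        return base_reward


class PotentialShaping(RewardShaper):
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""

    def __init__(self, potential, gamma=1.0):
        self.potential = potential
        self.gamma = gamma

    def shape(self, prev_state, next_state, base_reward):
        return (
            base_reward
            + self.gamma * self.potential(next_state)
            - self.potential(prev_state)
        )


class Goal:
    """Hypothetical stand-in for a temporal goal that delegates shaping."""

    def __init__(self, shaper=None):
        self.shaper = shaper or NoShaping()

    def reward(self, prev_state, next_state, base_reward):
        return self.shaper.shape(prev_state, next_state, base_reward)
```

A developer could then change the behaviour by passing a different shaper object instead of editing the goal class.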

Describe alternatives you've considered
None.

Additional context
None.

Call 'extract_fluents' with the episode number and the step number.

Is your feature request related to a problem? Please describe.

When working with multiple temporal goals, it might happen that the fluents are extracted multiple times from the same state.

Describe the solution you'd like

Pass the episode and step numbers to the extract_fluents method, so that fluents already computed for that iteration can be cached.
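The caching idea can be sketched as a small wrapper keyed on (episode, step). The class name `CachedFluentExtractor` and the four-argument call signature are assumptions, not part of the current API.

```python
class CachedFluentExtractor:
    """Wrap a fluent extractor and reuse its result within one (episode, step)."""

    def __init__(self, extract):
        self._extract = extract
        self._key = None
        self._value = None
        self.calls = 0  # number of times the wrapped function actually ran

    def __call__(self, obs, action, episode, step):
        key = (episode, step)
        if key != self._key:
            # New iteration: recompute and remember the result.
            self._value = self._extract(obs, action)
            self._key = key
            self.calls += 1
        return self._value


# Two temporal goals sharing one extractor query it at the same step.
extractor = CachedFluentExtractor(lambda obs, action: {"at_goal"} if obs == 3 else set())
first = extractor(3, 0, episode=0, step=0)
second = extractor(3, 0, episode=0, step=0)
third = extractor(1, 0, episode=0, step=1)
```

Only the most recent result is kept, which is enough when all goals are evaluated at the same step before the environment advances.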

Describe alternatives you've considered

Additional context

Make it possible to create a temporal goal from an automaton.

Is your feature request related to a problem? Please describe.

It is not possible to initialize the temporal goal directly from an automaton, but only from a temporal logic formula (LTLf or LDLf).

Describe the solution you'd like

Make it possible to provide just a DFA.

Describe alternatives you've considered

Additional context
