A Fast, Portable Deep Reinforcement Learning Library for Continuous Control

Home Page: https://backprop.tools

License: MIT License

BackpropTools: A Fast, Portable Deep Reinforcement Learning Library for Continuous Control

Paper on arXiv | Live demo (browser)
Run tutorials on Binder | Documentation

[Animation] Trained on a 2020 MacBook Pro (M1) using BackpropTools TD3

[Animation] Trained on a 2020 MacBook Pro (M1) using BackpropTools PPO

Citing

If you use BackpropTools in academic work, please cite our publication using the following BibTeX entry:

@misc{eschmann2023backproptools,
      title={BackpropTools: A Fast, Portable Deep Reinforcement Learning Library for Continuous Control}, 
      author={Jonas Eschmann and Dario Albani and Giuseppe Loianno},
      year={2023},
      eprint={2306.03530},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Getting Started

The getting started documentation is divided into two parts: a tutorial on how BackpropTools works internally and replication instructions for the results from the paper.

Tutorial on BackpropTools internals

Chapter  Name                         Documentation  Interactive Notebook
0        Overview                     readthedocs    -
1        Containers                   readthedocs    Binder
2        Multiple Dispatch            readthedocs    Binder
3        Deep Learning                readthedocs    Binder
4        CPU Acceleration             readthedocs    Binder
5        MNIST Classification         readthedocs    Binder
6        Deep Reinforcement Learning  readthedocs    Binder

Running Prebuilt Binaries

For easy replication we provide prebuilt, standalone binaries for macOS, Linux and Windows. Just download the latest release for your platform and run the executables out of the box without installing any dependencies.

Note: The macOS and Windows binaries should work out of the box on macOS >= 12 and Windows >= 10.

The Linux binaries should work on distributions with a reasonably recent libstdc++ (e.g. Ubuntu >= 20.04 or an up-to-date Arch Linux).

The Linux binaries can also be run under WSL2 on Windows 10 (even including GPU acceleration).

When running the Linux binaries on a headless version of Ubuntu (e.g. inside Docker or WSL), the training should work out of the box, but additional dependencies are required for running the MuJoCo user interface to visualize the policy after training. The instructions for installing these dependencies are in the readme_linux.txt of the Linux release.

Cloning the repository

To build the examples from source (either in Docker or natively), first clone the repository. Instead of cloning all submodules using git clone --recursive, which takes a lot of space and bandwidth, we recommend cloning the main repo, which contains all the standalone code for BackpropTools, and then cloning the required sets of submodules later:

git clone https://github.com/BackpropTools/BackpropTools.git

Cloning submodules

There are five classes of submodules:

  1. External dependencies (in external/)
    • E.g. HDF5 for checkpointing, Tensorboard for logging, or MuJoCo for the simulation of contact dynamics
  2. Examples/Code for embedded platforms (in embedded_platforms/)
  3. Redistributable dependencies (in redistributable/)
  4. Test dependencies (in tests/lib)
  5. Test data (in tests/data)

These sets of submodules can be cloned additively and independently of each other. For most use cases (e.g. most of the Docker examples), you should clone the submodules for external dependencies:

cd BackpropTools
git submodule update --init --recursive -- external

The submodules for the embedded platforms, the redistributable binaries, and the test dependencies/data can be cloned in the same fashion (by replacing external with the appropriate folder from the enumeration above). Note: For the redistributable dependencies and the test data, make sure that git-lfs is installed (e.g. sudo apt install git-lfs on Ubuntu) and activated (git lfs install); otherwise only the metadata of the blobs is downloaded.

Docker

Using Docker is the most deterministic way to get started with BackpropTools, not only for replicating the results but also for modifying the code. In our experiments on Linux using the NVIDIA container runtime, we were able to achieve close to native performance.

Docker instructions & examples

Native

In comparison to running the release binaries or building from source in Docker, the native setup depends heavily on the configuration of the machine it runs on (installed packages, overwritten defaults, etc.). Hence, we provide guidelines on how to set up the environment for research and development of BackpropTools. These should work on the default configuration of each platform but might not work out of the box if it has been customized.

Unix (Linux and macOS)

For maximum performance and malleability for research and development, we recommend running BackpropTools natively, e.g. on Linux or macOS. Since BackpropTools itself is dependency-free, the most basic examples don't need any platform setup. However, for an improved experience, we support HDF5 checkpointing and Tensorboard logging as well as optimized BLAS libraries, which come with some system-dependent requirements.

Unix instructions & examples

Windows

Windows instructions & examples

Embedded Platforms

Inference & Training

Inference

Naming Convention

We use snake_case for variables/instances, functions as well as namespaces and PascalCase for structs/classes. Furthermore, we use upper case SNAKE_CASE for compile-time constants.
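
As an illustration of the convention above, here is a minimal sketch. All names in it (naming_example, WINDOW_SIZE, RewardWindow, push_reward, sum_rewards, usage_example) are hypothetical and chosen only to show the casing rules; none of them are part of the BackpropTools API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example code: illustrates the casing rules only,
// it is not taken from the BackpropTools sources.
namespace naming_example {                      // namespace: snake_case

    constexpr std::size_t WINDOW_SIZE = 4;      // compile-time constant: upper-case SNAKE_CASE

    struct RewardWindow {                       // struct: PascalCase
        float rewards[WINDOW_SIZE] = {};        // member variable: snake_case
        std::size_t step_count = 0;
    };

    // function: snake_case; overwrites the oldest slot once the window is full
    inline void push_reward(RewardWindow& reward_window, float reward) {
        reward_window.rewards[reward_window.step_count % WINDOW_SIZE] = reward;
        reward_window.step_count++;
    }

    // function: snake_case; sums all slots (unused slots are zero-initialized)
    inline float sum_rewards(const RewardWindow& reward_window) {
        float total = 0;
        for (std::size_t i = 0; i < WINDOW_SIZE; i++) {
            total += reward_window.rewards[i];
        }
        return total;
    }

    // usage sketch: instances are snake_case as well
    inline float usage_example() {
        RewardWindow reward_window;
        push_reward(reward_window, 1.0f);
        push_reward(reward_window, 2.5f);
        assert(reward_window.step_count == 2);
        return sum_rewards(reward_window);
    }
}
```

Keeping compile-time constants visually distinct from runtime variables makes it easy to see at a glance which values are fixed at compile time.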
