Welcome to JaxUED!


JaxUED is an Unsupervised Environment Design (UED) library with similar goals to CleanRL: high-quality, single-file and understandable implementations of common UED methods.

Why JaxUED?

  • Single-file reference implementations of common UED algorithms
  • Allows easy modification and quick prototyping of ideas
  • Understandable code with low levels of obscurity/abstraction
  • Wandb integration and logging of metrics and generated levels

What We Provide

JaxUED provides several (Jaxified) utilities that are useful for implementing UED methods, including a LevelSampler, a general environment interface (UnderspecifiedEnv), and a Maze implementation.

We also have understandable single-file implementations of DR, PLR, ACCEL and PAIRED.

Who JaxUED is for

JaxUED is primarily intended for researchers looking to get in the weeds of UED algorithm development. Our minimal-dependency implementations of current state-of-the-art UED methods expose all implementation details, helping researchers understand how the algorithms work in practice and facilitating easy, rapid prototyping of new ideas.

Get Started

See the docs for more examples and explanations of arguments, or simply read the documented code in examples/

Installation

First, clone the repository

git clone https://github.com/DramaCow/jaxued
cd jaxued

And install:

pip install -e .

Follow the official JAX installation instructions to install JAX with GPU support, and run something like the following:

pip install --upgrade "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

Training

We provide three example files, examples/maze_{dr,plr,paired}.py, implementing DR, PLR (and ACCEL), and PAIRED, respectively.

To run them, simply run the scripts directly. See the documentation or the files themselves for arguments.

python examples/maze_plr.py

Evaluation

Training stores checkpoints in ./checkpoints/<run_name>/<seed>/models/<update_step>. To evaluate these checkpoints, run the same file with --mode eval and specify --checkpoint_directory and --checkpoint_to_eval; the evaluation results are stored in ./results/:

python examples/maze_plr.py --mode eval --checkpoint_directory=./checkpoints/<run_name>/<seed> --checkpoint_to_eval <update_step>
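
As a rough sketch of how the checkpoint layout above could be used from a script, the helper below lists the update steps found under a checkpoint directory and assembles the evaluation command. It assumes each <update_step> entry under models/ has a purely numeric name, which the README does not state explicitly; the function names list_update_steps and build_eval_command are hypothetical and not part of JaxUED.

```python
from pathlib import Path


def list_update_steps(checkpoint_directory):
    """Return the update steps found in <checkpoint_directory>/models, sorted.

    Assumption: every checkpoint entry is named after its update step and
    that name is numeric. Entries with non-numeric names are ignored.
    """
    models_dir = Path(checkpoint_directory) / "models"
    steps = [int(p.name) for p in models_dir.iterdir() if p.name.isdigit()]
    return sorted(steps)


def build_eval_command(script, checkpoint_directory, update_step):
    """Assemble the evaluation command shown above as an argument list."""
    return [
        "python", script,
        "--mode", "eval",
        f"--checkpoint_directory={checkpoint_directory}",
        "--checkpoint_to_eval", str(update_step),
    ]
```

The returned list can be passed to subprocess.run, for example with the last element of list_update_steps(...) to evaluate the most recent checkpoint.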

These result files are npz files with keys

states, cum_rewards, episode_lengths, levels
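
A result file could be inspected with NumPy along the following lines. This is a minimal sketch: the file path is whatever the evaluation run wrote under ./results/, the function name summarize_results is hypothetical, and it assumes cum_rewards and episode_lengths are plain numeric arrays (their shapes are not specified here, so the sketch simply averages over all entries).

```python
import numpy as np


def summarize_results(path):
    """Load an evaluation .npz file and report its keys and two averages.

    Only the two numeric keys are read; np.load is lazy, so the other
    arrays (states, levels) are listed but not loaded.
    """
    with np.load(path) as data:
        keys = list(data.files)
        mean_return = float(np.mean(data["cum_rewards"]))
        mean_length = float(np.mean(data["episode_lengths"]))
    return keys, mean_return, mean_length
```

Calling summarize_results on one of the files in ./results/ returns the list of keys followed by the mean cumulative reward and mean episode length.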

Supported Methods

Method                                   Run Command
Domain Randomization (DR)                python examples/maze_dr.py
Prioritized Level Replay (PLR)           python examples/maze_plr.py --exploratory_grad_updates
Robust Prioritized Level Replay (RPLR)   python examples/maze_plr.py
ACCEL                                    python examples/maze_plr.py --use_accel
PAIRED                                   python examples/maze_paired.py

Modification

One of the core goals of JaxUED is that our reference implementations can easily be modified to add arbitrary functionality. Each file contains all of its primary functionality, from the PPO implementation to the specifics of each method.

So, to get started, simply copy one of the files and start modifying it directly.

New Environments

To implement a new environment, simply subclass the UnderspecifiedEnv interface, and in the files themselves, change

    env = Maze(max_height=13, max_width=13, agent_view_size=config["agent_view_size"], normalize_obs=True)

to

    env = MyEnv(...)

And make any other changes necessary to the network architecture, etc.
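
To illustrate the general idea of an underspecified environment, where a level fixes the free parameters and the environment is reset to a given level, here is a small plain-Python sketch. The base class is a local stand-in and not JaxUED's UnderspecifiedEnv: the method names, signatures, and the LineWorld environment are all assumptions made for illustration. A real implementation would subclass the library's interface, follow the signatures in its source, and use JAX arrays with pure functions.

```python
from typing import NamedTuple


class Level(NamedTuple):
    """Free parameters chosen by a UED method: here, only the goal cell."""
    goal: int


class State(NamedTuple):
    position: int
    goal: int
    time: int


class UnderspecifiedEnvSketch:
    """Hypothetical stand-in interface (NOT the library's actual class)."""

    def reset_to_level(self, level):
        raise NotImplementedError

    def step(self, state, action):
        raise NotImplementedError


class LineWorld(UnderspecifiedEnvSketch):
    """Agent starts at cell 0 of a line and must reach the level's goal."""

    def __init__(self, length=8, max_steps=20):
        self.length = length
        self.max_steps = max_steps

    def reset_to_level(self, level):
        # The level fully determines the initial state.
        state = State(position=0, goal=level.goal, time=0)
        return self._obs(state), state

    def step(self, state, action):
        # Action 1 moves right, any other action moves left; walls clip.
        delta = 1 if action == 1 else -1
        position = min(max(state.position + delta, 0), self.length - 1)
        state = State(position, state.goal, state.time + 1)
        reached = position == state.goal
        done = reached or state.time >= self.max_steps
        reward = 1.0 if reached else 0.0
        return self._obs(state), state, reward, done

    def _obs(self, state):
        return (state.position, state.goal)
```

State is passed in and returned explicitly rather than stored on the object, which mirrors the functional style that JAX environments generally require.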

Supported Environments

Craftax

examples/craftax/craftax_plr.py contains code to run DR, PLR and ACCEL in Craftax. To use Craftax, install it using

pip install git+https://github.com/MichaelTMatthews/Craftax.git@main

Run it using the following command (see here for the full list of arguments):

python examples/craftax/craftax_plr.py --exploratory_grad_updates --num_train_envs 512 --num_updates 256

Currently, this only supports CraftaxSymbolic, but the following are coming soon:

  • Support for Pixel Environments
  • Support for Craftax-Classic
  • Support for an RNN policy

Gymnax

See examples/gymnax/gymnax_plr.py to run gymnax environments, currently supporting Acrobot, Pendulum and Cartpole. Use the --env flag with the name of the environment in lowercase to choose which one is used. The level distribution is somewhat arbitrary: each level varies two parameters of the environment (e.g. length and mass in Cartpole). The evaluation distribution is also somewhat arbitrary. Both can easily be changed.

The examples/gymnax/gymnax_plr.py script can also be modified to add additional environments.
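
A level distribution of the kind described above, where a level is just a pair of environment parameters, might be sketched as follows. The parameter ranges, the CartpoleLevel type, and the function names are hypothetical and chosen only for illustration; they are not the values used in examples/gymnax/gymnax_plr.py, and the actual script would sample with jax.random rather than Python's random module.

```python
import random
from typing import List, NamedTuple


class CartpoleLevel(NamedTuple):
    """A level is the pair of parameters that varies between episodes."""
    length: float
    mass: float


# Hypothetical ranges, for illustration only.
LENGTH_RANGE = (0.25, 1.0)
MASS_RANGE = (0.05, 0.5)


def sample_level(rng: random.Random) -> CartpoleLevel:
    """Draw one level uniformly from the two parameter ranges."""
    return CartpoleLevel(
        length=rng.uniform(*LENGTH_RANGE),
        mass=rng.uniform(*MASS_RANGE),
    )


def sample_levels(rng: random.Random, n: int) -> List[CartpoleLevel]:
    """Draw n independent levels, e.g. one per parallel environment."""
    return [sample_level(rng) for _ in range(n)]
```

Changing the distribution then amounts to editing the ranges or replacing sample_level with a different sampler.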

See Also

Here are some other libraries that also leverage Jax to obtain massive speedups in RL, which acted as inspiration for JaxUED.

RL Algorithms in Jax

  • Minimax: UED baselines, with support for multi-GPU training, and more parallel versions of PLR/ACCEL
  • PureJaxRL: End-to-end RL implementations in Jax
  • JaxIRL: Inverse RL
  • Mava: Multi-Agent RL
  • JaxMARL: Lots of different multi-agent RL algorithms

RL Environments in Jax

  • Gymnax: Standard RL interface with several environments, such as classic control and MinAtar.
  • JaxMARL: Lots of different multi-agent RL environments
  • Jumanji: Combinatorial Optimisation
  • Pgx: Board games, such as Chess and Go.
  • Brax: Continuous Control (like MuJoCo), in Jax
  • XLand-MiniGrid: Meta RL environments, taking ideas from XLand and MiniGrid
  • Craftax: Greatly extended version of Crafter in Jax.

Projects using JaxUED

  • Craftax: Using UED to generate worlds for learning an RL agent.
  • ReMiDi: JaxUED is the primary library used for baselines and the backbone for implementing ReMiDi.

📜 Citation

For attribution in academic contexts, please cite this work as

@article{coward2024JaxUED,
  title={JaxUED: A simple and useable UED library in Jax},
  author={Samuel Coward and Michael Beukman and Jakob Foerster},
  journal={arXiv preprint},
  year={2024},
}
