Jafar: A JAX-based Genie Implementation 🧞

Jafar is a JAX-based implementation of the DeepMind paper "Genie: Generative Interactive Environments" (Bruce et al., 2024).

Jafar supports training all Genie components and can complete the CoinRun reproducibility experiment (Appendix F of the Genie paper) on a single L40S GPU in under a week.

Setup 🧗

Jafar was built with Python 3.10 and JAX 0.4.30. To install the requirements, run:

pip install -r requirements.txt
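After installing, it can help to confirm that the environment matches the versions named above. The snippet below is a minimal sketch, not part of Jafar: the function names are hypothetical, and it treats Python 3.10 and jax 0.4.30 as the tested configuration rather than as hard requirements.

```python
import sys
from importlib import metadata
from typing import Optional

# Versions stated in this README; assumed to be the tested configuration.
EXPECTED_PYTHON = (3, 10)
EXPECTED_JAX = "0.4.30"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report() -> dict:
    """Compare the running interpreter and the installed jax with the README versions."""
    return {
        "python_matches": sys.version_info[:2] == EXPECTED_PYTHON,
        "jax_matches": installed_version("jax") == EXPECTED_JAX,
    }


if __name__ == "__main__":
    print(environment_report())
```

A mismatch does not necessarily mean Jafar will fail, only that the setup differs from the one it was built with.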

Before training the models, generate the CoinRun dataset by running:

python generate_dataset.py --num_episodes 10000

Note: this is a large dataset (around 100GB) and may take a while to generate.
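Because the dataset is large, you may want to check free disk space before starting generation. The following is a small sketch using only the Python standard library. The helper names are hypothetical, and the 100 GB threshold is the approximate figure quoted above, not a measured value.

```python
import shutil

# Approximate dataset size quoted in this README (around 100GB).
DATASET_SIZE_GB = 100


def free_space_gb(path: str = ".") -> float:
    """Free space, in gigabytes (10**9 bytes), on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_room_for_dataset(path: str = ".", required_gb: float = DATASET_SIZE_GB) -> bool:
    """True if the filesystem holding `path` has at least `required_gb` free."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    print(f"Free space: {free_space_gb():.1f} GB")
    print(f"Enough room for the dataset: {has_room_for_dataset()}")
```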

Quick Start 🚀

Genie has three components: a video tokenizer, a latent action model (LAM), and a dynamics model. Each component is trained separately; however, the dynamics model requires a pre-trained video tokenizer and LAM.

To train the video tokenizer (the LAM is trained similarly), run:
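The dependency between the components can be written down as a small graph. The sketch below is illustrative only: the component labels are names chosen for this example, not identifiers from the Jafar code base. It uses the standard-library `graphlib` module to derive a valid training order.

```python
from graphlib import TopologicalSorter

# Each component maps to the set of components that must be trained before it.
# The labels are illustrative, not names used by Jafar.
DEPENDENCIES = {
    "video_tokenizer": set(),
    "latent_action_model": set(),
    "dynamics_model": {"video_tokenizer", "latent_action_model"},
}


def training_order(dependencies: dict) -> list:
    """Return one valid order in which the components can be trained."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(training_order(DEPENDENCIES))
```

The tokenizer and the LAM do not depend on each other, so they may be trained in either order or in parallel; the dynamics model always comes last.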

python train_tokenizer.py --ckpt_dir <path>

Once the tokenizer and LAM are trained, the dynamics model can be trained with:

python train_dynamics.py --tokenizer_checkpoint <path> --lam_checkpoint <path>
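If you prefer to drive training from a script, the commands above can be assembled in Python. This is a minimal sketch: the script names and flags are the ones shown in this README, while the helper functions and the checkpoint paths are hypothetical placeholders. How a saved checkpoint path relates to `--ckpt_dir` is not specified here, so substitute the paths your runs actually produce.

```python
import shlex
import sys


def tokenizer_command(ckpt_dir: str) -> list:
    """Build the tokenizer training command shown in the README."""
    return [sys.executable, "train_tokenizer.py", "--ckpt_dir", ckpt_dir]


def dynamics_command(tokenizer_checkpoint: str, lam_checkpoint: str) -> list:
    """Build the dynamics training command; both checkpoints must already exist."""
    return [
        sys.executable,
        "train_dynamics.py",
        "--tokenizer_checkpoint", tokenizer_checkpoint,
        "--lam_checkpoint", lam_checkpoint,
    ]


if __name__ == "__main__":
    # Placeholder paths, chosen only for illustration.
    commands = [
        tokenizer_command("checkpoints/tokenizer"),
        dynamics_command("checkpoints/tokenizer", "checkpoints/lam"),
    ]
    for cmd in commands:
        # Print rather than execute; pass `cmd` to subprocess.run(cmd, check=True) to run it.
        print(shlex.join(cmd))
```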

Logging with wandb is supported. To enable logging, set the WANDB_API_KEY environment variable or run:

wandb login

Training can then be logged by setting the --log flag:

python train_tokenizer.py --log --entity <wandb-entity> --project <wandb-project>
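When launching runs from a script, the logging flags can be added only when a wandb API key is available. The helper below is a hypothetical sketch, not part of Jafar. It checks only the `WANDB_API_KEY` environment variable, so credentials saved by `wandb login` are not detected by it.

```python
import os
from typing import Mapping, Optional


def logging_args(entity: str, project: str,
                 environ: Optional[Mapping[str, str]] = None) -> list:
    """Return the README's logging flags if WANDB_API_KEY is set, else an empty list."""
    env = os.environ if environ is None else environ
    if not env.get("WANDB_API_KEY"):
        return []
    return ["--log", "--entity", entity, "--project", project]


if __name__ == "__main__":
    # "my-entity" and "my-project" are placeholder values.
    extra = logging_args("my-entity", "my-project")
    print(["python", "train_tokenizer.py", *extra])
```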

Citing Jafar 📜

Jafar was built by Matthew Jackson and Timon Willi.

If you use Jafar in your work, please cite us and the original Genie paper as follows:

@inproceedings{
    willi2024jafar,
    title={Jafar: An Open-Source Genie Reimplemention in Jax},
    author={Timon Willi and Matthew Thomas Jackson and Jakob Nicolaus Foerster},
    booktitle={First Workshop on Controllable Video Generation @ ICML 2024},
    year={2024},
    url={https://openreview.net/forum?id=ZZGaQHs9Jb}
}
@inproceedings{
    bruce2024genie,
    title={Genie: Generative Interactive Environments},
    author={Jake Bruce and Michael D Dennis and Ashley Edwards and Jack Parker-Holder and Yuge Shi and Edward Hughes and Matthew Lai and Aditi Mavalankar and Richie Steigerwald and Chris Apps and Yusuf Aytar and Sarah Maria Elisabeth Bechtle and Feryal Behbahani and Stephanie C.Y. Chan and Nicolas Heess and Lucy Gonzalez and Simon Osindero and Sherjil Ozair and Scott Reed and Jingwei Zhang and Konrad Zolna and Jeff Clune and Nando de Freitas and Satinder Singh and Tim Rockt{\"a}schel},
    booktitle={Forty-first International Conference on Machine Learning},
    year={2024},
    url={https://openreview.net/forum?id=bJbSbJskOS}
}
