RAVE: Realtime Audio Variational autoEncoder

Official implementation of RAVE: A variational autoencoder for fast and high-quality neural audio synthesis (article link) by Antoine Caillon and Philippe Esling.

If you use RAVE as part of a music performance or installation, be sure to cite either this repository or the article!

Previous versions

The original implementation of the RAVE model can be restored using

git checkout v1

Installation

Install RAVE using

pip install acids-rave

You will need ffmpeg on your computer. You can install it locally inside your virtual environment using

conda install ffmpeg
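Before going further, it can be handy to confirm that ffmpeg is actually visible from the environment you will run RAVE in. The snippet below is a minimal sketch using only the Python standard library; the helper name `ffmpeg_available` is hypothetical and not part of RAVE.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found; install it, e.g. with `conda install ffmpeg`")
```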

Colab

A Colab notebook to train RAVEv2 is now available, thanks to hexorcismos!

Usage

Training a RAVE model usually involves three separate steps: dataset preparation, training, and export.

Dataset preparation

You can now prepare a dataset using two methods: regular and lazy. Lazy preprocessing allows RAVE to be trained directly on the raw files (e.g. mp3, ogg), without converting them first. Warning: lazy dataset loading will significantly increase your CPU load during training, especially on Windows. It can however be useful when training on a large audio corpus that would not fit on a hard drive once uncompressed. In either case, prepare your dataset using

rave preprocess --input_path /audio/folder --output_path /dataset/path (--lazy)
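To get a feel for when lazy preprocessing is worth the extra CPU load, it helps to estimate how much space a corpus would take once uncompressed. The sketch below assumes the uncompressed audio is 16-bit PCM at 44.1 kHz, mono; these values and the function name are illustrative assumptions, not RAVE's documented storage format.

```python
def uncompressed_pcm_bytes(
    duration_s: float,
    sample_rate: int = 44100,   # assumed sampling rate (Hz)
    channels: int = 1,          # assumed mono
    bytes_per_sample: int = 2,  # assumed 16-bit PCM
) -> int:
    """Size in bytes of raw PCM audio of the given duration."""
    return int(duration_s * sample_rate * channels * bytes_per_sample)


one_hour = uncompressed_pcm_bytes(3600)
print(f"1 hour    : {one_hour / 1e6:.2f} MB")
print(f"1000 hours: {1000 * one_hour / 1e9:.2f} GB")
```

Under these assumptions one hour of audio takes about 318 MB, so a corpus of a thousand hours would need roughly 318 GB of disk space when stored uncompressed.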

Training

RAVEv2 comes with many different configurations. The improved version of v1 is called v2, and can be trained with

rave train --config v2 --db_path /dataset/path --name give_a_name

We also provide a discrete configuration, similar to SoundStream or EnCodec

rave train --config discrete ...

By default, RAVE is built with non-causal convolutions. If you want to make the model causal (hence lowering the overall latency of the model), you can use the causal mode

rave train --config discrete --config causal ...
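Since configurations are combined by repeating the --config flag, launching many training runs from a script mostly comes down to assembling that argument list. Here is a minimal sketch of such a helper; `build_train_command` is a hypothetical name, and it only uses the flags shown above (--config, --db_path, --name).

```python
from typing import List, Sequence


def build_train_command(configs: Sequence[str], db_path: str, name: str) -> List[str]:
    """Assemble a `rave train` argument list, one --config flag per entry."""
    if not configs:
        raise ValueError("at least one configuration name is required")
    cmd = ["rave", "train"]
    for config in configs:
        cmd += ["--config", config]
    cmd += ["--db_path", db_path, "--name", name]
    return cmd


if __name__ == "__main__":
    cmd = build_train_command(["discrete", "causal"], "/dataset/path", "give_a_name")
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the `rave` command is installed.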

Many other configuration files are available in rave/configs and can be combined. Here is a list of all the available configurations

| Type | Name | Description |
| Architecture | v1 | Original continuous model |
| Architecture | v2 | Improved continuous model (faster, higher quality) |
| Architecture | discrete | Discrete model (similar to SoundStream or EnCodec) |
| Architecture | onnx | Noiseless v1 configuration for ONNX usage |
| Architecture | raspberry | Lightweight configuration compatible with realtime Raspberry Pi 4 inference |
| Regularization (v2 only) | default | Variational Auto Encoder objective (ELBO) |
| Regularization (v2 only) | wassertein | Wasserstein Auto Encoder objective (MMD) |
| Regularization (v2 only) | spherical | Spherical Auto Encoder objective |
| Discriminator | spectral_discriminator | Use the MultiScale discriminator from EnCodec |
| Others | causal | Use causal convolutions |

Export

Once trained, export your model to a torchscript file using

rave export --run /path/to/your/run (--streaming)

Setting the --streaming flag will enable cached convolutions, making the model compatible with realtime processing. If you forget to use the streaming mode and try to load the model in Max, you will hear clicking artifacts.
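To see why block-by-block processing needs cached convolutions, here is a toy illustration in pure Python. It is not RAVE's implementation: it uses a small causal FIR filter with integer values (all names are hypothetical) and compares processing a signal at once, in blocks with a cache of past samples, and in blocks without one.

```python
from typing import List, Optional, Sequence, Tuple


def causal_conv(
    x: Sequence[int],
    kernel: Sequence[int],
    cache: Optional[Sequence[int]] = None,
) -> Tuple[List[int], List[int]]:
    """Causal FIR filter y[n] = sum_k kernel[k] * x[n - k].

    `cache` holds the last len(kernel) - 1 input samples of the previous
    block (zeros when omitted). Returns the output and the updated cache.
    """
    pad = len(kernel) - 1
    history = list(cache) if cache is not None else [0] * pad
    padded = history + list(x)
    y = [
        sum(kernel[k] * padded[pad + n - k] for k in range(len(kernel)))
        for n in range(len(x))
    ]
    return y, padded[len(padded) - pad:]


def process_in_blocks(
    x: Sequence[int], kernel: Sequence[int], block_size: int, use_cache: bool
) -> List[int]:
    """Filter `x` block by block, optionally carrying the cache across blocks."""
    out: List[int] = []
    cache = None
    for start in range(0, len(x), block_size):
        y, new_cache = causal_conv(x[start:start + block_size], kernel, cache)
        out.extend(y)
        cache = new_cache if use_cache else None  # dropping the cache resets the past
    return out


signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 2, 3]

offline, _ = causal_conv(signal, kernel)
cached = process_in_blocks(signal, kernel, block_size=4, use_cache=True)
uncached = process_in_blocks(signal, kernel, block_size=4, use_cache=False)

print("offline :", offline)
print("cached  :", cached)    # identical to offline
print("uncached:", uncached)  # jumps at the block boundary
```

With the cache, the blockwise output matches the offline one exactly. Without it, each block starts from silence, which creates a discontinuity at every block boundary; in audio, such discontinuities are heard as clicks.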

Pretrained models

Several pretrained streaming models are available here. We'll keep the list updated with new models.

Where is the prior?

The prior model was an experimental feature from RAVEv1 and has been removed from this repository. However, we will release a new improved version of the prior soon (very soon in fact).

Discussion

If you have questions, want to share your experience with RAVE, or want to share musical pieces made with the model, you can use the Discussion tab!

Demonstration

RAVE x nn~

Demonstration of what you can do with RAVE and the nn~ external for Max/MSP!

Embedded RAVE

Using nn~ for Pure Data, RAVE can be used in realtime on embedded platforms!
