
License: GNU Affero General Public License v3.0


Audiocraft Infinity WebUI

Adds generation of songs with a length of over 30 seconds.

Adds the ability to continue songs.

Adds a seed option.

Adds ability to load locally downloaded models.

Adds training (Thanks to chavinlo's repo https://github.com/chavinlo/musicgen_trainer)

Adds macOS support.

Adds queue (on the main-queue branch: https://github.com/1aienthusiast/audiocraft-infinity-webui/tree/main-queue)

Adds batching (run webuibatch.py instead of webui.py).

Disables (hopefully) the Gradio analytics.

Note! Project is currently not actively maintained but accepts PRs.

Installation

Python 3.9 is recommended.

  1. Clone the repo and enter it: git clone https://github.com/1aienthusiast/audiocraft-infinity-webui.git, then cd audiocraft-infinity-webui
  2. Install PyTorch: pip install 'torch>=2.0'
  3. Install the requirements: pip install -r requirements.txt
  4. Clone my fork of the Meta audiocraft repo and chavinlo's MusicGen trainer inside the repositories folder:

     cd repositories
     git clone https://github.com/1aienthusiast/audiocraft
     git clone https://github.com/chavinlo/musicgen_trainer
     cd ..

Note!

If you have already cloned the Meta audiocraft repo, you have to remove it and then clone the provided fork for the seed option to work.

cd repositories
rm -rf audiocraft/
git clone https://github.com/1aienthusiast/audiocraft
git clone https://github.com/chavinlo/musicgen_trainer
cd ..
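
As a quick sanity check after the clone commands above, the following minimal sketch reports which of the two expected clones are missing from the repositories folder. The helper name missing_repositories is hypothetical and not part of this project; the directory names are the ones git clone creates by default.

```python
from pathlib import Path

# Directory names that the two git clone commands create by default.
EXPECTED_REPOS = ("audiocraft", "musicgen_trainer")


def missing_repositories(root="."):
    """Return the expected clones that are absent under <root>/repositories."""
    repos_dir = Path(root) / "repositories"
    return [name for name in EXPECTED_REPOS if not (repos_dir / name).is_dir()]


if __name__ == "__main__":
    missing = missing_repositories()
    if missing:
        print("Missing under repositories/:", ", ".join(missing))
    else:
        print("Both repositories are in place.")
```

Run it from the root folder of the webui. It only checks that the directories exist, not that audiocraft is the fork rather than the Meta repo.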

Usage

python webui.py
python webuibatch.py - with batching support

Updating

Run git pull inside the root folder to update the webui, and the same command inside repositories/audiocraft to update audiocraft.

Models

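Meta provides 4 pre-trained models:

  • small: 300M model, text to music only
  • medium: 1.5B model, text to music only
  • melody: 1.5B model, text to music and text+melody to music
  • large: 3.3B model, text to music only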

Needs a GPU!

I recommend 12GB of VRAM for the large model.
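
To compare your hardware with the recommendation above, a minimal sketch like the following reads the total memory of a CUDA device through PyTorch. The function name total_vram_gb is hypothetical. It checks CUDA devices only, so it reports nothing useful on macOS or on machines without PyTorch installed.

```python
import importlib.util


def total_vram_gb(device_index=0):
    """Return the total memory of a CUDA device in GiB, or None if unavailable."""
    if importlib.util.find_spec("torch") is None:
        return None  # PyTorch is not installed
    import torch

    if not torch.cuda.is_available():
        return None  # no CUDA GPU is visible to PyTorch
    props = torch.cuda.get_device_properties(device_index)
    return props.total_memory / 1024**3


if __name__ == "__main__":
    vram = total_vram_gb()
    if vram is None:
        print("No CUDA GPU detected.")
    elif vram < 12:
        print(f"{vram:.1f} GiB of VRAM: below the 12GB recommended for the large model.")
    else:
        print(f"{vram:.1f} GiB of VRAM detected.")
```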

Training

Dataset Creation

Create a folder and place your audio and caption files in it. They must be in WAV and TXT format, respectively.

Place the folder in training/datasets/.

Important: Split your audio into 35-second chunks. Only the first 30 seconds of each chunk will be processed. A chunk cannot be shorter than 30 seconds.
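
One way to do the splitting is sketched below with Python's standard wave module. This is an illustration under assumptions, not the project's own tooling: split_wav is a hypothetical helper, the segment_NNN.wav naming is only a convention, and the wave module reads uncompressed PCM WAV files only. A trailing chunk shorter than 30 seconds is dropped.

```python
import wave
from pathlib import Path

CHUNK_SECONDS = 35  # chunk length suggested above
MIN_SECONDS = 30    # shorter chunks are not usable for training


def split_wav(src, out_dir, chunk_seconds=CHUNK_SECONDS, min_seconds=MIN_SECONDS):
    """Split src into consecutive chunks and return the written paths.

    A final chunk shorter than min_seconds is discarded.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        rate = reader.getframerate()
        bytes_per_frame = params.sampwidth * params.nchannels
        index = 0
        while True:
            frames = reader.readframes(chunk_seconds * rate)
            if len(frames) // bytes_per_frame < min_seconds * rate:
                break  # remainder is too short, or the file is exhausted
            target = out_dir / f"segment_{index:03d}.wav"
            with wave.open(str(target), "wb") as writer:
                writer.setparams(params)  # frame count is corrected on close
                writer.writeframes(frames)
            written.append(target)
            index += 1
    return written
```

For example, split_wav("song.wav", "training/datasets/my_dataset") would write segment_000.wav, segment_001.wav, and so on; both paths here are placeholders.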

For example, segment_000.txt contains the caption "jazz music, jobim" for the WAV file segment_000.wav.
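
Before training, it can help to confirm that every WAV file has a caption and vice versa. The sketch below assumes that files are paired by a shared file name, as in the segment_000 example; unpaired_files is a hypothetical helper, not part of the trainer.

```python
from pathlib import Path


def unpaired_files(dataset_dir):
    """Return (wavs_without_caption, captions_without_wav) as sorted name stems."""
    dataset_dir = Path(dataset_dir)
    wav_stems = {p.stem for p in dataset_dir.glob("*.wav")}
    txt_stems = {p.stem for p in dataset_dir.glob("*.txt")}
    return sorted(wav_stems - txt_stems), sorted(txt_stems - wav_stems)


if __name__ == "__main__":
    import sys

    no_caption, no_audio = unpaired_files(sys.argv[1])
    for stem in no_caption:
        print(f"{stem}.wav has no caption file")
    for stem in no_audio:
        print(f"{stem}.txt has no audio file")
```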

Options

  • dataset_path: String, path to your dataset of WAV and TXT pairs.
  • model_id: String, the MusicGen model to finetune. Can be small, medium, or large. Default: small
  • lr: Float, learning rate. Default: 0.0001 (1e-4)
  • epochs: Integer, epoch count. Default: 5
  • use_wandb: Integer, 1 to enable wandb, 0 to disable it. Default: 0 (disabled)
  • save_step: Integer, number of steps between checkpoint saves. Default: None
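
The options and defaults listed above can be summarized as an argparse parser. This is only an illustrative sketch: whether the trainer accepts these as command-line flags in exactly this form is an assumption, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options and defaults (illustrative only)."""
    parser = argparse.ArgumentParser(description="Hypothetical finetuning options")
    parser.add_argument("--dataset_path", type=str, required=True,
                        help="path to a dataset of WAV and TXT pairs")
    parser.add_argument("--model_id", type=str, default="small",
                        choices=["small", "medium", "large"],
                        help="MusicGen model to finetune")
    parser.add_argument("--lr", type=float, default=1e-4, help="learning rate")
    parser.add_argument("--epochs", type=int, default=5, help="epoch count")
    parser.add_argument("--use_wandb", type=int, default=0, choices=[0, 1],
                        help="1 enables wandb, 0 disables it")
    parser.add_argument("--save_step", type=int, default=None,
                        help="number of steps between checkpoint saves")
    return parser


if __name__ == "__main__":
    # The dataset path below is a placeholder.
    args = build_parser().parse_args(["--dataset_path", "training/datasets/my_dataset"])
    print(vars(args))
```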

Models

Once training finishes, the model (and checkpoints) will be available under the models/ directory.

Loading the finetuned models

The final model is saved to models/ as lm_final.pt.

  1. Place it in models/DIRECTORY_NAME/
  2. In the Inference tab, choose custom as the model and enter DIRECTORY_NAME into the input field.
  3. Also in the Inference tab, choose the model it was finetuned on.
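
Step 1 can be scripted. The sketch below moves models/lm_final.pt into models/DIRECTORY_NAME/; install_finetuned_model is a hypothetical helper, and moving rather than copying the file is a choice made here, not something the project prescribes.

```python
import shutil
from pathlib import Path


def install_finetuned_model(directory_name, models_dir="models"):
    """Move <models_dir>/lm_final.pt into <models_dir>/<directory_name>/ and return the new path."""
    models_dir = Path(models_dir)
    source = models_dir / "lm_final.pt"
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found; has training finished?")
    target_dir = models_dir / directory_name
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.move(str(source), str(target))
    return target


if __name__ == "__main__":
    import sys

    print("Model placed at", install_finetuned_model(sys.argv[1]))
```

The directory name passed on the command line is the value to enter as DIRECTORY_NAME in the Inference tab.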

Colab

On Google Colab, you need to launch the webui with the --share flag.

License

  • The code in this repository is released under the AGPLv3 license as found in the LICENSE file.
