Trainer for audio-diffusion-pytorch
(Optional) Create virtual environment and activate it
python3 -m venv venv
source venv/bin/activate
Install requirements
pip install -r requirements.txt
Add environment variables, rename .env.tmp
to .env
and replace with your own variables (example values are random)
DIR_LOGS=/logs
DIR_DATA=/data
# Required if using wandb logger
WANDB_PROJECT=audioproject
WANDB_ENTITY=johndoe
WANDB_API_KEY=a21dzbqlybbzccqla4txa21dzbqlybbzccqla4tx
# Required if using Common Voice dataset
HUGGINGFACE_TOKEN=hf_NUNySPyUNsmRIb9sUC4FKR2hIeacJOr4Rm
Run test experiment, see the exp
folder for other experiments (create your own .yaml
file there to run a custom experiment!)
python train.py exp=base_test
Run on GPU(s)
python train.py exp=base_test trainer.gpus=1
Resume run from a checkpoint
python train.py exp=base_test +ckpt=/logs/ckpts/2022-08-17-01-22-18/'last.ckpt'
How do I use the CommonVoice dataset?
Before running an experiment on commonvoice dataset you have to:
- Create a Huggingface account if you don't already have one here
- Accept the terms of the version of common voice dataset you will be using by clicking on it and selecting "Access repository".
- Add your access token to the
.env
file, for exampleHUGGINGFACE_TOKEN=hf_NUNySPyUNsmRIb9sUC4FKR2hIeacJOr4Rm
.
How do I load the model once I'm done training?
If you want to load the checkpoint to restore training with the trainer you can do python train.py exp=my_experiment +ckpt=/logs/ckpts/2022-08-17-01-22-18/'last.ckpt'
.
Otherwise if you want to instantiate a model from the checkpoint:
from main.mymodule import Model
model = Model.load_from_checkpoint(
checkpoint_path='my_checkpoint.ckpt',
learning_rate=1e-4,
beta1=0.9,
beta2=0.99,
in_channels=1,
patch_size=16,
all_other_paratemeters_here...
)
to get only the PyTorch .pt
checkpoint you can save the internal model weights as torch.save(model.model.state_dict(), 'torchckpt.pt')
.