auraloss

A collection of audio-focused loss functions in PyTorch. [PDF]

Setup

pip install git+https://github.com/csteinmetz1/auraloss

Usage

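import torch
import auraloss

# Multi-resolution STFT loss with its default settings
mrstft = auraloss.freq.MultiResolutionSTFTLoss()

# Random stand-ins for audio, shaped (batch, channels, samples)
input = torch.rand(8, 1, 44100)
target = torch.rand(8, 1, 44100)

loss = mrstft(input, target)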
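
To give a feel for what a multi-resolution STFT loss measures, here is a minimal sketch in plain PyTorch. It is not the auraloss implementation: the function names, the `eps` clamp, and the three (FFT size, hop, window) settings are illustrative assumptions. It follows the general recipe from the cited literature, summing a spectral convergence term and a log STFT magnitude term at each resolution and averaging over resolutions.

```python
import torch
import torch.nn.functional as F


def stft_magnitude(x, n_fft, hop_length, win_length, eps=1e-8):
    """Magnitude spectrogram of a (batch, samples) signal."""
    window = torch.hann_window(win_length, device=x.device)
    spec = torch.stft(
        x,
        n_fft=n_fft,
        hop_length=hop_length,
        win_length=win_length,
        window=window,
        return_complex=True,
    )
    # Clamp so the logarithm taken later stays finite.
    return spec.abs().clamp(min=eps)


def single_resolution_loss(pred, target, n_fft, hop_length, win_length):
    pred_mag = stft_magnitude(pred, n_fft, hop_length, win_length)
    target_mag = stft_magnitude(target, n_fft, hop_length, win_length)
    # Spectral convergence: Frobenius norm of the magnitude error,
    # normalized by the Frobenius norm of the target magnitude.
    sc = (target_mag - pred_mag).pow(2).sum().sqrt() / target_mag.pow(2).sum().sqrt()
    # Log STFT magnitude: L1 distance between log magnitudes.
    log_mag = F.l1_loss(torch.log(pred_mag), torch.log(target_mag))
    return sc + log_mag


def multi_resolution_loss(
    pred,
    target,
    # (n_fft, hop_length, win_length) triples; illustrative values only.
    resolutions=((512, 128, 512), (1024, 256, 1024), (2048, 512, 2048)),
):
    """Average the single-resolution loss over several STFT settings.

    Expects tensors shaped (batch, 1, samples).
    """
    pred = pred.squeeze(1)
    target = target.squeeze(1)
    losses = [single_resolution_loss(pred, target, *r) for r in resolutions]
    return torch.stack(losses).mean()


pred = torch.rand(2, 1, 8000)
target = torch.rand(2, 1, 8000)
loss = multi_resolution_loss(pred, target)
```

Using several resolutions trades off time and frequency detail: short windows localize transients, long windows resolve closely spaced partials.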

Loss functions

We categorize the loss functions as either time-domain or frequency-domain approaches. Additionally, we include perceptual transforms.

| Loss function | Interface | Reference |
| --- | --- | --- |
| Time domain | | |
| Error-to-signal ratio (ESR) | auraloss.time.ESRLoss() | Wright & Välimäki, 2019 |
| DC error (DC) | auraloss.time.DCLoss() | Wright & Välimäki, 2019 |
| Log hyperbolic cosine (Log-cosh) | auraloss.time.LogCoshLoss() | Chen et al., 2019 |
| Signal-to-distortion ratio (SDR) | auraloss.time.SDRLoss() | Vincent et al., 2006 |
| Scale-invariant signal-to-distortion ratio (SI-SDR) | auraloss.time.SISDRLoss() | Le Roux et al., 2018 |
| Frequency domain | | |
| Spectral convergence | auraloss.freq.SpectralConvergenceLoss() | Arik et al., 2018 |
| Log STFT magnitude | auraloss.freq.LogSTFTMagnitudeLoss() | Arik et al., 2018 |
| Aggregate STFT | auraloss.freq.STFTLoss() | Arik et al., 2018 |
| Multi-resolution STFT | auraloss.freq.MultiResolutionSTFTLoss() | Yamamoto et al., 2019 |
| Random-resolution STFT | auraloss.freq.RandomResolutionSTFTLoss() | Steinmetz & Reiss, 2020 |
| Sum and difference STFT loss | auraloss.freq.SumAndDifferenceSTFTLoss() | Steinmetz et al., 2020 |
| Perceptual transforms | | |
| Sum and difference signal transform | auraloss.perceptual.SumAndDifference() | |
| FIR pre-emphasis filters | auraloss.perceptual.FIRFilter() | Wright & Välimäki, 2019 |
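
As a rough illustration of two of the time-domain entries and the sum and difference transform, the sketch below writes them out in plain PyTorch. These are not the auraloss implementations. The function names and the `eps` stabilizer are assumptions, and the formulas follow the commonly used definitions: ESR as error energy over target energy, DC error as the squared mean error over mean target energy, and sum/difference as left plus right and left minus right.

```python
import torch


def esr(pred, target, eps=1e-8):
    """Error-to-signal ratio: error energy divided by target energy."""
    num = (target - pred).pow(2).sum(dim=-1)
    den = target.pow(2).sum(dim=-1) + eps
    return (num / den).mean()


def dc_error(pred, target, eps=1e-8):
    """DC error: squared mean of the error divided by mean target energy."""
    num = (target - pred).mean(dim=-1).pow(2)
    den = target.pow(2).mean(dim=-1) + eps
    return (num / den).mean()


def sum_and_difference(x):
    """Split a stereo signal shaped (batch, 2, samples) into sum and difference."""
    left, right = x[:, 0, :], x[:, 1, :]
    return left + right, left - right


# Time-domain losses on mono audio shaped (batch, 1, samples)
pred = torch.rand(8, 1, 44100)
target = torch.rand(8, 1, 44100)
loss = esr(pred, target) + dc_error(pred, target)

# Sum and difference signals from stereo audio shaped (batch, 2, samples)
stereo = torch.rand(8, 2, 44100)
sum_sig, diff_sig = sum_and_difference(stereo)
```

ESR is insensitive to a constant offset only when the target itself has little energy, which is why the cited work pairs it with a separate DC term.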

Examples

Currently, we include one example that uses a set of the loss functions to train a TCN (temporal convolutional network) to model an analog dynamic range compressor. For details, refer to examples/compressor. We provide pre-trained models, evaluation scripts to compute the metrics in the paper, and scripts to retrain the models.
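
For orientation only, here is a toy training loop in the same spirit. It is not the code in examples/compressor: `TinyTCN`, `esr_loss`, `train_toy_model`, the hyperparameters, and the `tanh` soft clipper standing in for the analog device are all hypothetical choices made for this sketch. Any callable that takes `(input, target)`, such as the `mrstft` loss from the Usage section, could be passed as `loss_fn` instead.

```python
import torch
import torch.nn as nn


class TinyTCN(nn.Module):
    """A toy stack of dilated 1-D convolutions (hypothetical, for illustration)."""

    def __init__(self, channels=8, kernel_size=3, dilations=(1, 2, 4)):
        super().__init__()
        layers = []
        in_channels = 1
        for dilation in dilations:
            # For odd kernel sizes this padding keeps the output length
            # equal to the input length.
            padding = (kernel_size - 1) * dilation // 2
            layers.append(
                nn.Conv1d(in_channels, channels, kernel_size,
                          dilation=dilation, padding=padding)
            )
            layers.append(nn.Tanh())
            in_channels = channels
        layers.append(nn.Conv1d(in_channels, 1, kernel_size=1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


def esr_loss(pred, target, eps=1e-8):
    """Plain-PyTorch error-to-signal ratio, used here as a stand-in loss."""
    num = (target - pred).pow(2).sum(dim=-1)
    den = target.pow(2).sum(dim=-1) + eps
    return (num / den).mean()


def train_toy_model(loss_fn=esr_loss, steps=100, seed=0):
    """Fit TinyTCN to a synthetic nonlinearity and return the loss history."""
    torch.manual_seed(seed)
    model = TinyTCN()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
    # Synthetic data: noise in [-1, 1] passed through a soft clipper.
    x = torch.rand(4, 1, 2048) * 2 - 1
    y = torch.tanh(3.0 * x)
    history = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        history.append(loss.item())
    return model, history


if __name__ == "__main__":
    model, history = train_toy_model()
    print(f"first loss: {history[0]:.4f}, last loss: {history[-1]:.4f}")
```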

Development

Note that a few losses (SDR, SI-SDR) have yet to be implemented, but they will be coming soon. We also currently have no tests; those are coming soon as well, so use caution for now. Future additions will target neural-network-based perceptual losses, which tend to be more sophisticated than those included so far.

If you are interested in adding a loss function, please make a pull request.

Cite

If you use this code in your work, please consider citing us.

@inproceedings{steinmetz2020auraloss,
    title={auraloss: {A}udio focused loss functions in {PyTorch}},
    author={Steinmetz, Christian J. and Reiss, Joshua D.},
    booktitle={Digital Music Research Network One-day Workshop (DMRN+15)},
    year={2020}}
