Audio Style Transfer

Introduction

Style transfer has been applied successfully in the image domain, for example to render an arbitrary input image in the style of a Van Gogh painting [1]. The aim of this project is to adapt style transfer to the audio domain. Specifically, we take two audio clips (preferably songs), one labeled the "style" and the other the "content", and aim to synthesize a new clip that has the general characteristics of the style clip while remaining faithful to the content clip. Working toward this goal should also help us better understand features of raw music audio signals such as style, melody, rhythm, and tempo.

Solutions proposed in the literature include multiple time-frequency representations [2], the short-time Fourier transform combined with the Griffin-Lim algorithm [3], and shallow convolutional networks [4]. We aim to implement some of these methods, use their results as baselines, and try to improve on those baselines with different features, methods, and models. We want to contribute to this relatively new field of research and produce interesting results that may bring more attention to the subject.
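To make the spectrogram-based idea more concrete, here is a minimal sketch of a Gram-matrix style loss computed on short-time Fourier transform magnitudes, in the spirit of the image style loss of [1]. It is an illustration only and not the method of any referenced work: raw spectrogram magnitudes stand in for network features, the DFT is a naive one written with the standard library, and all names (`stft_magnitude`, `gram_matrix`, `style_loss`, `sine`) and parameter values are assumptions made for this example.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a list of frames; each frame holds magnitudes for bins 0..frame_len // 2."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive O(N^2) DFT: fine for a sketch, an FFT would be used in practice.
        spectrum = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    return frames


def gram_matrix(frames):
    """Correlations between frequency bins, averaged over time frames."""
    n_frames = len(frames)
    n_bins = len(frames[0])
    return [
        [sum(f[i] * f[j] for f in frames) / n_frames for j in range(n_bins)]
        for i in range(n_bins)
    ]


def style_loss(frames_a, frames_b):
    """Sum of squared differences between the two Gram matrices."""
    g_a = gram_matrix(frames_a)
    g_b = gram_matrix(frames_b)
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(g_a, g_b)
        for x, y in zip(row_a, row_b)
    )


def sine(freq_hz, sample_rate=8000, n_samples=512):
    """Hypothetical test signal: a pure tone standing in for real audio."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n_samples)]


if __name__ == "__main__":
    style = stft_magnitude(sine(440.0))
    candidate = stft_magnitude(sine(1000.0))
    print("loss against itself:", style_loss(style, style))
    print("loss against a different tone:", style_loss(style, candidate))
```

Because the Gram matrix averages over time frames, it describes which frequencies tend to occur together rather than when they occur, which is why this kind of statistic is used as a proxy for "style". A full transfer would additionally need a content term and a way to turn the optimized magnitudes back into a waveform, for example with Griffin-Lim.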

Progress

So far, we have implemented and tried two baselines: one based on the paper by Mital [4] and the other on the blog post by Ulyanov [3].

Papers

Neural Style Transfer for Audio Spectograms

  • NIPS 2017 Workshop paper

Audio style transfer

  • IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2018

Time Domain Neural Audio Style Transfer (Baseline implementation: Mital)

Blogs

Audio texture synthesis and style transfer (Baseline implementation: Ulyanov)

Neural Style Transfer on Audio Signals

References

[1] A Neural Algorithm of Artistic Style, https://arxiv.org/abs/1508.06576

[2] “Style” Transfer for Musical Audio Using Multiple Time-Frequency Representations, https://openreview.net/forum?id=BybQ7zWCb

[3] Audio texture synthesis and style transfer, https://dmitryulyanov.github.io/audio-texture-synthesis-and-style-transfer/

[4] Time Domain Neural Audio Style Transfer, https://arxiv.org/abs/1711.11160
