This project forked from seungheondoh/lp-music-caps

LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]

Home Page: https://huggingface.co/papers/2307.16372

Python 86.93% Jupyter Notebook 13.07%

🔉 LP-MusicCaps: LLM-Based Pseudo Music Captioning

Demo Video

This is an implementation of LP-MusicCaps: LLM-Based Pseudo Music Captioning. The project aims to generate captions for music in two ways. 1) Tag-to-Caption: using existing tags, we use OpenAI's GPT-3.5 Turbo API to generate high-quality, contextually relevant captions from music tags. 2) Audio-to-Caption: using pairs of music audio and pseudo captions, we train a cross-modal encoder-decoder model for end-to-end music captioning.

LP-MusicCaps: LLM-Based Pseudo Music Captioning

SeungHeon Doh, Keunwoo Choi, Jongpil Lee, Juhan Nam
To appear at ISMIR 2023

TL;DR

Open Source Material

The open-source materials are available online for future research. An example of the dataset is provided in a notebook.

Installation

To run this project locally, follow the steps below:

  1. Install Python and PyTorch:

    • python==3.10
    • torch==1.13.1 (Please install it according to your CUDA version.)
  2. Other requirements:

    • pip install -e .
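
After installing, it can help to confirm that the interpreter and PyTorch match the versions pinned above. The following is a minimal sketch, not part of the project: the function name `check_environment` and the returned keys are hypothetical, and it only reads package metadata rather than importing PyTorch.

```python
import sys
from importlib import metadata

# Versions pinned in the installation steps above.
EXPECTED_PYTHON = (3, 10)
EXPECTED_TORCH = "1.13.1"


def check_environment():
    """Report whether the pinned Python and PyTorch versions are present.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        torch_version = None  # PyTorch is not installed in this environment

    return {
        "python_ok": sys.version_info[:2] == EXPECTED_PYTHON,
        # CUDA builds may carry a suffix such as "+cu117", so compare the prefix.
        "torch_ok": torch_version is not None
        and torch_version.startswith(EXPECTED_TORCH),
        "torch_version": torch_version,
    }


if __name__ == "__main__":
    print(check_environment())
```

A `False` value only means the environment differs from the pinned versions; other versions may still work, which this check cannot tell you.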

Quick Start: Tag to Caption

cd lpmc/llm_captioning
python run.py --prompt {writing, summary, paraphrase, attribute_prediction} --tags <music_tags>

Replace <music_tags> with the tags you want to generate captions for. Separate multiple tags with commas, such as beatbox, finger snipping, male voice, amateur recording, medium tempo.

Tag-to-caption results with the writing prompt:

query: 
write a song description sentence including the following attributes
beatbox, finger snipping, male voice, amateur recording, medium tempo
----------
results: 
"Experience the raw and authentic energy of an amateur recording as mesmerizing beatbox rhythms intertwine with catchy finger snipping, while a soulful male voice delivers heartfelt lyrics on a medium tempo track."
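
The query shown above is the writing instruction followed by the comma-separated tags. The sketch below shows one way such a query could be assembled from the `--tags` string. It is an illustration under assumptions, not the code in `run.py`: the names `PROMPT_INSTRUCTIONS`, `parse_tags`, and `build_query` are hypothetical, and only the writing instruction is included because it is the only one shown in this README.

```python
# Instruction text copied from the example query above. The other prompt
# types (summary, paraphrase, attribute_prediction) are omitted because
# their wording is not shown in this README.
PROMPT_INSTRUCTIONS = {
    "writing": "write a song description sentence including the following attributes",
}


def parse_tags(raw_tags):
    """Split a comma-separated tag string into a list of clean tags."""
    return [tag.strip() for tag in raw_tags.split(",") if tag.strip()]


def build_query(tags, prompt="writing"):
    """Join the instruction and the tags into a two-line query string."""
    instruction = PROMPT_INSTRUCTIONS[prompt]
    return f"{instruction}\n{', '.join(tags)}"


if __name__ == "__main__":
    tags = parse_tags(
        "beatbox, finger snipping, male voice, amateur recording, medium tempo"
    )
    print(build_query(tags))
```

The resulting string would then be sent to the language model as the user message; that step is left out here because it depends on the API client and credentials.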

Quick Start: Audio to Caption

cd demo
python app.py

# or
cd lpmc/music_captioning
wget https://huggingface.co/seungheondoh/lp-music-caps/resolve/main/transfer.pth -O exp/transfer/lp_music_caps
python captioning.py --audio_path ../../dataset/samples/orchestra.wav
{'text': "This is a symphonic orchestra playing  a piece that's riveting, thrilling and exciting. 
The peace would be suitable in a movie when something grand and impressive happens. 
There are clarinets, tubas, trumpets and french horns being played. The brass instruments help create that sense of a momentous occasion.", 
'time': '0:00-10:00'}

{'text': 'This is a classical music piece from a movie soundtrack. 
There is a clarinet playing the main melody while a brass section and a flute are playing the melody. 
The rhythmic background is provided by the acoustic drums. The atmosphere is epic and victorious. 
This piece could be used in the soundtrack of a historical drama movie during the scenes of an army marching towards the end.', 
'time': '10:00-20:00'}

{'text': 'This is a live performance of a classical music piece. There is a harp playing the melody while a horn is playing the bass line in the background. 
The atmosphere is epic. This piece could be used in the soundtrack of a historical drama movie during the scenes of an adventure video game.', 
'time': '20:00-30:00'}
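
Each result above pairs a caption with a `time` label for one segment of the input audio. The sketch below shows how a waveform could be cut into fixed-length segments and labeled in the same style. It assumes each segment covers 10 seconds, which the labels suggest but this README does not state. The function name `split_into_chunks` and the `samples` key are hypothetical, and this is not the logic in `captioning.py`.

```python
CHUNK_SECONDS = 10  # assumption: one caption per 10-second segment


def split_into_chunks(samples, sample_rate, chunk_seconds=CHUNK_SECONDS):
    """Cut a sequence of audio samples into labeled fixed-length chunks.

    The last chunk is kept as-is even if it is shorter than the others;
    a real pipeline might pad or drop it instead.
    """
    chunk_size = sample_rate * chunk_seconds
    chunks = []
    for index, start in enumerate(range(0, len(samples), chunk_size)):
        # Label format mirrors the output above, e.g. "0:00-10:00".
        label = f"{index * chunk_seconds}:00-{(index + 1) * chunk_seconds}:00"
        chunks.append(
            {"time": label, "samples": samples[start:start + chunk_size]}
        )
    return chunks


if __name__ == "__main__":
    # 30 seconds of dummy audio at a toy sample rate of 4 Hz.
    for chunk in split_into_chunks(list(range(120)), sample_rate=4):
        print(chunk["time"], len(chunk["samples"]))
```

A captioning model would then be run once per chunk, and each caption stored alongside its `time` label.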

Re-Implementation

To reproduce the experiments, see lpmc/llm_captioning and lpmc/music_captioning.

License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.

Acknowledgement

We would like to thank WavCaps for the audio-captioning training code and deezer-playntell for the content-based captioning evaluation protocol. We would also like to thank OpenAI for providing the GPT-3.5 Turbo API, which powers this project.

Citation

Please consider citing our paper in your publications if the project helps your research. The BibTeX reference is as follows.

Update soon
