
yamathcy / lakh_vocal_segments_dataset

This project forked from georgid/lakh_vocal_segments_dataset


singing voice with annotations of vocal onsets, based on the matched MIDI from http://colinraffel.com/projects/lmd/

Jupyter Notebook 65.85% Shell 0.73% Python 31.90% Rich Text Format 1.52%

lakh vocal segments dataset

This is a dataset of multi-instrumental recordings of pop songs (in English) with annotated transcriptions of the singing voice, based on the matched MIDI from the lakh dataset. It was created to provide real-world material for singing voice transcription, with diverse genres and singers.

Possible use tasks:

  • vocal onset detection
  • transcription of singing voice into notes
  • beat detection

Note that it is not yet suitable for offset detection, as we did not validate the length of the aligned MIDI notes.

Folder structure

  • list_MSD_ids: list of the songs in the dataset
  • scripts: Python scripts for loading data; more scripts are in the similar repository
  • data: audio files. excerpt.txt gives the beginning and ending timestamps of the 7-digital excerpt within the complete recording, determined manually.
  • experiments: related to the paper: Georgi Dzhambazov, André Holzapfel, Ajay Srinivasamurthy, Xavier Serra, Metrical-Accent Aware Vocal Onset Detection in Polyphonic Audio, In Proceedings of ISMIR 2017
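
As a minimal sketch of how the excerpt boundaries might be loaded: the layout of excerpt.txt is not documented here, so the code below assumes that the file holds two numbers (beginning and ending timestamp, in seconds) separated by whitespace or a comma. The function name read_excerpt_bounds is hypothetical and is not part of the repository's scripts.

```python
from pathlib import Path
from typing import Tuple, Union


def read_excerpt_bounds(path: Union[str, Path]) -> Tuple[float, float]:
    """Return (begin, end) in seconds from an excerpt file.

    Assumption: the first two tokens of the file are the beginning and
    ending timestamps, separated by whitespace or a comma.
    """
    tokens = Path(path).read_text().replace(",", " ").split()
    if len(tokens) < 2:
        raise ValueError(f"expected two timestamps in {path}, got {tokens!r}")
    begin, end = float(tokens[0]), float(tokens[1])
    if end <= begin:
        raise ValueError(f"end ({end}) must be later than begin ({begin})")
    return begin, end
```

If the actual files use a different layout (for example one timestamp per line with labels), only the tokenizing line would need to change.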

Criteria for inclusion in the dataset:

  • songs come from the datasets used in MIREX Automatic_Lyrics-to-Audio_Alignment, also listed in full here. Note that this gives an initial list of songs for cross-reference, but it could be extended with any other songs.
  • has a linked MIDI in the lakh dataset
  • has predominant singing voice present in the 30-second thumbnail
  • has some clear metrical pulsation and the meter is 4/4

Steps to derive annotations

  1. find the recording's MSD_TRACK_id in this list. Then match by this script.
     • Then derive its beginning and ending timestamps and create data/MSD_TRACK_id/excerpt.txt manually.

  2. get the matched MIDI from lakh-matched MIDI with fetch_midi (if there is more than one match, pick the MIDI for the best match).

  3. derive singing voice note annotations by this script. Since the MIDI standard does not define an instrument for singing voice, the singing voice track is given a different program # in a random channel in each MIDI. Thus one needs to manually identify the MIDI channel # that corresponds to the melody of the singing voice track. Optionally, annotating the segments with active vocals in advance is helpful.

  4. derive beat annotations by this script.

  5. verify the annotations of note onsets and beats. Manually correct any imprecise vocal annotations. Open them as a note layer in Sonic Visualiser with the script 'sh open_in_sv.sh'.
     • listen at slower speed
     • if there is a systematic delay/advance of the timestamps, measure the difference to the onsets with SV's measure tool and run shift time of annotation

  • put the audio for MSD_TRACK_id in data/MSD_TRACK_id:

cp /Volumes/datasets/MTG/audio/incoming/millionsong-audio/mp3/D/W/U/$track_ID data/
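
Two of the manual steps above, keeping only the notes of the vocal channel and correcting a systematic delay or advance, can be sketched as follows. This is an illustration under stated assumptions, not the repository's scripts: the Note tuple and the functions notes_on_channel and shift_notes are hypothetical names, and the notes are assumed to have already been read from the matched MIDI into (onset, duration, pitch, channel) records.

```python
from typing import List, NamedTuple


class Note(NamedTuple):
    onset: float     # seconds
    duration: float  # seconds (not validated in this dataset)
    pitch: int       # MIDI note number
    channel: int     # MIDI channel the note was read from


def notes_on_channel(notes: List[Note], channel: int) -> List[Note]:
    """Keep only the notes of the manually identified vocal channel."""
    return [n for n in notes if n.channel == channel]


def shift_notes(notes: List[Note], offset: float) -> List[Note]:
    """Add a constant offset in seconds to every onset.

    A negative offset moves the annotations earlier in time.
    """
    return [n._replace(onset=n.onset + offset) for n in notes]


# Hypothetical example: channel 3 carries the vocal melody, and the
# annotations were measured to be 50 ms late.
notes = [Note(1.00, 0.5, 60, 3), Note(1.50, 0.5, 62, 3), Note(1.00, 0.5, 40, 9)]
vocal = shift_notes(notes_on_channel(notes, channel=3), offset=-0.05)
```

The offset here is the value measured with Sonic Visualiser's measure tool; a single constant is only appropriate when the delay is the same across the whole excerpt.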

Scratch notes and observations on songs so far...

observations for excluded tracks:

  • Nelly Furtado: MSD version has no voice
  • I kissed a girl is in 2/4, all the rest is 4/4
  • viva la vida: not in MSD
  • CLOCKS has almost no voice in MSD
  • rehab has no vocal channel in MIDI

observations for included tracks:

  • smells like has a lot of same-pitch onsets
  • call me: the onsets are mostly on offbeats
  • bangles and sunrise have no percussive instruments

Citation

Georgi Dzhambazov, André Holzapfel, Ajay Srinivasamurthy, Xavier Serra, Metrical-Accent Aware Vocal Onset Detection in Polyphonic Audio, In Proceedings of ISMIR 2017

Contact

georgi (dot) dzhambazov (at) upf (dot) edu

License

The license for the annotations follows the license of the lakh dataset. The audio comes from the MSD and is shared here for the purpose of crowd-annotation. However, we will remove it once we release the dataset.
