RNN_MIDI_Composer

Train an LSTM on Indonesian folk songs in MIDI format to compose new MIDI music.

Dependencies

  • numpy
  • pandas
  • pytorch==0.4.1
  • plac
  • tqdm

How to prepare your data

Convert your MIDI files to .csv using Midicsv[1] and put them in a folder (the dataset folder by default). It is recommended to remove channels that contain repetitive music (usually background parts such as drums and snare) so that the RNN does not produce uninteresting, repetitive sound. This cleaning has already been applied to the Indonesian Folk Song Dataset.
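
The channel clean-up described above could be sketched as follows. This is a minimal illustration, not the project's own cleaning script: drop_channels is a hypothetical helper, the sample rows are made up, and it assumes the usual Midicsv layout where channel events have a type ending in _c and carry the channel number in the fourth field. Channel 9 is used only as an example (it is the conventional General MIDI percussion channel when channels are counted from 0).

```python
def drop_channels(csv_text, channels):
    """Return Midicsv text without channel events on the given channels.

    Hypothetical helper; assumes rows look like
    "Track, Time, Type, Channel, ..." for channel events (types ending in "_c").
    """
    kept = []
    for line in csv_text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        is_channel_event = len(fields) > 3 and fields[2].endswith("_c")
        if is_channel_event and fields[3].isdigit() and int(fields[3]) in channels:
            continue  # skip events that belong to an unwanted channel
        kept.append(line)
    return "\n".join(kept)


# Made-up sample in Midicsv style: one melodic channel (0) and one drum channel (9).
sample = "\n".join([
    "0, 0, Header, 1, 1, 480",
    "1, 0, Start_track",
    "1, 0, Note_on_c, 0, 60, 81",
    "1, 0, Note_on_c, 9, 36, 100",
    "1, 480, Note_off_c, 0, 60, 0",
    "1, 480, Note_off_c, 9, 36, 0",
    "1, 480, End_track",
    "0, 0, End_of_file",
])

cleaned = drop_channels(sample, channels={9})
print(cleaned)
```

Header, track, and end-of-file rows pass through unchanged, so the result can still be converted back to MIDI.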

How to train model

Training is run from a command-line interface (CLI). Check the CLI help for the available options:

python train.py -h

You can also use the default values by simply running

python train.py

You can visualize the model performance using the Music Composer.ipynb notebook while training.
Note: The program keeps running until you interrupt it with Ctrl + C.

Parameters in Training configuration

  • n_hidden / -nh
    Number of hidden units.
  • n_layers / -nl
    Number of hidden layers.
  • bs / -bs
    Batch size.
  • seq_len / -sl
    Length of the input sequence.
  • lr / -lr
    Learning rate.
  • d_out / -do
    Dropout rate.
  • save_every / -se
    Number of steps between model saves.
  • print_every / -pe
    Number of steps between printouts of the training information (loss, etc.).
  • name / -o
    Folder name for the model. The folder is created if it does not exist.
  • midi_source_folder / -i
    Folder name for the data. It must contain .csv files in Midicsv[1] format.
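
As a rough illustration of how the parameters above map to a command line, the sketch below assembles a train.py invocation from keyword arguments. build_train_command is a hypothetical helper, not part of the repository; the flag names come from the list above, the values are arbitrary examples rather than recommended settings, and it assumes each flag takes its value as the next argument.

```python
# Mapping from parameter name to its short flag, as listed above.
FLAGS = {
    "n_hidden": "-nh",
    "n_layers": "-nl",
    "bs": "-bs",
    "seq_len": "-sl",
    "lr": "-lr",
    "d_out": "-do",
    "save_every": "-se",
    "print_every": "-pe",
    "name": "-o",
    "midi_source_folder": "-i",
}


def build_train_command(**overrides):
    """Build the argument list for train.py (hypothetical helper)."""
    cmd = ["python", "train.py"]
    for param, value in overrides.items():
        if param not in FLAGS:
            raise KeyError(f"unknown training parameter: {param}")
        cmd += [FLAGS[param], str(value)]
    return cmd


# Example values only; omitted parameters keep the defaults of train.py.
cmd = build_train_command(n_hidden=256, d_out=0.3, name="my_model")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run, or the printed line can be pasted into a shell.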

How to compose music

Use the Music Composer.ipynb notebook. Load the model, then set your desired configuration.

I have prepared some generated music in the sample folder. Use Midicsv[1] to convert it back to a MIDI file; you can then open it with any common MIDI player, or try MidiEditor[2].
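
The conversion back to MIDI can be scripted. The sketch below only builds the command; it assumes the csvmidi program that ships with the Midicsv package is on your PATH, and both csvmidi_command and the file name song.csv are hypothetical.

```python
from pathlib import Path


def csvmidi_command(csv_path):
    """Build a csvmidi call that writes a .mid file next to the .csv file."""
    csv_path = Path(csv_path)
    mid_path = csv_path.with_suffix(".mid")
    return ["csvmidi", str(csv_path), str(mid_path)]


cmd = csvmidi_command("song.csv")  # hypothetical file name
print(" ".join(cmd))

# To actually run the conversion (requires csvmidi to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```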

Parameters in Composing configuration

  • fname
    File name for the generated music (.csv). You need to convert it back to .mid using Midicsv[1].
  • prime
    Priming text: the initial characters that the RNN continues from when composing.
  • top_k
    Sample the next character at random from the k most probable predictions. top_k = 1 means the most probable character is always used. A higher top_k produces more creative music (relative to the dataset). I recommend around 3-5. If top_k is too large, the output may not follow the format required to convert it back to .mid.
  • compose_len
    Number of characters to compose. One music note takes 8-14 characters.
  • channel
    The MIDI channels and track numbers. For example, [0, 1, 2] means three channels, on tracks 0, 1 and 2 respectively.
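
To make the effect of top_k concrete, here is a minimal sketch of top-k sampling in plain Python. It is not the notebook's implementation: sample_top_k and the probability values are made up for illustration, and a real model would supply the probabilities for the next character.

```python
import random


def sample_top_k(probs, top_k, rng=random):
    """Pick an index at random among the top_k most probable entries.

    The kept probabilities are used as weights, so more likely
    characters are still chosen more often.
    """
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    candidates = ranked[:top_k]
    weights = [probs[i] for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Made-up distribution over four characters (indices 0..3).
probs = [0.05, 0.60, 0.25, 0.10]

greedy = sample_top_k(probs, top_k=1)  # always the most probable index

rng = random.Random(0)  # seeded so the example is repeatable
draws = {sample_top_k(probs, top_k=3, rng=rng) for _ in range(200)}
print(greedy, sorted(draws))
```

With top_k = 3, index 0 (the least probable character) can never be drawn, which is how a small top_k keeps the output close to the expected format.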

Troubleshooting

  • If "Retry music composing..." keeps popping up
    This happens when the model output does not follow the expected format. For example, we want C5-512-1024, but the model generated C5--512-1024. By analogy with a char-RNN generating paragraphs, it is like a typo.
    You can try using fewer channels, decreasing top_k, decreasing compose_len, training longer, or getting more data. A lower top_k helps because the model then sticks to the proper data format instead of picking unlikely characters. Longer training and more data help the model learn the format properly. A lower compose_len does not fix the cause; it only gives the model fewer chances to make a mistake. Using fewer channels is a must: the more you try to generate, the more likely the model is to break the format.
  • If the model replicates music from the dataset
    This is overfitting. You can try reducing the model complexity (lower n_hidden, n_layers, seq_len), choosing a checkpoint from an earlier epoch (one with a higher loss), or increasing d_out.
  • If the generated music sounds like gibberish
    Your data may be too complex. Try a more homogeneous dataset.
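
The kind of format check that triggers a retry can be sketched with a regular expression. This is an assumption based only on the C5-512-1024 example above: the pattern (a note name, an octave number, then two integer fields separated by single dashes) and the helper is_well_formed are hypothetical, and the project's real parser may accept other forms.

```python
import re

# Assumed token shape, inferred from the single example "C5-512-1024".
NOTE_TOKEN = re.compile(r"[A-G][#b]?\d+-\d+-\d+")


def is_well_formed(token):
    """Return True if the whole token matches the assumed note format."""
    return NOTE_TOKEN.fullmatch(token) is not None


print(is_well_formed("C5-512-1024"))   # well-formed token
print(is_well_formed("C5--512-1024"))  # doubled dash, the "typo" case
```

A composer loop could regenerate the output whenever any token fails such a check.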

Sample Result

Have a listen:

Here is the loss history:

Note: You do not have to push the loss to its minimum to generate good music.

References

This project would not have succeeded without these references. Thank you!
