
baithoven's Introduction

bAIthoven

An AI Music Generator

Inspiration

Music production is a field few people would expect to be revolutionized by Artificial Intelligence, yet its future is very exciting: companies such as OpenAI and AIVA are currently developing AI music generation software. With tools like these, people with no knowledge of music theory can start a career as an artist by generating their own original music simply by experimenting with the software. Not only that, but even artists from the past are being reborn. OpenAI in particular has generated new pieces in the style of deceased artists by running MIDI files of their old songs through a neural network. This shows just how much potential this sort of technology has.

Having said that, our interest was piqued, and we wanted to delve into this niche ourselves. This was our first endeavor into what we believe will be a highly profitable industry in the near future: AI-generated music services and platforms.

What it does

bAIthoven generates completely original piano pieces in the style of Beethoven. The LSTM model it implements was trained on 29 of Beethoven's most famous pieces.

How we built it

The first step we needed to take prior to training our model was to pre-process the data so it could be accepted by our Recurrent (LSTM) Neural Network. As mentioned earlier, the raw data were piano tracks in the form of MIDI files. Using the Music21 API we split the data into two types: Notes and Chords. Notes consisted of pitch, octave, and offset, while chords were containers for sets of notes played at the same time.

On a basic level, music generation works by predicting which note or chord will be played next, so our prediction model had to contain a sample of every note or chord from all of our training data. To do this we put all the notes and chords into a sequential list, from which we created the sequences that served as the input of our network. Lastly, we used a mapping function to map the string-based categorical data to integer-based numerical data, because neural networks perform much better with numerical data than with string-based categories.
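As a rough illustration of this preprocessing step, here is a minimal sketch using the Music21 API. The folder name, the sequence length of 100, and the use of `normalOrder` to represent chords are illustrative assumptions and are not specified above.

```python
import glob
from music21 import converter, note, chord

# Parse every MIDI file and flatten it into a single sequential list of note/chord strings
notes = []
for path in glob.glob("midi_songs/*.mid"):   # hypothetical folder of Beethoven MIDI files
    midi = converter.parse(path)
    for element in midi.flat.notes:
        if isinstance(element, note.Note):
            notes.append(str(element.pitch))                          # e.g. "E4"
        elif isinstance(element, chord.Chord):
            # a chord becomes a dot-joined list of its pitch classes, e.g. "4.7.11"
            notes.append('.'.join(str(n) for n in element.normalOrder))

# Map string-based categorical data to integer-based numerical data
pitchnames = sorted(set(notes))
note_to_int = {p: i for i, p in enumerate(pitchnames)}
n_vocab = len(pitchnames)

# Build fixed-length input sequences and their corresponding outputs
sequence_length = 100                        # assumed window size
network_input, network_output = [], []
for i in range(len(notes) - sequence_length):
    seq_in = notes[i:i + sequence_length]
    seq_out = notes[i + sequence_length]
    network_input.append([note_to_int[s] for s in seq_in])
    network_output.append(note_to_int[seq_out])
```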

Next, we created the training model itself, which consisted of four different types of layers:

  • LSTM layers, which take a sequence as input and return either a sequence or a matrix.
  • Dropout layers, a regularisation technique that sets a fraction of the input units to 0 at each update during training to prevent overfitting (where the model fits its training data too closely and fails to generalise).
  • Dense layers, fully connected layers in which each input node is connected to each output node.
  • An Activation layer, which determines the activation function the network uses to calculate the output of a node.

Our training model was a network consisting of three LSTM layers, three Dropout layers, two Dense layers and one Activation layer. As with any neural network, we had to calculate the loss for each iteration of training. We used categorical cross-entropy, and to optimise the network we used an RMSprop optimizer, which is usually a very good choice for recurrent neural networks.
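A minimal Keras sketch of such a stack is shown below. The layer widths (512 and 256) and the dropout rate (0.3) are illustrative assumptions, since only the layer counts are given above.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense, Activation

def create_network(network_input, n_vocab):
    """Three LSTM layers, three Dropout layers, two Dense layers, one Activation layer."""
    model = Sequential()
    model.add(LSTM(512, input_shape=(network_input.shape[1], network_input.shape[2]),
                   return_sequences=True))
    model.add(Dropout(0.3))
    model.add(LSTM(512, return_sequences=True))
    model.add(Dropout(0.3))
    model.add(LSTM(512))
    model.add(Dropout(0.3))
    model.add(Dense(256))
    model.add(Dense(n_vocab))
    model.add(Activation('softmax'))
    # categorical cross-entropy loss with an RMSprop optimizer, as described above
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
    return model
```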

The model.fit() function in Keras was used to train the network. The first parameter was the list of input sequences that we prepared and the second was a list of their respective outputs. We trained the network for 200 epochs (full passes over the training data), with each batch propagated through the network containing 128 samples. Training took a total of 6 hours.
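Concretely, a training call with those parameters might look like the following sketch; the reshaping of the input to (samples, timesteps, features) and the one-hot encoding of the targets are assumptions based on the preprocessing described above.

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Reshape the input sequences, normalise them, and one-hot encode the targets
n_patterns = len(network_input)
X = np.reshape(network_input, (n_patterns, sequence_length, 1)) / float(n_vocab)
y = to_categorical(network_output, num_classes=n_vocab)

model = create_network(X, n_vocab)
model.fit(X, y, epochs=200, batch_size=128)   # 200 epochs, 128 samples per batch
```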

We chose to generate 500 notes with the network, which is roughly two minutes of music and gives the network plenty of space to create a melody. For each note generated we had to submit a sequence to the network. The first sequence we submitted was the sequence of notes at the starting index. For every subsequent input sequence, we removed the first note and appended the output of the previous iteration to the end. To determine the most likely prediction from the network's output, we extracted the index of the highest value. Afterwards, we collected all the outputs from the network into a single array of Note and Chord objects.
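The generation loop described above can be sketched roughly as follows. The random choice of starting index and the names `int_to_note` and `prediction_output` are our own for illustration and are not taken from the original code.

```python
import numpy as np

# Invert the note-to-integer mapping so predictions can be turned back into note strings
int_to_note = {i: p for p, i in note_to_int.items()}

# Seed the loop with one of the prepared input sequences (the "starting index")
start = np.random.randint(0, len(network_input) - 1)
pattern = list(network_input[start])
prediction_output = []

for _ in range(500):                                   # ~2 minutes of music
    x = np.reshape(pattern, (1, len(pattern), 1)) / float(n_vocab)
    prediction = model.predict(x, verbose=0)
    index = int(np.argmax(prediction))                 # index of the highest output value
    prediction_output.append(int_to_note[index])
    # Slide the window: append the new prediction and drop the first note
    pattern.append(index)
    pattern = pattern[1:]
```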

Challenges we ran into

Other than time constraints, our main challenge was figuring out which model was best to use. While companies such as OpenAI, as well as some other open-source projects on the internet, use Markov-chain models for AI-generated music, we ended up choosing an LSTM because it was the most beginner-friendly to integrate. Doing this research before writing any code was a challenge in itself.

Accomplishments that we're proud of

This was our first endeavor into prediction-based algorithms. While we have worked with simple feed-forward neural networks before, this was the first time we used a model that predicts and creates output as it goes, rather than a model that is trained once and does not expand its knowledge past that point.

What we learned

We learned a lot about Neural Networks and about how AI is trained for music in general. This was a great experience and an immersive introduction to this area of Machine Learning.

What's next for bAIthoven: An AI Music Generator

In the future, we plan to train for more epochs to produce more sophisticated music, and potentially to turn this into a service or application that anyone can use, regardless of their programming knowledge.


baithoven's Issues

Error occurred during execution of rnn.py

I got an error when trying to run the rnn.py file:

```
ValueError                                Traceback (most recent call last)
in
      1 if __name__ == '__main__':
----> 2     train_network()

in train_network()
      5 # get amount of pitch names
      6 n_vocab = len(set(notes))
----> 7 network_input, network_output = prepare_sequences(notes, n_vocab)
      8 model = create_network(network_input, n_vocab)
      9 train(model, network_input, network_output)

in prepare_sequences(notes, n_vocab)
     22 # normalize input
     23 network_input = network_input / float(n_vocab)
---> 24 network_output = np_utils.to_categorical(network_output)
     25 return (network_input, network_output)

~/python-envs/tfenv/lib/python3.8/site-packages/tensorflow/python/keras/utils/np_utils.py in to_categorical(y, num_classes, dtype)
     73   y = y.ravel()
     74   if not num_classes:
---> 75     num_classes = np.max(y) + 1
     76   n = y.shape[0]
     77   categorical = np.zeros((n, num_classes), dtype=dtype)

<__array_function__ internals> in amax(*args, **kwargs)

~/python-envs/tfenv/lib/python3.8/site-packages/numpy/core/fromnumeric.py in amax(a, axis, out, keepdims, initial, where)
   2665     5
   2666     """
-> 2667     return _wrapreduction(a, np.maximum, 'max', axis, None, out,
   2668                           keepdims=keepdims, initial=initial, where=where)
   2669

~/python-envs/tfenv/lib/python3.8/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
     88         return reduction(axis=axis, out=out, **passkwargs)
     89
---> 90     return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
     91
     92

ValueError: zero-size array to reduction operation maximum which has no identity
```

How do I get past this error?
