jsalbert / music-genre-classification-with-deep-learning

Using deep learning to predict the genre of a song.

Home Page: https://jsalbert.github.io/Music-Genre-Classification-with-Deep-Learning/

License: MIT License

Python 100.00%

deep-learning gtzan-dataset keras music music-genre-classification

music-genre-classification-with-deep-learning's Introduction

Hi there, I'm Albert 👋

I am a machine learning engineer with extensive experience in deep learning, computer vision, natural language processing, generative models, and audio processing.

📜 My Values

🌟 Open minded and curious

⚡️ Passionate about solving complex problems

🍏 Honest and transparent

👨🏻‍💻 Lifelong learner

music-genre-classification-with-deep-learning's Issues

audio_processor.py : ValueError: operands could not be broadcast together with shapes (1,257) (0,)

Hi, I'm trying to run your code but have run into a problem.
The problem is a ValueError: operands could not be broadcast together with shapes (1,257) (0,)
It looks like a dimension mismatch, but I can't solve it.

Error

Traceback (most recent call last):
  File "G:/Music-Genre-Classification-with-Deep-Learning-master/quick_test.py", line 49, in <module>
    X_test, num_frames_test= extract_melgrams(test_songs_list, MULTIFRAMES, process_all_song=False, num_songs_genre='')
  File "G:\Music-Genre-Classification-with-Deep-Learning-master\utils.py", line 111, in extract_melgrams
    melgram = ap.compute_melgram_multiframe(song_path, process_all_song)
  File "G:\Music-Genre-Classification-with-Deep-Learning-master\audio_processor.py", line 99, in compute_melgram_multiframe
    n_fft=N_FFT, n_mels=N_MELS)**2,
  File "C:\Program Files\Python37\lib\site-packages\librosa\feature\spectral.py", line 1371, in melspectrogram
    mel_basis = filters.mel(sr, n_fft, **kwargs)
  File "C:\Program Files\Python37\lib\site-packages\librosa\filters.py", line 238, in mel
    lower = -ramps[i] / fdiff[i]
ValueError: operands could not be broadcast together with shapes (1,257) (0,)

Can anyone please help me with this?
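
For readers hitting the same message: the shapes in the error say that one operand has 257 columns and the other is empty, with shape (0,). In the traceback the empty operand is fdiff[i] inside librosa's filters.mel; the report does not establish why it is empty. The sketch below only illustrates the broadcasting rule that produces this message. It is pure Python, and broadcast_shape is a hypothetical helper written for this illustration, not a NumPy or librosa function.

```python
def broadcast_shape(a, b):
    """Return the broadcast result of two shapes, following NumPy's rule.

    Shapes are compared right-aligned; two dimensions are compatible
    when they are equal or when one of them is 1.
    """
    n = max(len(a), len(b))
    # Pad the shorter shape with leading 1s so both have the same rank.
    pa = (1,) * (n - len(a)) + tuple(a)
    pb = (1,) * (n - len(b)) + tuple(b)
    result = []
    for da, db in zip(pa, pb):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(
                "operands could not be broadcast together with shapes "
                f"{tuple(a)} {tuple(b)}"
            )
    return tuple(result)


# A row of 257 values combines with a flat vector of 257 values...
print(broadcast_shape((1, 257), (257,)))  # (1, 257)

# ...but not with an empty vector, which is the case in the report.
try:
    broadcast_shape((1, 257), (0,))
except ValueError as err:
    print("ValueError:", err)
```

If this reading is right, the useful next step would be to find out which input to the mel filter computation ends up empty, rather than to reshape anything.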

tagger_net.py

Hi, I'm trying to run your code but have run into a problem.

My environment:
tensorflow: 1.8 (I have also tried 1.2 and 0.12)
theano: 0.82 (also tried 0.9)
keras: 1.1

The problem is a ValueError: total size of new array must be unchanged.
It looks like a dimension mismatch, but I can't solve it.

Using Theano backend.
/usr/local/lib/python2.7/dist-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "quick_test.py", line 36, in <module>
    model = MusicTaggerCRNN(weights=None, input_tensor=(1, 96, 1366))
  File "/home/lg/Downloads/Music-Genre-Classification-with-Deep-Learning-master/tagger_net.py", line 116, in MusicTaggerCRNN
    x = Reshape((15, 128))(x)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 514, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 572, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 152, in create_node
    output_shapes = to_list(outbound_layer.get_output_shape_for(input_shapes[0]))
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 304, in get_output_shape_for
    return (input_shape[0],) + self._fix_unknown_dimension(input_shape[1:], self.target_shape)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 299, in _fix_unknown_dimension
    raise ValueError(msg)
ValueError: total size of new array must be unchanged

Has anybody solved this problem?
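
A note on what the message means: Reshape((15, 128)) can only succeed when the tensor entering it holds exactly 15 * 128 = 1920 values per sample. The sketch below is a minimal size check in pure Python. The target shape comes from the traceback; the two input shapes are illustrative assumptions, since the report does not show the actual shape reaching the layer, and num_elements and can_reshape are hypothetical helpers.

```python
from functools import reduce
from operator import mul


def num_elements(shape):
    """Number of values in an array of the given shape."""
    return reduce(mul, shape, 1)


def can_reshape(input_shape, target_shape):
    """True when both shapes hold the same number of values."""
    return num_elements(input_shape) == num_elements(target_shape)


TARGET = (15, 128)  # from the failing Reshape call in tagger_net.py

print(num_elements(TARGET))                # 1920
print(can_reshape((128, 1, 15), TARGET))   # True  (illustrative shape)
print(can_reshape((128, 1, 14), TARGET))   # False (illustrative shape)
```

Printing the shape of x just before the Reshape call and comparing its element count with 1920 would show how far the network's intermediate shape is from what the layer expects.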

Citation ...

If I cite your work, how should I do it? Has this work been published in a journal?

How should it work on the GTZAN dataset?

I filled the music folder with the dataset and listed the filenames in list_example.txt. When I ran the program, I got the following error:
Traceback (most recent call last):
  File "quick_test.py", line 47, in <module>
    X_test, num_frames_test= extract_melgrams(test_songs_list, MULTIFRAMES, process_all_song=False, num_songs_genre='')
  File "/home/lauyick/Workspaces/python/DLR-DQN/utils.py", line 124, in extract_melgrams
    melgrams = np.concatenate((melgrams, melgram), axis=0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
It seems that something is wrong with the array sizes passed to NumPy.
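
The rule behind this message is that arrays joined along axis 0 must agree on every other axis. The sketch below checks a list of shapes against the first one, in pure Python. The shapes are hypothetical examples; the report does not say which shapes melgrams and melgram actually had, and concat_mismatches is a helper invented for this illustration.

```python
def concat_mismatches(shapes, axis=0):
    """Indices of shapes that cannot be concatenated with shapes[0] along `axis`."""
    ref = shapes[0]
    bad = []
    for i, shape in enumerate(shapes[1:], start=1):
        same_rank = len(shape) == len(ref)
        # Every axis except the concatenation axis must match exactly.
        same_dims = same_rank and all(
            a == b for k, (a, b) in enumerate(zip(ref, shape)) if k != axis
        )
        if not same_dims:
            bad.append(i)
    return bad


# Hypothetical per-song shapes: the third differs on the last axis.
shapes = [(1, 1, 96, 1366), (2, 1, 96, 1366), (1, 1, 96, 1200)]
print(concat_mismatches(shapes))  # [2]
```

Logging the shape of each per-song array before the concatenation in extract_melgrams would identify which file produces the odd one out.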

TypeError: ('Keyword argument not understood:', 'mode')

TypeError Traceback (most recent call last)
in
----> 1 model = MusicTaggerCNN(weights=None, input_tensor=(1, 96, 1366))

in MusicTaggerCNN(weights, input_tensor)
61 # Conv block 1
62 x = Convolution2D(32, 3, 3, padding='same', name='conv1', trainable=False)(x)
---> 63 x = BatchNormalization(axis=channel_axis, mode=0, name='bn1', trainable=False)(x)
64 x = ELU()(x)
65 x = MaxPooling2D(pool_size=(2, 4), name='pool1', trainable=False)(x)

~\anaconda3\envs\vikas\lib\site-packages\tensorflow\python\keras\layers\normalization.py in __init__(self, axis, momentum, epsilon, center, scale, beta_initializer, gamma_initializer, moving_mean_initializer, moving_variance_initializer, beta_regularizer, gamma_regularizer, beta_constraint, gamma_constraint, renorm, renorm_clipping, renorm_momentum, fused, trainable, virtual_batch_size, adjustment, name, **kwargs)
182 name=None,
183 **kwargs):
--> 184 super(BatchNormalizationBase, self).__init__(name=name, **kwargs)
185 if isinstance(axis, (list, tuple)):
186 self.axis = axis[:]

~\anaconda3\envs\vikas\lib\site-packages\tensorflow\python\training\tracking\base.py in _method_wrapper(self, *args, **kwargs)
515 self._self_setattr_tracking = False # pylint: disable=protected-access
516 try:
--> 517 result = method(self, *args, **kwargs)
518 finally:
519 self._self_setattr_tracking = previous_value # pylint: disable=protected-access

~\anaconda3\envs\vikas\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in __init__(self, trainable, name, dtype, dynamic, **kwargs)
338 }
339 # Validate optional keyword arguments.
--> 340 generic_utils.validate_kwargs(kwargs, allowed_kwargs)
341
342 # Mutable properties

~\anaconda3\envs\vikas\lib\site-packages\tensorflow\python\keras\utils\generic_utils.py in validate_kwargs(kwargs, allowed_kwargs, error_message)
806 for kwarg in kwargs:
807 if kwarg not in allowed_kwargs:
--> 808 raise TypeError(error_message, kwarg)
809
810

TypeError: ('Keyword argument not understood:', 'mode')
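
The constructor signature printed in the traceback has no mode parameter, so the mode=0 argument falls through to **kwargs and is rejected there. One possible workaround, sketched below, is to drop that keyword before the layer is built. The helper to_keras2_batchnorm_kwargs is hypothetical, and axis=1 is an illustrative value standing in for channel_axis; whether removing mode leaves the pretrained model's behaviour unchanged has not been verified here.

```python
def to_keras2_batchnorm_kwargs(kwargs):
    """Return a copy of kwargs without the legacy `mode` argument."""
    return {key: value for key, value in kwargs.items() if key != "mode"}


# Keyword arguments as written in tagger_net.py (axis value is illustrative).
legacy = {"axis": 1, "mode": 0, "name": "bn1", "trainable": False}

print(to_keras2_batchnorm_kwargs(legacy))
# {'axis': 1, 'name': 'bn1', 'trainable': False}
```

In practice the same effect comes from deleting mode=0 from each BatchNormalization call in tagger_net.py.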

cannot run "quick_test.py"

I get the following error while running quick_test.py:
Using TensorFlow backend.

Traceback (most recent call last):
  File "I:\Coding\Music-Genre-Classification-with-Deep-Learning-master\quick_test.py", line 1, in <module>
    from keras import backend as K
  File "C:\Python27\lib\site-packages\keras\__init__.py", line 2, in <module>
    from . import backend
  File "C:\Python27\lib\site-packages\keras\backend\__init__.py", line 64, in <module>
    from .tensorflow_backend import *
  File "C:\Python27\lib\site-packages\keras\backend\tensorflow_backend.py", line 1, in <module>
    import tensorflow as tf
ImportError: No module named tensorflow
However, TensorFlow does not support Python 2.7.14.
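
Before debugging the model code, it can help to confirm which backends the interpreter can actually import. The sketch below is one way to do that, assuming a Python 3 interpreter (importlib.util.find_spec is not available on Python 2.7, which is what the report uses). The helper names and the list of modules checked are choices made for this illustration.

```python
import importlib.util
import sys


def module_available(name):
    """True when the top-level module `name` can be found by the import system."""
    return importlib.util.find_spec(name) is not None


def environment_report(modules=("tensorflow", "theano", "keras")):
    """Python version plus an availability flag for each module."""
    report = {"python": "%d.%d.%d" % sys.version_info[:3]}
    for name in modules:
        report[name] = module_available(name)
    return report


if __name__ == "__main__":
    for key, value in sorted(environment_report().items()):
        print(key, value)
```

If tensorflow is reported as unavailable, Keras can only run with a backend that is installed, such as Theano in the other reports on this page.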

ValueError: total size of new array must be unchanged

    x = Reshape((15,128))(x)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py", line 468, in __call__
    output_shape = self.compute_output_shape(input_shape)
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/core.py", line 399, in compute_output_shape
    input_shape[1:], self.target_shape)
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/core.py", line 387, in _fix_unknown_dimension
    raise ValueError(msg)
ValueError: total size of new array must be unchanged

Issue while running quick_test.py

Using TensorFlow backend.
/home/user/Downloads/Music-Genre-Classification-with-Deep-Learning-master/tagger_net.py:86: UserWarning: Update your Conv2D call to the Keras 2 API: Conv2D(64, (3, 3), padding="same", trainable=False, name="conv1")
x = Convolution2D(64, 3, 3, border_mode='same', name='conv1', trainable=False)(x)
/home/user/Downloads/Music-Genre-Classification-with-Deep-Learning-master/tagger_net.py:87: UserWarning: Update your BatchNormalization call to the Keras 2 API: BatchNormalization(trainable=False, name="bn1", axis=3)
x = BatchNormalization(axis=channel_axis, mode=0, name='bn1', trainable=False)(x)
2018-02-23 10:25:43.298633: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Traceback (most recent call last):
  File "quick_test.py", line 36, in <module>
    model = MusicTaggerCRNN(weights=None, input_tensor=(1, 96, 1366))
  File "/home/user/Downloads/Music-Genre-Classification-with-Deep-Learning-master/tagger_net.py", line 89, in MusicTaggerCRNN
    x = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='pool1', trainable=False)(x)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 617, in __call__
    output = self.call(inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/pooling.py", line 158, in call
    data_format=self.data_format)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/pooling.py", line 221, in _pooling_function
    pool_mode='max')
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 3654, in pool2d
    data_format=tf_data_format)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 2043, in max_pool
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 3018, in _max_pool
    data_format=data_format, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3162, in create_op
    compute_device=compute_device)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3208, in _create_op_helper
    set_shapes_for_outputs(op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2427, in set_shapes_for_outputs
    return _set_shapes_for_outputs(op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2400, in _set_shapes_for_outputs
    shapes = shape_func(op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2330, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
    require_shape_fn)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Negative dimension size caused by subtracting 2 from 1 for 'pool1/MaxPool' (op: 'MaxPool') with input shapes: [?,1,170,64].
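
One reading of the shape [?,1,170,64] in this error: the input (1, 96, 1366) is written in channels-first order (1 channel, 96 mel bands, 1366 frames), but the TensorFlow backend is treating it as channels-last (height 1, width 96, 1366 channels). Pooling with size 2 over a height of 1 then fails. The sketch below works through that arithmetic in pure Python. The width padding of 37 per side is an assumption inferred from 170 - 96 = 74; it is not shown in the report, and all helper names are hypothetical.

```python
def spatial_dims(input_shape, data_format):
    """Split a 3-D input shape into (height, width, channels)."""
    if data_format == "channels_first":
        channels, height, width = input_shape
    elif data_format == "channels_last":
        height, width, channels = input_shape
    else:
        raise ValueError("unknown data_format: %r" % (data_format,))
    return height, width, channels


def pad_width(width, pad):
    """Width after zero-padding `pad` columns on each side."""
    return width + 2 * pad


def pool_output(size, pool, stride):
    """Output length of unpadded pooling; 0 or less means the op is invalid."""
    return (size - pool) // stride + 1


INPUT_SHAPE = (1, 96, 1366)  # from the MusicTaggerCRNN call in quick_test.py
PAD = 37                     # assumed, inferred from 170 - 96 = 74

for fmt in ("channels_first", "channels_last"):
    height, width, _ = spatial_dims(INPUT_SHAPE, fmt)
    width = pad_width(width, PAD)
    print(fmt, (height, width), pool_output(height, 2, 2), pool_output(width, 2, 2))
# channels_first (96, 1440) 48 720
# channels_last (1, 170) 0 85
```

If this is the cause, two things may be worth trying, neither verified against this repository: switching Keras to channels-first ordering (image_dim_ordering set to "th" in Keras 1, image_data_format set to "channels_first" in Keras 2), or passing the input in channels-last order as (96, 1366, 1).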