
jsalbert / music-genre-classification-with-deep-learning


Using deep learning to predict the genre of a song.

Home Page: https://jsalbert.github.io/Music-Genre-Classification-with-Deep-Learning/

License: MIT License

Python 100.00%
deep-learning gtzan-dataset keras music music-genre-classification

music-genre-classification-with-deep-learning's Introduction

jsalbert

Hi there, I'm Albert 👋

I am a machine learning engineer with extensive experience in deep learning, computer vision, natural language processing, generative models, and audio processing.

📜 My Values

🌟 Open minded and curious

⚡️ Passionate about solving complex problems

Honest and transparent

👨🏻‍💻 Lifelong learner

✍🏻 Latest Blog Posts

📬 Get in touch

music-genre-classification-with-deep-learning's People

Contributors

jsalbert


music-genre-classification-with-deep-learning's Issues

audio_processor.py : ValueError: operands could not be broadcast together with shapes (1,257) (0,)

Hi, I'm trying to run your code but have hit a problem.
I get a ValueError: operands could not be broadcast together with shapes (1,257) (0,)
It looks like a dimension problem, but I can't solve it.

Error

Traceback (most recent call last):
File "G:/Music-Genre-Classification-with-Deep-Learning-master/quick_test.py", line 49, in <module>
X_test, num_frames_test= extract_melgrams(test_songs_list, MULTIFRAMES, process_all_song=False, num_songs_genre='')
File "G:\Music-Genre-Classification-with-Deep-Learning-master\utils.py", line 111, in extract_melgrams
melgram = ap.compute_melgram_multiframe(song_path, process_all_song)
File "G:\Music-Genre-Classification-with-Deep-Learning-master\audio_processor.py", line 99, in compute_melgram_multiframe
n_fft=N_FFT, n_mels=N_MELS)**2,
File "C:\Program Files\Python37\lib\site-packages\librosa\feature\spectral.py", line 1371, in melspectrogram
mel_basis = filters.mel(sr, n_fft, **kwargs)
File "C:\Program Files\Python37\lib\site-packages\librosa\filters.py", line 238, in mel
lower = -ramps[i] / fdiff[i]
ValueError: operands could not be broadcast together with shapes (1,257) (0,)

Please, can anyone help me with this?
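The message itself comes from NumPy's broadcasting rule, not from anything specific to audio. As a minimal sketch (the helper name `broadcast_shape` is hypothetical, and this does not diagnose why librosa ended up with an empty array), the rule can be written out in plain Python to show why shapes (1, 257) and (0,) are incompatible:

```python
def broadcast_shape(a, b):
    """Return the broadcast result of two shapes, or raise ValueError.

    Hypothetical helper that mirrors NumPy's rule: shapes are compared from
    the trailing axis backwards, and two sizes are compatible only if they
    are equal or one of them is 1.
    """
    result = []
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1  # missing leading axes count as 1
        db = b[-i] if i <= len(b) else 1
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(
                f"operands could not be broadcast together with shapes {a} {b}"
            )
    return tuple(reversed(result))


# 257 matches 257, so a (1, 257) row divides cleanly by a (257,) vector.
print(broadcast_shape((1, 257), (257,)))

# 257 against 0 is neither equal nor 1, so this raises ValueError.
try:
    broadcast_shape((1, 257), (0,))
except ValueError as err:
    print(err)
```

The empty (0,) operand is the part worth tracing back: something upstream produced an array with no elements before the division.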

tagger_net.py

Hi, I'm trying to run your code but have hit a problem.

My environment:
tensorflow: 1.8 (I have also tried 1.2 and 0.12)
theano: 0.82 (also tried 0.9)
keras: 1.1

I get a ValueError: total size of new array must be unchanged.
It looks like a dimension problem, but I can't solve it.

Using Theano backend.
/usr/local/lib/python2.7/dist-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Traceback (most recent call last):
File "quick_test.py", line 36, in <module>
model = MusicTaggerCRNN(weights=None, input_tensor=(1, 96, 1366))
File "/home/lg/Downloads/Music-Genre-Classification-with-Deep-Learning-master/tagger_net.py", line 116, in MusicTaggerCRNN
x = Reshape((15, 128))(x)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 514, in call
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 572, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 152, in create_node
output_shapes = to_list(outbound_layer.get_output_shape_for(input_shapes[0]))
File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 304, in get_output_shape_for
return (input_shape[0],) + self._fix_unknown_dimension(input_shape[1:], self.target_shape)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 299, in _fix_unknown_dimension
raise ValueError(msg)
ValueError: total size of new array must be unchanged

Has anybody solved this problem?
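Reshape((15, 128)) can only succeed if the tensor reaching it holds 15 x 128 = 1920 values per sample. The following is a rough sketch of that shape arithmetic in plain Python. The pool sizes, the 37-frame zero padding on the time axis, and the helper names are assumptions for illustration and are not stated in this thread. It compares reading (1, 96, 1366) as channels-first (96 mel bands by 1366 frames) with reading the same tuple as channels-last (height 1, width 96):

```python
from math import prod


def pooled(n, pool):
    """Length after a 'valid' pooling layer whose stride equals its pool size."""
    return max((n - pool) // pool + 1, 0)


def crnn_feature_shape(height, width, pools, time_pad=37, filters=128):
    """Hypothetical helper: shape (filters, height, width) after all pooling stages."""
    width += 2 * time_pad  # assumed zero padding on both sides of the second axis
    for pool_h, pool_w in pools:
        height, width = pooled(height, pool_h), pooled(width, pool_w)
    return (filters, height, width)


POOLS = [(2, 2), (3, 3), (4, 4), (4, 4)]  # assumed pool sizes, not from the thread

channels_first = crnn_feature_shape(96, 1366, POOLS)
channels_last = crnn_feature_shape(1, 96, POOLS)

print(channels_first, prod(channels_first))  # (128, 1, 15) 1920
print(channels_last, prod(channels_last))    # (128, 0, 1) 0
```

Under these assumptions only the channels-first reading yields 1920 values, so one thing worth checking is whether the backend's image dimension ordering matches the (1, 96, 1366) input.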

Citation ...

If I want to cite your work, how should I do it? Has this work been published in a journal?

How should it work on GTZAN dataset?

I filled the music folder with the dataset and wrote the filenames in list_example.txt. When I ran the program, I got this error:
Traceback (most recent call last):
File "quick_test.py", line 47, in <module>
X_test, num_frames_test= extract_melgrams(test_songs_list, MULTIFRAMES, process_all_song=False, num_songs_genre='')
File "/home/lauyick/Workspaces/python/DLR-DQN/utils.py", line 124, in extract_melgrams
melgrams = np.concatenate((melgrams, melgram), axis=0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
It seems that something is wrong with the array sizes passed to NumPy.
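np.concatenate along axis 0 requires every other dimension to match, so each mel-spectrogram must have the same number of frames. One possible cause, offered only as a guess, is clips of different lengths. The sketch below assumes librosa's default centered framing (frames = 1 + samples // hop_length) and uses a 12000 Hz sample rate and a hop length of 256, which are assumptions not stated in this thread:

```python
SAMPLE_RATE = 12000  # assumed value, not stated in the thread
HOP_LENGTH = 256     # assumed value, not stated in the thread


def n_frames(seconds, sr=SAMPLE_RATE, hop=HOP_LENGTH):
    """Frame count for a clip of the given duration, assuming centered framing."""
    n_samples = round(seconds * sr)  # round() avoids float truncation surprises
    return 1 + n_samples // hop


print(n_frames(29.12))  # 1366, the width of the (1, 96, 1366) input
print(n_frames(30.0))   # 1407, which cannot be stacked with 1366-frame arrays
```

If this is the cause, trimming or zero-padding every clip to the same number of samples before computing the spectrogram would make the shapes agree.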

TypeError: ('Keyword argument not understood:', 'mode')


TypeError Traceback (most recent call last)
in
----> 1 model = MusicTaggerCNN(weights=None, input_tensor=(1, 96, 1366))

in MusicTaggerCNN(weights, input_tensor)
61 # Conv block 1
62 x = Convolution2D(32, 3, 3, padding='same', name='conv1', trainable=False)(x)
---> 63 x = BatchNormalization(axis=channel_axis, mode=0, name='bn1', trainable=False)(x)
64 x = ELU()(x)
65 x = MaxPooling2D(pool_size=(2, 4), name='pool1', trainable=False)(x)

~\anaconda3\envs\vikas\lib\site-packages\tensorflow\python\keras\layers\normalization.py in __init__(self, axis, momentum, epsilon, center, scale, beta_initializer, gamma_initializer, moving_mean_initializer, moving_variance_initializer, beta_regularizer, gamma_regularizer, beta_constraint, gamma_constraint, renorm, renorm_clipping, renorm_momentum, fused, trainable, virtual_batch_size, adjustment, name, **kwargs)
182 name=None,
183 **kwargs):
--> 184 super(BatchNormalizationBase, self).__init__(name=name, **kwargs)
185 if isinstance(axis, (list, tuple)):
186 self.axis = axis[:]

~\anaconda3\envs\vikas\lib\site-packages\tensorflow\python\training\tracking\base.py in _method_wrapper(self, *args, **kwargs)
515 self._self_setattr_tracking = False # pylint: disable=protected-access
516 try:
--> 517 result = method(self, *args, **kwargs)
518 finally:
519 self._self_setattr_tracking = previous_value # pylint: disable=protected-access

~\anaconda3\envs\vikas\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in __init__(self, trainable, name, dtype, dynamic, **kwargs)
338 }
339 # Validate optional keyword arguments.
--> 340 generic_utils.validate_kwargs(kwargs, allowed_kwargs)
341
342 # Mutable properties

~\anaconda3\envs\vikas\lib\site-packages\tensorflow\python\keras\utils\generic_utils.py in validate_kwargs(kwargs, allowed_kwargs, error_message)
806 for kwarg in kwargs:
807 if kwarg not in allowed_kwargs:
--> 808 raise TypeError(error_message, kwarg)
809
810

TypeError: ('Keyword argument not understood:', 'mode')
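The Keras 2 API no longer accepts the mode argument on BatchNormalization, and the Convolution2D argument border_mode was renamed to padding; the UserWarning lines quoted in another issue on this page suggest the same replacements. A minimal sketch of that translation follows. The helper `to_keras2_kwargs` is hypothetical, and it is only meant for the two layer types discussed here:

```python
RENAMED = {"border_mode": "padding"}  # Keras 1 name -> Keras 2 name
REMOVED = {"mode"}                    # accepted by Keras 1 BatchNormalization only


def to_keras2_kwargs(kwargs):
    """Hypothetical helper: rename or drop Keras 1 keyword arguments."""
    return {
        RENAMED.get(key, key): value
        for key, value in kwargs.items()
        if key not in REMOVED
    }


bn_kwargs = {"axis": 1, "mode": 0, "name": "bn1", "trainable": False}
print(to_keras2_kwargs(bn_kwargs))
# {'axis': 1, 'name': 'bn1', 'trainable': False}

conv_kwargs = {"border_mode": "same", "name": "conv1", "trainable": False}
print(to_keras2_kwargs(conv_kwargs))
# {'padding': 'same', 'name': 'conv1', 'trainable': False}
```

In practice the simplest change is to delete mode=0 from each BatchNormalization call in tagger_net.py by hand; the helper only spells out which keywords change.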

cannot run "quick_test.py"

I get this error while running quick_test.py:
Using TensorFlow backend.

Traceback (most recent call last):
File "I:\Coding\Music-Genre-Classification-with-Deep-Learning-master\quick_test.py", line 1, in <module>
from keras import backend as K
File "C:\Python27\lib\site-packages\keras\__init__.py", line 2, in <module>
from . import backend
File "C:\Python27\lib\site-packages\keras\backend\__init__.py", line 64, in <module>
from .tensorflow_backend import *
File "C:\Python27\lib\site-packages\keras\backend\tensorflow_backend.py", line 1, in <module>
import tensorflow as tf
import tensorflow as tf
ImportError: No module named tensorflow
But TensorFlow does not support Python 2.7.14.
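The ImportError only says that the interpreter running quick_test.py cannot find a tensorflow package. A small sketch for checking which interpreter is in use and which packages it can see is below. It assumes a Python 3 interpreter (importlib.util.find_spec is not available on Python 2.7), and the helper name `is_installed` is hypothetical:

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the running interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Shows which Python executable is actually running the script.
    print("interpreter:", sys.executable)
    for name in ("tensorflow", "keras"):
        print(name, "found" if is_installed(name) else "missing")
```

If tensorflow is reported missing, the options are to install it for that interpreter or to switch Keras to a backend that is installed.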

ValueError: total size of new array must be unchanged

x = Reshape((15,128))(x)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py", line 468, in call
output_shape = self.compute_output_shape(input_shape)
File "/usr/local/lib/python3.6/dist-packages/keras/layers/core.py", line 399, in compute_output_shape
input_shape[1:], self.target_shape)
File "/usr/local/lib/python3.6/dist-packages/keras/layers/core.py", line 387, in _fix_unknown_dimension
raise ValueError(msg)
ValueError: total size of new array must be unchanged

Issue while running quick_test.py

Using TensorFlow backend.
/home/user/Downloads/Music-Genre-Classification-with-Deep-Learning-master/tagger_net.py:86: UserWarning: Update your Conv2D call to the Keras 2 API: Conv2D(64, (3, 3), padding="same", trainable=False, name="conv1")
x = Convolution2D(64, 3, 3, border_mode='same', name='conv1', trainable=False)(x)
/home/user/Downloads/Music-Genre-Classification-with-Deep-Learning-master/tagger_net.py:87: UserWarning: Update your BatchNormalization call to the Keras 2 API: BatchNormalization(trainable=False, name="bn1", axis=3)
x = BatchNormalization(axis=channel_axis, mode=0, name='bn1', trainable=False)(x)
2018-02-23 10:25:43.298633: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Traceback (most recent call last):
File "quick_test.py", line 36, in <module>
model = MusicTaggerCRNN(weights=None, input_tensor=(1, 96, 1366))
File "/home/user/Downloads/Music-Genre-Classification-with-Deep-Learning-master/tagger_net.py", line 89, in MusicTaggerCRNN
x = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='pool1', trainable=False)(x)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 617, in call
output = self.call(inputs, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/pooling.py", line 158, in call
data_format=self.data_format)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/pooling.py", line 221, in _pooling_function
pool_mode='max')
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 3654, in pool2d
data_format=tf_data_format)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 2043, in max_pool
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 3018, in _max_pool
data_format=data_format, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3162, in create_op
compute_device=compute_device)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3208, in _create_op_helper
set_shapes_for_outputs(op)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2427, in set_shapes_for_outputs
return _set_shapes_for_outputs(op)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2400, in _set_shapes_for_outputs
shapes = shape_func(op)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2330, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
require_shape_fn)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Negative dimension size caused by subtracting 2 from 1 for 'pool1/MaxPool' (op: 'MaxPool') with input shapes: [?,1,170,64].
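The input shape in the final message, [?,1,170,64], suggests that TensorFlow read (1, 96, 1366) as channels-last, with a first spatial axis of length 1, which a 2x2 pooling window cannot cover. The BatchNormalization warning above reports axis=3, which points the same way. As a hedged sketch with hypothetical helper names, the check and one possible reordering of the shape look like this:

```python
def window_fits(length, pool):
    """True if a 'valid' pooling window of size `pool` fits on an axis of `length`."""
    return length - pool >= 0


def to_channels_last(shape):
    """Reorder a (channels, height, width) shape to (height, width, channels)."""
    channels, height, width = shape
    return (height, width, channels)


declared = (1, 96, 1366)  # shape passed to MusicTaggerCRNN in the traceback

# Read as channels-last, the first spatial axis has length 1: 1 - 2 is negative.
print(window_fits(declared[0], 2))                    # False

# Reordered to (96, 1366, 1), the first spatial axis has length 96.
print(window_fits(to_channels_last(declared)[0], 2))  # True
```

The other option is to keep the (1, 96, 1366) shape and configure Keras for channels-first ordering, for example with keras.backend.set_image_data_format('channels_first') in Keras 2. Which of the two suits this repository's pretrained weights is not confirmed by the thread.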
