music-genre-recognition-gtzan's Introduction

Music genre recognition on the GTZAN dataset

Understanding the GTZAN dataset

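  • The dataset consists of audio files labeled by genre. The classification reports below cover 10 genres: blues, classical, country, disco, hiphop, jazz, metal, pop, reggae and rock.
  • Each genre has 100 samples.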
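The per-genre sample count can be checked directly against one of the feature CSV files. The following is a minimal sketch using only the standard library; the column name `label` and the file path in the usage comment are assumptions, not something this README states.

```python
import csv
from collections import Counter


def genre_counts(lines, label_col="label"):
    """Count rows per genre.

    `lines` is any iterable of CSV text lines (an open file works).
    `label_col` is an assumed column name for the genre label.
    """
    reader = csv.DictReader(lines)
    return Counter(row[label_col] for row in reader)


# Usage (hypothetical path):
# with open("features_30_sec.csv", newline="") as f:
#     for genre, count in sorted(genre_counts(f).items()):
#         print(genre, count)
```

If the dataset is balanced as described, every genre should report the same count.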

Training results

The GTZAN dataset includes two CSV files with features:

  1. features_30_sec.csv
  2. features_3_sec.csv
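Either file can be turned into a feature matrix and a label list before training. This is a rough sketch with the standard library only; the column names `label`, `filename` and `length` are assumptions about the CSV layout, and every remaining column is assumed to be numeric.

```python
import csv


def load_features(lines, label_col="label", skip_cols=("filename", "length")):
    """Split CSV rows into numeric features X and genre labels y.

    `label_col` and `skip_cols` are assumed column names; adjust them
    to match the actual header of the file being loaded.
    """
    reader = csv.DictReader(lines)
    X, y = [], []
    for row in reader:
        y.append(row[label_col])
        # Keep every column that is neither the label nor a skipped column.
        X.append([
            float(value)
            for name, value in row.items()
            if name != label_col and name not in skip_cols
        ])
    return X, y


# Usage (hypothetical path):
# with open("features_3_sec.csv", newline="") as f:
#     X, y = load_features(f)
```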

We'll train on both feature sets and compare the results:

30 sec samples (features_30_sec.csv)

The first test used the 30-second samples.

Training details
Epoch Train_Loss Valid_Loss Accuracy Time
0 2.410511 2.301348 0.100000 00:00
1 2.358546 2.294634 0.150000 00:00
2 2.308661 2.264426 0.200000 00:00
3 2.234084 2.203711 0.237500 00:00
4 2.171810 2.121932 0.243750 00:00
5 2.107840 2.050719 0.256250 00:00
6 2.039669 1.963892 0.281250 00:00
7 1.977348 1.895904 0.306250 00:00
8 1.910779 1.822734 0.331250 00:00
9 1.841186 1.764186 0.337500 00:00
10 1.774352 1.701079 0.356250 00:00
11 1.703704 1.629614 0.356250 00:00
12 1.633385 1.574726 0.387500 00:00
13 1.564862 1.509628 0.412500 00:00
14 1.501113 1.446676 0.475000 00:00
15 1.449700 1.387999 0.500000 00:00
16 1.374840 1.333856 0.575000 00:00
17 1.309132 1.269413 0.606250 00:00
18 1.251293 1.207156 0.625000 00:00
19 1.191447 1.157811 0.643750 00:00
20 1.127759 1.115797 0.650000 00:00
21 1.067127 1.049386 0.687500 00:00
22 1.009375 1.008271 0.706250 00:00
23 0.951300 0.977570 0.718750 00:00
24 0.893850 0.922465 0.731250 00:00
25 0.838244 0.878590 0.712500 00:00
26 0.787135 0.868914 0.725000 00:00
27 0.732795 0.853740 0.737500 00:00
28 0.678135 0.883555 0.693750 00:00
29 0.625145 0.845650 0.706250 00:00
30 0.583684 0.802656 0.725000 00:00
31 0.532835 0.794608 0.731250 00:00
32 0.489864 0.795410 0.743750 00:00
33 0.443835 0.774997 0.743750 00:00
34 0.409310 0.803072 0.725000 00:00
35 0.375299 0.759579 0.731250 00:00
36 0.340224 0.802695 0.743750 00:00
37 0.306562 0.796328 0.743750 00:00
38 0.280238 0.809968 0.750000 00:00
39 0.254191 0.783211 0.743750 00:00
40 0.228860 0.777353 0.762500 00:00
41 0.208046 0.811578 0.725000 00:00
42 0.186007 0.822938 0.731250 00:00
43 0.170087 0.735695 0.737500 00:00
44 0.159519 0.791559 0.743750 00:00
45 0.149130 0.859067 0.718750 00:00
46 0.136737 0.820262 0.762500 00:00
47 0.122899 0.873859 0.743750 00:00
48 0.116056 0.791122 0.750000 00:00
49 0.107419 0.854898 0.706250 00:00
50 0.100285 0.879887 0.718750 00:00
51 0.089521 0.862034 0.743750 00:00
52 0.081394 0.828892 0.737500 00:00
53 0.073154 0.887935 0.743750 00:00
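The log ends at epoch 53, ten epochs after the lowest validation loss (0.735695 at epoch 43). That pattern is consistent with early stopping on validation loss with a patience of 10, although the README does not say how training was stopped. A minimal sketch of that stopping rule, with `early_stop_epoch` as a hypothetical helper name:

```python
def early_stop_epoch(valid_losses, patience=10):
    """Return (stop_epoch, best_epoch) for a patience-based stopping rule.

    Training stops at the first epoch that is `patience` epochs past the
    best validation loss seen so far. `stop_epoch` is None if the rule
    never triggers within the given losses.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(valid_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return None, best_epoch


# Small illustrative series (not taken from the table above).
print(early_stop_epoch([1.0, 0.9, 0.8, 0.85, 0.9, 0.95], patience=3))  # (5, 2)
```

With such a rule, the model weights from the best epoch are usually the ones worth keeping, since the later epochs only lower the training loss.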

Classification report for this run:

Class          Precision  Recall  F1-Score  Support
0 (blues)           0.80    0.80      0.80       20
1 (classical)       0.86    1.00      0.92       12
2 (country)         0.60    0.75      0.67       16
3 (disco)           0.67    0.67      0.67       18
4 (hiphop)          0.92    0.55      0.69       22
5 (jazz)            0.94    0.71      0.81       21
6 (metal)           0.87    0.93      0.90       14
7 (pop)             0.65    0.85      0.73       13
8 (reggae)          0.46    0.67      0.55        9
9 (rock)            0.71    0.67      0.69       15
Accuracy                              0.74      160
Macro Avg           0.75    0.76      0.74      160
Weighted Avg        0.77    0.74      0.74      160
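The summary rows follow from the per-class rows. The sketch below recomputes one F1 score and the two precision averages from the rounded values in the report, so the results match the table only to two decimals. The function names are illustrative, not part of any library.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean: every class counts equally."""
    return sum(values) / len(values)


def weighted_average(values, supports):
    """Mean weighted by the number of samples in each class."""
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)


# Per-class precision and support, copied from the report above.
precision = [0.80, 0.86, 0.60, 0.67, 0.92, 0.94, 0.87, 0.65, 0.46, 0.71]
support = [20, 12, 16, 18, 22, 21, 14, 13, 9, 15]

print(round(f1_score(0.92, 0.55), 2))                  # hiphop row: 0.69
print(round(macro_average(precision), 2))              # 0.75
print(round(weighted_average(precision, support), 2))  # 0.77
```

The macro and weighted averages differ here because the classes are not equally represented in the evaluation set (support ranges from 9 to 22).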

Repeated training runs yielded accuracies between 70% and 74%.

3 sec samples (features_3_sec.csv)

The second test used the 3-second samples.

Training details
Epoch Train_Loss Valid_Loss Accuracy Time
0 2.032246 1.873680 0.360451 00:01
1 1.751166 1.637501 0.448686 00:01
2 1.570411 1.483692 0.508761 00:01
3 1.430684 1.365427 0.556320 00:01
4 1.307044 1.269431 0.596370 00:01
5 1.219212 1.175792 0.637672 00:01
6 1.138508 1.086589 0.667710 00:01
7 1.026570 1.008204 0.693367 00:01
8 0.954476 0.930218 0.705882 00:01
9 0.889936 0.864621 0.731539 00:01
10 0.815705 0.815270 0.740926 00:01
11 0.761109 0.770075 0.759700 00:01
12 0.719386 0.718789 0.777847 00:01
13 0.663675 0.683248 0.783479 00:01
14 0.603862 0.653833 0.793492 00:01
15 0.555469 0.619850 0.801627 00:01
16 0.506831 0.575025 0.820400 00:01
17 0.471152 0.554522 0.825407 00:01
18 0.433732 0.531214 0.836045 00:01
19 0.416024 0.520105 0.828536 00:01
20 0.362679 0.504946 0.831039 00:01
21 0.333553 0.481875 0.839800 00:01
22 0.313346 0.454316 0.848561 00:01
23 0.295104 0.459885 0.836671 00:01
24 0.260602 0.424484 0.864205 00:01
25 0.246882 0.449828 0.852941 00:01
26 0.212048 0.420795 0.857322 00:01
27 0.194751 0.419301 0.862328 00:01
28 0.193910 0.427121 0.860451 00:01
29 0.177843 0.416969 0.860451 00:01
30 0.161258 0.408608 0.875469 00:01
31 0.168606 0.395215 0.866083 00:01
32 0.144902 0.402985 0.866083 00:01
33 0.127127 0.395520 0.867334 00:01
34 0.130257 0.389912 0.882979 00:01
35 0.136157 0.402041 0.869837 00:01
36 0.128909 0.403417 0.877347 00:01
37 0.124474 0.409458 0.867334 00:01
38 0.112614 0.412239 0.881727 00:01
39 0.120688 0.421026 0.873592 00:01
40 0.101785 0.377680 0.886733 00:01
41 0.107204 0.402054 0.880476 00:01
42 0.104357 0.389439 0.877347 00:01
43 0.101453 0.407585 0.881101 00:01
44 0.088406 0.386060 0.884230 00:01
45 0.090783 0.396249 0.880476 00:01
46 0.082592 0.372472 0.879850 00:01
47 0.086191 0.373930 0.894243 00:01
48 0.081868 0.362440 0.889862 00:01
49 0.065329 0.384066 0.882353 00:01
50 0.083963 0.380745 0.889862 00:01
51 0.082650 0.417959 0.889862 00:01
52 0.075654 0.388021 0.892365 00:01
53 0.080464 0.403859 0.891740 00:01
54 0.063071 0.371227 0.891114 00:01
55 0.073778 0.353450 0.899249 00:01
56 0.073829 0.391176 0.884230 00:01
57 0.064231 0.365528 0.894243 00:01
58 0.062055 0.373422 0.895494 00:01
59 0.063687 0.391158 0.893617 00:01
60 0.061825 0.417160 0.882979 00:01
61 0.052974 0.394985 0.903004 00:01
62 0.063245 0.395809 0.887985 00:01
63 0.054693 0.358832 0.903004 00:01
64 0.052507 0.388410 0.895494 00:01
65 0.052654 0.352484 0.906758 00:01
66 0.051661 0.413555 0.892991 00:01
67 0.056109 0.394127 0.895494 00:01
68 0.056794 0.372550 0.903004 00:01
69 0.058300 0.374250 0.896746 00:01
70 0.046908 0.352735 0.909262 00:01
71 0.054862 0.368875 0.904881 00:01
72 0.042489 0.400974 0.891740 00:01
73 0.045906 0.384763 0.897372 00:01
74 0.053988 0.396204 0.896746 00:01
75 0.042612 0.384841 0.898623 00:01
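With this many epochs, the best ones are easier to find programmatically than by eye. The following sketch parses log rows in the layout shown above (epoch, train loss, valid loss, accuracy, time) and reports the epoch with the lowest validation loss and the one with the highest accuracy. `parse_log` and `best_epochs` are hypothetical helper names.

```python
def parse_log(lines):
    """Parse whitespace-separated log rows; header or malformed lines are skipped."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 5 or not parts[0].isdigit():
            continue
        rows.append({
            "epoch": int(parts[0]),
            "train_loss": float(parts[1]),
            "valid_loss": float(parts[2]),
            "accuracy": float(parts[3]),
        })
    return rows


def best_epochs(rows):
    """Return (epoch with lowest valid loss, epoch with highest accuracy)."""
    by_loss = min(rows, key=lambda r: r["valid_loss"])
    by_accuracy = max(rows, key=lambda r: r["accuracy"])
    return by_loss["epoch"], by_accuracy["epoch"]


# Three rows copied from the table above.
sample = [
    "64 0.052507 0.388410 0.895494 00:01",
    "65 0.052654 0.352484 0.906758 00:01",
    "70 0.046908 0.352735 0.909262 00:01",
]
print(best_epochs(parse_log(sample)))  # (65, 70)
```

Over the full table the answer is the same: the lowest validation loss is at epoch 65 and the highest accuracy at epoch 70.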

Classification report for this run:

Class          Precision  Recall  F1-Score  Support
0 (blues)           0.94    0.89      0.91      163
1 (classical)       0.91    0.93      0.92      146
2 (country)         0.82    0.82      0.82      161
3 (disco)           0.84    0.93      0.88      168
4 (hiphop)          0.94    0.91      0.93      162
5 (jazz)            0.91    0.91      0.91      151
6 (metal)           0.95    0.94      0.94      172
7 (pop)             0.93    0.90      0.92      150
8 (reggae)          0.93    0.90      0.91      163
9 (rock)            0.82    0.86      0.84      162
Accuracy                              0.90     1598
Macro Avg           0.90    0.90      0.90     1598
Weighted Avg        0.90    0.90      0.90     1598
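The overall accuracy can be cross-checked against the per-class rows: recall times support approximates the number of correctly classified samples in each class, and the total divided by the overall support gives the accuracy. This is a small sketch using the rounded values from the report, so it is only accurate to about two decimals.

```python
def accuracy_from_recall(recalls, supports):
    """Overall accuracy as the support-weighted mean of per-class recall."""
    correct = sum(r * s for r, s in zip(recalls, supports))
    return correct / sum(supports)


# Per-class recall and support, copied from the report above.
recall = [0.89, 0.93, 0.82, 0.93, 0.91, 0.91, 0.94, 0.90, 0.90, 0.86]
support = [163, 146, 161, 168, 162, 151, 172, 150, 163, 162]

print(sum(support))                                      # 1598
print(round(accuracy_from_recall(recall, support), 2))   # 0.9
```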

Repeated training runs yielded accuracies of around 90%.

Image-based training

TODO

music-genre-recognition-gtzan's People

Contributors

mithgroth
