
prajwal10031999 / song-genre-classification-in-pysparks-mllib



Song-Genre-Classification-in-PySparks-MLlib

A PySpark MLlib classification model to classify songs based on a number of characteristics into a set of 23 electronic genres.
This technology could be used by an application like Pandora to recommend songs to users or to create meaningful channels. Super fun!

Dataset
Each row is an electronic music song. The dataset contains 100 songs for each of 23 electronic music genres; they were the top 100 songs of their genres in November 2016. The 71 columns are audio features extracted from a random two-minute sample of each audio file. These features were extracted using pyAudioAnalysis.
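Because every genre contributes the same number of songs, a stratified train/test split keeps all 23 classes equally represented on both sides. The snippet below is only a plain-Python sketch of that idea on synthetic stand-in rows: the `genre` and `song_id` field names, the 20% test fraction and the seed are assumptions for illustration, not values taken from the project.

```python
import random
from collections import Counter, defaultdict


def stratified_split(rows, label_key, test_fraction=0.2, seed=42):
    """Split rows so every label keeps the same train/test proportion."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]      # copy so the input is not mutated
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Synthetic stand-in for the real data: 23 genres x 100 songs each.
# The field names "genre" and "song_id" are hypothetical.
rows = [
    {"genre": f"genre_{g}", "song_id": g * 100 + s}
    for g in range(23)
    for s in range(100)
]

train, test = stratified_split(rows, "genre")
test_counts = Counter(row["genre"] for row in test)
print(len(rows), len(train), len(test))
```

With a balanced dataset like this one, each genre ends up with the same number of test songs, so per-genre accuracy figures are directly comparable.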

First, I created an algorithm that classifies songs into the 23 genres provided. I then tested several different models and selected the highest-performing one. I also experimented with feature selection methods and finally tried to make a recommendation for a user.

My approach

I decided to approach this analysis in 4 main steps.
1] Create Baseline: Train and evaluate models on raw data without pre-treating it for outliers, skewness or negative values. This way we can clearly see what effect our transformations have on our analysis.

2] Test treatments: Train and evaluate models on treated data (outliers, skewness and negative values) and compare to baseline.

3] Feature Selection: Select the best performing models from the previous two approaches and perform feature selection on them to fine-tune them.

4] Make a recommendation to a user: Create a scrip to make a recommendation to a user. I intentionally left this part of the project a bit ambiguous.
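
The treatments named in step 2 (negative values, skewness and outliers) can be sketched on a single feature column as follows. This is a plain-Python illustration rather than the project's PySpark code, and the specific choices are assumptions: shifting by the minimum to remove negatives, a `log1p` transform to reduce right skew, and clipping to 1.5 x IQR fences for outliers. The sample values are made up.

```python
import math
import statistics


def shift_non_negative(values):
    """Shift the column so its minimum is 0 if it contains negatives."""
    lowest = min(values)
    return [v - lowest for v in values] if lowest < 0 else list(values)


def reduce_skew(values):
    """Compress a long right tail with log(1 + x); needs values >= 0."""
    return [math.log1p(v) for v in values]


def clip_outliers(values, k=1.5):
    """Clip values to the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def treat(values):
    """Apply the three treatments in order: shift, log transform, clip."""
    return clip_outliers(reduce_skew(shift_non_negative(values)))


# Made-up feature column with one negative value and one extreme value.
raw = [-2.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 250.0]
treated = treat(raw)
print(treated)
```

Training the same model once on `raw`-style columns and once on `treated`-style columns is what makes the baseline comparison in steps 1 and 2 meaningful: any change in accuracy can be attributed to the treatments.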
