datamining-project's Introduction

Data Mining Class Project

Collaborators

Angel Dhungana, Avram Twitchell

Assignment Overview

A collaborative project for CS 6140, Spring 2019. The objective is to perform data mining on a real data set, with the goal of gaining in-depth experience in some aspect of the class, in a setting where the instructor can give guidance. The student is meant to demonstrate a deep understanding of some aspect of data mining.

Data

We intend to apply data mining to the Million Song Dataset. Specifically, we plan to mine the lyrics, musical characteristics (e.g. tempo), artist, year, and genre data.

Structure

In music criticism, there are commonly accepted narratives as to how certain artists or genres influence each other. We want to see if there exists a quantifiable structure to these influences. We will do this by examining similarities in lyrics and in musical characteristics such as tempo, to see whether relationships exist between artists, genres, and years that coincide with these commonly accepted narratives.
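One way to make "similarity in lyrics" concrete is cosine similarity between bag-of-words counts. The sketch below is only an illustration under stated assumptions: the artist names and lyric strings are made-up toy data, `cosine_similarity` is a hypothetical helper, and nothing here is taken from the project's actual code.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors (Counters)."""
    # Only words present in both vectors contribute to the dot product.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty lyric has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical toy lyrics; real input would come from the dataset's lyric counts.
lyrics = {
    "artist_a": "love me love me tonight",
    "artist_b": "love you tonight and forever",
    "artist_c": "engine road highway dust",
}
bags = {name: Counter(text.split()) for name, text in lyrics.items()}

sim_ab = cosine_similarity(bags["artist_a"], bags["artist_b"])
sim_ac = cosine_similarity(bags["artist_a"], bags["artist_c"])
print("a vs b:", round(sim_ab, 3))
print("a vs c:", round(sim_ac, 3))
```

Artists that share vocabulary score closer to 1, and artists with no words in common score 0. The same function could be applied per genre or per year by summing the word counts within each group first.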

Motivation

This problem is interesting on several fronts, but we will focus on two.

First, this examination can potentially give insight into how human beings interact, collaborate, borrow, steal, or draw inspiration from each other.

Second, it may offer insight into the evolution of music over the years. Using the year data, we could potentially see how lyrics and musical characteristics shift over time.

Running on the subset

Install requirements and create data directories by running `make setup`
Download the subset by running `make download_subset`
Run `make agg-data` to move files to raw_data
Run clustering by using `python3 run.py run k [n...]`
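The README does not say which clustering algorithm `run.py` uses or what `k` means. As a rough illustration only, the sketch below assumes `k` is a number of clusters and shows a plain k-means (Lloyd's algorithm) on made-up `(tempo, loudness)` pairs. The `kmeans` function and the sample values are hypothetical and are not the project's implementation.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        assignments = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical (tempo in BPM, loudness in dB) values for six songs.
songs = [
    (70.0, -20.0), (75.0, -18.0), (72.0, -19.0),   # slow, quiet
    (140.0, -5.0), (150.0, -4.0), (145.0, -6.0),   # fast, loud
]
centroids, assignments = kmeans(songs, k=2)
print("centroids:", centroids)
print("assignments:", assignments)
```

The features here are left unscaled, so tempo dominates the distance. Standardizing each feature first would be a reasonable step before clustering real data.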
