Lab | Not hot songs

Introduction

Now that you have scraped the Billboard website to create a hot_songs dataset, it's time to prepare a new dataset of not_hot_songs. This dataset can contain songs of your choice, songs collected from the web, or any combination of the two. Some sources of songs can be:

Considerations

You want your not_hot_songs dataset to be:

  • As heterogeneous as possible (in genre, length, etc.), so that the songs form better groups.
  • Neither too big nor too small (typically around 2-3K songs).

In a real-life scenario, you might want your dataset to be as large as possible and use specialized Big Data tools such as PySpark to group similar songs together. However, you are going to work on your own laptop, which has limited power. Therefore, you need to limit the size of your not_hot_songs dataset; otherwise, the process of grouping similar songs will take forever.
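One possible way to honor both considerations at once (variety and a size cap) is to down-sample the collected songs evenly across genres. The sketch below is only an illustration in plain Python: the function name `cap_dataset`, the `genre` field, and the default cap of 3000 are assumptions, not part of the lab. If you keep your songs in a pandas DataFrame, the same idea carries over.

```python
import random
from collections import defaultdict
from itertools import zip_longest


def cap_dataset(songs, max_size=3000, key="genre", seed=42):
    """Return at most max_size songs, drawn evenly across the values of `key`.

    `songs` is assumed to be a list of dicts; the "genre" field is hypothetical.
    """
    if len(songs) <= max_size:
        return list(songs)

    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    # Bucket the songs by genre (songs without the field share one bucket).
    groups = defaultdict(list)
    for song in songs:
        groups[song.get(key, "unknown")].append(song)
    for members in groups.values():
        rng.shuffle(members)

    # Take one song per genre per round until the cap is reached,
    # so small genres are not drowned out by large ones.
    picked = []
    for one_round in zip_longest(*groups.values()):
        for song in one_round:
            if len(picked) >= max_size:
                return picked
            if song is not None:
                picked.append(song)
    return picked
```

The round-robin pass keeps every song of a rare genre before it takes more from a common one. Plain random sampling would mirror the imbalance of the original collection instead.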

Deliverables

Your fork should contain a Jupyter notebook with the code to:

  • Gather the songs
  • Remove songs already present in the hot_songs dataset
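The second deliverable can be sketched as follows, assuming each song is a dict with `title` and `artist` fields. Those field names and the helper names `song_key` and `remove_hot_songs` are hypothetical, so adapt them to whatever your hot_songs dataset actually contains. Matching is done on a normalized (title, artist) pair so that differences in case or surrounding whitespace do not hide a duplicate.

```python
def song_key(song):
    """Build a comparison key that ignores case and surrounding whitespace."""
    return (song["title"].strip().casefold(), song["artist"].strip().casefold())


def remove_hot_songs(candidates, hot_songs):
    """Return the candidates that are not in hot_songs, without duplicates."""
    seen = {song_key(song) for song in hot_songs}
    kept = []
    for song in candidates:
        key = song_key(song)
        if key not in seen:
            seen.add(key)  # also drops repeated candidates
            kept.append(song)
    return kept


# Placeholder data, for illustration only.
hot_songs = [{"title": "Song A", "artist": "Artist X"}]
candidates = [
    {"title": " song a ", "artist": "ARTIST X"},  # same song as the hot one
    {"title": "Song B", "artist": "Artist Y"},
    {"title": "Song B", "artist": "Artist Y"},    # repeated candidate
]
not_hot_songs = remove_hot_songs(candidates, hot_songs)
print(not_hot_songs)
```

Exact matching on normalized text will not catch variants such as "feat." credits or remastered titles. Whether to handle those is a judgment call that depends on your data.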

Contributors

isg75, olivereves