
maha-prathamesh / clustering-geolocation-data



Clustering-Geolocation-Data-Intelligently-in-Python

This project takes taxi rank location data for Johannesburg, South Africa, and clusters the ranks geographically, so that service stations can be built to serve all the taxi ranks in each cluster.

Pre-requisites

To understand the code and perform the task, a basic knowledge of the following topics is assumed.

  1. Basic Matplotlib skills for plotting 2-D data clearly.
  2. Basic understanding of Pandas and how to use it for data manipulation.
  3. The basic concepts behind clustering algorithms. We will be working with K-Means, DBSCAN and HDBSCAN.

Outline

We will divide the project into 7 parts.

  1. Exploratory Data Analysis: Do some data cleaning and initial visualizations to get a sense of the data.
  2. Visualizing Geographical Data: Plot the data onto a geospatial map. We will use the folium library for this.
  3. Clustering Strength / Performance Metric: Work with dummy data to understand how clustering (K-Means) works. We will explore the influence of the number of clusters on performance.
  4. K-Means Clustering: Clustering data using K-Means and evaluating the clusters formed.
  5. DBSCAN: Density-Based Spatial Clustering of Applications with Noise.
  6. HDBSCAN: Hierarchical Density-Based Spatial Clustering of Applications with Noise.
  7. Addressing Outliers: Address the outliers from HDBSCAN (also called singletons) and see how we can assign them to clusters.
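
Parts 3 and 4 of the outline can be illustrated with a minimal sketch on dummy data. It fits K-Means for several cluster counts and compares them with the silhouette score. The choice of the silhouette score is an assumption made here for illustration, since this README does not name the metric used in the notebook. The synthetic blobs and the helper name `silhouette_by_k` are also hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Dummy 2-D data with four well-separated groups (synthetic, not the taxi rank data).
X, _ = make_blobs(
    n_samples=400,
    centers=[[-5, -5], [-5, 5], [5, -5], [5, 5]],
    cluster_std=0.6,
    random_state=0,
)


def silhouette_by_k(X, k_values, random_state=0):
    """Fit K-Means once per k and return a {k: silhouette score} dict."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X)
        # Silhouette ranges from -1 to 1; higher means tighter, better separated clusters.
        scores[k] = silhouette_score(X, labels)
    return scores


scores = silhouette_by_k(X, range(2, 9))
best_k = max(scores, key=scores.get)

for k, s in scores.items():
    print(f"k={k}: silhouette={s:.3f}")
print("best k:", best_k)
```

On real coordinates the score curve is usually much flatter than on this dummy data, so it is better read as a guide to a reasonable range of k than as a single definitive answer.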

GitHub doesn't support map visualization. To view the 'Project Final.ipynb' file with the geospatial visualizations, click here: https://github.com/maha-prathamesh/Clustering-Geolocation-Data-in-Python/blob/main/Project%20Final.ipynb
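
The density-based steps of the outline (parts 5 and 7) can be sketched in a similar way. The example below is an illustration under several assumptions, not the notebook's code: the coordinates are made up, DBSCAN from scikit-learn stands in for HDBSCAN (both label noise points as -1), and assigning each outlier to a cluster by a nearest-neighbour vote is one possible strategy that this README does not confirm.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Hypothetical (latitude, longitude) points: two dense groups plus three isolated points.
group_a = rng.normal([-26.20, 28.04], 0.005, size=(30, 2))
group_b = rng.normal([-26.10, 28.20], 0.005, size=(30, 2))
isolated = np.array([[-26.15, 28.12], [-26.30, 28.00], [-26.05, 28.30]])
coords = np.vstack([group_a, group_b, isolated])

# eps is measured in degrees here; Euclidean distance on raw degrees is a
# simplification that is only reasonable over a small area.
labels = DBSCAN(eps=0.03, min_samples=5).fit_predict(coords)

# Points labelled -1 are noise: they belong to no dense region.
noise = labels == -1
assigned = labels.copy()

if noise.any() and (~noise).any():
    # Assumed strategy: give each outlier the majority label of its
    # three nearest clustered neighbours.
    knn = KNeighborsClassifier(n_neighbors=3)
    knn.fit(coords[~noise], labels[~noise])
    assigned[noise] = knn.predict(coords[noise])

print("clusters found:", sorted(set(labels) - {-1}))
print("outliers before assignment:", int(noise.sum()))
print("outliers after assignment:", int((assigned == -1).sum()))
```

Whether an isolated rank should be forced into the nearest cluster or left on its own is a design decision; a rank that is far from every cluster may be better served separately.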
