maha-prathamesh / clustering-geolocation-data

This project takes taxi rank location data for Johannesburg, South Africa, and clusters the ranks geographically so that a service station can be built to serve all the taxi ranks in each cluster.

Jupyter Notebook 79.62% HTML 20.38%

clustering-geolocation-data's Introduction

Clustering-Geolocation-Data-Intelligently-in-Python

Pre-requisites

To understand the code and perform the task, a basic knowledge of the following topics is assumed.

Basic Matplotlib skills for plotting 2-D data clearly.
Basic understanding of Pandas and how to use it for data manipulation.
The basic concepts behind clustering algorithms. We will be working with K-Means, DBSCAN and HDBSCAN.
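
As a quick refresher on the clustering concepts listed above, here is a minimal sketch using scikit-learn on synthetic points. The coordinates, `eps`, `min_samples`, and the use of the silhouette score are illustrative assumptions, not values or choices taken from the project notebook. HDBSCAN is not shown; it is commonly available through the separate `hdbscan` package.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.metrics import silhouette_score

# Synthetic (latitude, longitude)-like points: three tight blobs plus one stray point.
# These are made-up coordinates, not the Johannesburg taxi rank data.
rng = np.random.default_rng(0)
centers = np.array([[-26.20, 28.04], [-26.10, 28.20], [-26.30, 27.90]])
points = np.vstack([c + rng.normal(scale=0.005, size=(30, 2)) for c in centers])
points = np.vstack([points, [[-26.00, 27.70]]])  # isolated point

# K-Means needs the number of clusters up front and assigns every point to a cluster.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(points)
kmeans_score = silhouette_score(points, kmeans_labels)

# DBSCAN infers the number of clusters from density and labels noise points as -1.
dbscan_labels = DBSCAN(eps=0.03, min_samples=5).fit_predict(points)
n_dbscan_clusters = len(set(dbscan_labels) - {-1})
n_noise = int((dbscan_labels == -1).sum())

print(f"K-Means silhouette score: {kmeans_score:.3f}")
print(f"DBSCAN clusters: {n_dbscan_clusters}, noise points: {n_noise}")
```

Note that this sketch treats latitude and longitude as plain Euclidean coordinates, which is only an approximation over a city-sized area.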

Outline

We will divide the project into 7 parts.

Exploratory Data Analysis: Do some data cleaning and initial visualizations to get a sense of the data.
Visualizing Geographical Data: Plot the data onto a geospatial map using the folium library.
Clustering Strength / Performance Metric: Work with dummy data to understand how clustering (K-Means) works, and explore how the number of clusters influences performance.
K-Means Clustering: Clustering data using K-Means and evaluating the clusters formed.
DBSCAN: Density-Based Spatial Clustering of Applications with Noise.
HDBSCAN: Hierarchical Density-Based Spatial Clustering of Applications with Noise.
Addressing Outliers: Address the outliers from HDBSCAN (also called singletons) and see how we can assign them to clusters.
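
The outline does not say how the singletons are assigned, so the following is only one possible approach, sketched under assumptions: each point labelled as noise (`-1`, the convention used by DBSCAN in scikit-learn and by the `hdbscan` package) takes the label of its nearest clustered point. The function name `assign_noise_to_nearest_cluster` and the sample points are hypothetical.

```python
import math


def assign_noise_to_nearest_cluster(points, labels):
    """Return a copy of labels in which each noise point (-1) takes the
    label of its nearest clustered point (plain Euclidean distance)."""
    clustered = [(p, lab) for p, lab in zip(points, labels) if lab != -1]
    if not clustered:
        return list(labels)  # no clusters exist, so nothing can be assigned
    new_labels = []
    for p, lab in zip(points, labels):
        if lab == -1:
            # Pick the clustered point closest to this noise point and reuse its label.
            _, lab = min(clustered, key=lambda item: math.dist(p, item[0]))
        new_labels.append(lab)
    return new_labels


# Toy example: two small clusters and one noise point near the second cluster.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (4.0, 4.2)]
labels = [0, 0, 1, 1, -1]
result = assign_noise_to_nearest_cluster(points, labels)
print(result)  # [0, 0, 1, 1, 1]
```

A nearest-neighbour rule like this keeps the existing clusters unchanged and guarantees every taxi rank ends up in some cluster, at the cost of possibly attaching a distant outlier to a cluster it does not really belong to.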

GitHub does not render the map visualizations. To view the 'Project Final.ipynb' notebook with the geospatial visualizations, see: https://github.com/maha-prathamesh/Clustering-Geolocation-Data-in-Python/blob/main/Project%20Final.ipynb
