mkupisie / clustering-geodemographic_classification_of_nyc_using_k-means_geopandas_sklearn

Geodemographic classification of ethnic groups in NYC using the K-means algorithm available in the sklearn.cluster module.

Jupyter Notebook 100.00%
geopandas pysal k-means matplotlib-pyplot nyc-open-data sklearn-classify

Clustering: geodemographic classification of NYC using K-means algorithm

Many questions about spatial observations concern complex phenomena that involve several dimensions, which makes them hard to summarize in a single variable. This is especially visible when mapping the distribution of people while taking into account, for example, their nationality, education level, or age. It is uncommon for a geographic region to be populated exclusively by individuals of identical heritage, particularly in urban areas.

Clustering can be used to reduce the dimensionality (the number of variables the analyst needs to look at) and convert it into a more intuitive set of classes. The fundamental concept behind statistical clustering is to condense the information from multiple variables into a relatively small number of categories. Each entry in the dataset is then assigned to exactly one category, based on its values for the variables considered.

K-means is one of the most popular clustering algorithms. It can be run in Python using the sklearn.cluster module of scikit-learn, a popular machine learning library.
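As a minimal sketch of how K-means from sklearn.cluster can be called, the example below clusters a small synthetic table. The data, the three hypothetical groups, and the choice of n_clusters=3 are assumptions made for illustration; they are not the NYC data or the settings used in the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for a census table: rows are areas, columns are
# population counts for three hypothetical groups (not the NYC data).
rng = np.random.default_rng(0)
group_a_areas = rng.normal(loc=[900, 100, 50], scale=20, size=(30, 3))
group_b_areas = rng.normal(loc=[100, 800, 60], scale=20, size=(30, 3))
group_c_areas = rng.normal(loc=[80, 90, 700], scale=20, size=(30, 3))
X = np.vstack([group_a_areas, group_b_areas, group_c_areas])

# n_clusters is the number of classes; random_state makes the run repeatable.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)  # exactly one class label per area

print(labels.shape)
print(kmeans.cluster_centers_.round(0))
```

Each row receives exactly one label, and cluster_centers_ holds the mean of every variable within each class, which is what makes the classes interpretable.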

Data

The classification uses data from the examples package of the pysal library (https://pysal.org/notebooks/lib/libpysal/Example_Datasets.html).

The NYC Socio-Demographics dataset contains the total population of the following groups:

  • european: Total Population White
  • asian: Total Population Asian American
  • american: Total Population American Indian
  • african: Total Population African American
  • hispanic: Total Population Hispanic
  • mixed: Total Population Mixed race
  • pacific: Total Population Pacific Islander

Results

1.1. Classification on the map

[Figure: NYC_results]

1.2. The mean of total population for each ethnic group within the class

[Table image: table_results]

The results above show that each class is dominated by one, two, or sometimes three ethnic groups:

  • class 0: hispanic and african
  • class 1: european
  • class 2: african
  • class 3: hispanic and european
  • class 4: european and asian
  • class 5: asian
  • class 6: african, european and hispanic
  • class 7: no population
  • class 8: european
  • class 9: african and hispanic
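A summary like the one above can be derived by averaging each group's population within every class. The sketch below shows one way to do this with pandas, using toy values and a hypothetical "label" column; the 30% share threshold for calling a group dominant is an assumption for illustration, not a rule taken from the notebook.

```python
import pandas as pd

# Hypothetical per-area counts with a class label already assigned
# (toy values, not the NYC results).
areas = pd.DataFrame(
    {
        "european": [900, 850, 100, 120, 0],
        "african": [50, 80, 700, 760, 0],
        "hispanic": [40, 60, 650, 600, 0],
        "label": [1, 1, 0, 0, 2],
    }
)

# Mean population of each group within each class.
class_means = areas.groupby("label").mean()

# Share of each group in the class mean; classes with no population
# are divided by 1 instead of 0, so their shares stay at 0.
row_totals = class_means.sum(axis=1)
shares = class_means.div(row_totals.where(row_totals > 0, 1), axis=0)

# Groups that make up at least 30% of a class (assumed threshold).
dominant = {
    label: list(row[row >= 0.3].index)
    for label, row in shares.iterrows()
}

print(class_means)
print(dominant)
```

A class whose means are all zero ends up with an empty list, which corresponds to a "no population" class.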
