ailabteam / unsupervised-ml-modelling-for-segmentation

This project forked from idb-for-datascience/unsupervised-ml-modelling-for-segmentation

Python Implementation of multiple unsupervised segmentation models and evaluating them through multiple evaluation metrics

License: MIT License

Jupyter Notebook 100.00%

unsupervised-ml-modelling-for-segmentation's Introduction

Segmentation-Modelling

Granular data on large pools of customers and products is increasingly available, and so is the technology to process petabytes of it efficiently. This makes it possible to build strategic, meaningful clusters for effective targeting, but identifying the target segments still requires a robust segmentation exercise. In this blog, we will discuss the most popular unsupervised clustering algorithms and how to implement them in Python.

We will be working with clickstream data from an online store offering clothing for pregnant women. It includes variables such as product category, location of the photo on the webpage, country of origin of the IP address, and product price in US dollars. The data covers April 2008 to August 2008.

The first step is to prepare the data for segmentation. I encourage you to check out this article for an in-depth explanation of the steps involved in preparing data for segmentation before proceeding further:

One Hot Encoding, Standardization, PCA: Data preparation for segmentation in python
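
As a rough illustration of the preparation steps named in that title, the sketch below chains one-hot encoding, standardization and PCA with scikit-learn. The column names (`country`, `category`, `price`) and the randomly generated rows are assumptions made for this example only; they stand in for the clickstream variables described above and are not taken from the project's notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the clickstream data (hypothetical columns and values).
rng = np.random.default_rng(0)
n_rows = 200
df = pd.DataFrame({
    "country": rng.choice(["PL", "DE", "CZ"], size=n_rows),
    "category": rng.choice(["trousers", "skirts", "blouses", "sale"], size=n_rows),
    "price": rng.uniform(18, 82, size=n_rows).round(2),
})

categorical_cols = ["country", "category"]
numeric_cols = ["price"]

# One-hot encode the categorical columns and pass numeric columns through.
# sparse_threshold=0.0 forces a dense array so StandardScaler can center it.
encode = ColumnTransformer(
    [
        ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
        ("numeric", "passthrough", numeric_cols),
    ],
    sparse_threshold=0.0,
)

prep = Pipeline([
    ("encode", encode),
    ("scale", StandardScaler()),  # zero mean, unit variance per column
    # Keep enough components to explain at least 90% of the variance.
    ("pca", PCA(n_components=0.9, svd_solver="full")),
])

X = prep.fit_transform(df)
print("prepared shape:", X.shape)
print("variance explained:", prep.named_steps["pca"].explained_variance_ratio_.sum())
```

The 90% variance threshold is an arbitrary choice for this sketch; in practice it is worth checking how many components the data actually needs.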

Selecting the optimal number of clusters is another key concept in any segmentation problem. The article below gives a comprehensive list of popular metrics for choosing the number of clusters:

Cheatsheet for implementing 7 methods for selecting the optimal number of clusters in Python
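
To give a flavour of what such a selection looks like, here is a minimal sketch of two common approaches, the elbow method (K-means inertia) and the silhouette score. The linked article covers more methods; the two shown here, the synthetic blobs and the range of `k` values are assumptions chosen for this example, not the project's own data or settings.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with four well-separated groups (illustrative only).
X, _ = make_blobs(
    n_samples=400,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=1.0,
    random_state=0,
)

inertias = {}
silhouettes = {}
for k in range(2, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_  # within-cluster sum of squares (elbow plot)
    silhouettes[k] = silhouette_score(X, model.labels_)  # higher is better

# Pick the k with the highest average silhouette score.
best_k = max(silhouettes, key=silhouettes.get)

for k in inertias:
    print(f"k={k}  inertia={inertias[k]:.1f}  silhouette={silhouettes[k]:.3f}")
print("best k by silhouette:", best_k)
```

Inertia always tends to shrink as `k` grows, so it is read by eye for an "elbow", while the silhouette score can be maximized directly.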

We will be dealing with 4 categories of models in this code:

  1. K-means
  2. Agglomerative clustering
  3. Density-based spatial clustering of applications with noise (DBSCAN)
  4. Gaussian Mixture Modelling (GMM)
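
A minimal sketch of how the four model families listed above could be fitted and compared with scikit-learn is shown below. The synthetic data, the cluster count of 4, the DBSCAN settings (`eps=0.3`, `min_samples=5`) and the `safe_silhouette` helper are all assumptions made for illustration; they are not the settings used in the project's notebook.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic, standardized data standing in for the prepared features.
X_raw, _ = make_blobs(
    n_samples=400,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=1.0,
    random_state=0,
)
X = StandardScaler().fit_transform(X_raw)

models = {
    "kmeans": KMeans(n_clusters=4, n_init=10, random_state=0),
    "agglomerative": AgglomerativeClustering(n_clusters=4, linkage="ward"),
    "dbscan": DBSCAN(eps=0.3, min_samples=5),  # infers the cluster count itself
    "gmm": GaussianMixture(n_components=4, random_state=0),
}


def safe_silhouette(features, cluster_labels):
    """Silhouette score ignoring DBSCAN noise (-1); NaN if under 2 clusters."""
    mask = cluster_labels != -1
    if len(set(cluster_labels[mask])) < 2:
        return float("nan")
    return silhouette_score(features[mask], cluster_labels[mask])


# All four estimators expose fit_predict, so they can share one loop.
labels = {name: model.fit_predict(X) for name, model in models.items()}
scores = {name: safe_silhouette(X, lab) for name, lab in labels.items()}

for name, score in scores.items():
    n_clusters = len(set(labels[name]) - {-1})
    print(f"{name:14s} clusters={n_clusters}  silhouette={score:.3f}")
```

DBSCAN marks low-density points as noise with the label `-1`, which is why the helper drops those points before scoring; the other three models assign every point to a cluster.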
