Identify customer segments with Arvato

This project is a part of Udacity Data Science Nanodegree. In this project, AZ Direct and Arvato Financial Solutions have provided two datasets one with demographic information about the people of Germany, and one with that same information for customers of a mail-order sales company. the goal is to apply unsupervised learning techniques on demographic. We preprocess the data, apply dimensionality reduction techniques, and implement clustering algorithms to segment customers to optimize customer outreach for a mail-order company.

We look at relationships between demographics features, organize the population into clusters, and see how prevalent customers are in each of the segments obtained.

Note that the data provided by Arvato Financial Services is not public and we shouldn't upload it due to the terms and conditions. In the notebook, you can find the code for the analysis that has been preformed on the datasets.

Installation

This project requires Python 3 and the following Python libraries installed:

File Descriptions

Identify_Customer_Segments.ipynb

Jupyter Notebook divided into sections for completing the project. The notebook provides more details than the outline given here.

Data

Udacity_AZDIAS_Subset.csv: Demographics data for the general population of Germany; 891211 persons (rows) x 85 features (columns).
Udacity_CUSTOMERS_Subset.csv: Demographics data for customers of a mail-order company; 191652 persons (rows) x 85 features (columns).
Data_Dictionary.md: Detailed information file about the features in the provided datasets.
AZDIAS_Feature_Summary.csv: Summary of feature attributes for demographics data; 85 features (rows) x 4 columns

Each row of the demographics files represents a single person, but also includes information outside of individuals, including information about their household, building, and neighborhood. We use this information to cluster the general population into groups with similar demographic properties. Then, we see how the people in the customers' dataset fit into those created clusters. The hope here is that certain clusters are over-represented in the customers' data, as compared to the general population; those over-represented clusters will be assumed to be part of the core userbase. This information can then be used for further applications, such as targeting for a marketing campaign.

mhomaid1 / identify-customer-segments Goto Github PK

identify-customer-segments's Introduction

Identify customer segments with Arvato

Installation

File Descriptions

Data

identify-customer-segments's People

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent