socioeconomic-index

Introduction

A socioeconomic index, also known as a deprivation index, is a single number that summarizes the socioeconomic status of a predefined area. It combines multiple socioeconomic characteristics, weighted by their relative significance. Such an index allows direct comparison of socioeconomic status between regions and helps identify patterns and correlations between socioeconomic status and other attributes. This repository is our attempt at creating a socioeconomic index for Sri Lanka.

Dataset

We used the 2011 national census datasets. This repository contains a cleaned version of them. The table below lists each variable in the datasets and its name in the code.

| Dataset | Category | Variable | Name in Code |
|---|---|---|---|
| Household | Cooking Fuel | Firewood | coo_firewood |
| | | Kerosene | coo_kerosene |
| | | Gas | coo_gas |
| | | Electricity | coo_electricity |
| | | Sawdust / Paddy husk | coo_sawdust_paddyhusk |
| | | Other | coo_other |
| Household | Floor Material | Cement | flo_cement |
| | | Tile / Granite / Terrazzo | flo_tile_granite_terrazzo |
| | | Mud | flo_mud |
| | | Wood | flo_wood |
| | | Sand | flo_sand |
| | | Concrete | flo_concrete |
| | | Other | flo_other |
| Household | Housing | Permanent | hou_permanent |
| | | Semi-permanent | hou_semipermanent |
| | | Improvised | hou_improvised |
| | | Unclassified | hou_unclassified |
| Household | Lighting | National Grid | lig_nationalgrid |
| | | Hydro Power | lig_hydro |
| | | Kerosene | lig_kerosene |
| | | Solar Power | lig_solar |
| | | Biogas | lig_biogas |
| | | Other | lig_other |
| Household | Roof Material | Tile | roo_tile |
| | | Asbestos | roo_asbestos |
| | | Concrete | roo_concrete |
| | | Zinc / Aluminium sheet | roo_zinc_aluminium |
| | | Metal sheet | roo_metal |
| | | Cadjan / Palmyrah / Straw | roo_cadjan_palmyrah_straw |
| | | Other | roo_other |
| Household | Structure | Single - 1 story | str_single_1 |
| | | Single - 2 story | str_single_2 |
| | | Single - 3+ story | str_single_3 |
| | | Attached house / Annex | str_attachedhouse_annex |
| | | Flat | str_flat |
| | | Condominium | str_condominium |
| | | Twin house | str_twinhouse |
| | | Row / Line room | str_room |
| | | Hut / Shanty | str_hut_shanty |
| Household | Tenure | Owned | ten_owned |
| | | Rent / Lease - government owned | ten_rent_gov |
| | | Rent / Lease - private owned | ten_rent_pvt |
| | | Rent free | ten_rent_free |
| | | Encroached | ten_encroached |
| | | Other | ten_other |
| Household | Toilet Facilities | Water Seal - connected to sewer | toi_waterseal_sewer |
| | | Water Seal - connected to septic tank | toi_waterseal_tank |
| | | Pour flush | toi_pourflush |
| | | Direct pit | toi_directpit |
| | | Other | toi_other |
| | | No toilet | toi_none |
| Household | Wall Material | Brick | wal_brick |
| | | Cement block / Stone | wal_cementblock_stone |
| | | Cabook | wal_cabook |
| | | Soil brick | wal_soilbrick |
| | | Mud | wal_mud |
| | | Cadjan / Palmyrah | wal_cadjan_palmyrah |
| | | Plank / Metal sheet | wal_plank_metal |
| | | Other | wal_other |
| Household | Waste Disposal | Collected by government | was_collect_gov |
| | | Burned | was_burn |
| | | Buried | was_bury |
| | | Composted | was_compost |
| | | Disposed into environment | was_dispose_env |
| | | Other | was_other |
| Household | Water Source | Protected well - within premises | wat_well_prot_in |
| | | Protected well - outside premises | wat_well_prot_out |
| | | Unprotected well | wat_well_unprot |
| | | Tap - within unit | wat_tap_unit_in |
| | | Tap - outside unit but within premises | wat_tap_prem_in |
| | | Tap - outside premises | wat_tap_prem_out |
| | | Rural water projects | wat_rural |
| | | Tube well | wat_tubewell |
| | | Bowser | wat_bowser |
| | | River / Tank / Stream | wat_river_tank_stream |
| | | Rain water | wat_rain |
| | | Bottled water | wat_bottled |
| | | Other | wat_other |
| Population | Age | 0 - 4 | age_y0_4 |
| | | 5 - 9 | age_y5_9 |
| | | 10 - 14 | age_y10_14 |
| | | 15 - 19 | age_y15_19 |
| | | 20 - 24 | age_y20_24 |
| | | 25 - 29 | age_y25_29 |
| | | 30 - 34 | age_y30_34 |
| | | 35 - 39 | age_y35_39 |
| | | 40 - 44 | age_y40_44 |
| | | 45 - 49 | age_y45_49 |
| | | 50 - 54 | age_y50_54 |
| | | 55 - 59 | age_y55_59 |
| | | 60 - 64 | age_y60_64 |
| | | 65 - 69 | age_y65_69 |
| | | 70 - 74 | age_y70_74 |
| | | 75 - 79 | age_y75_79 |
| | | 80 - 84 | age_y80_84 |
| | | 85 - 89 | age_y85_89 |
| | | 90 - 94 | age_y90_94 |
| | | 95 & above | age_y95_above |
| Population | Education | Primary | edu_primary |
| | | Secondary | edu_secondary |
| | | O Level | edu_olevel |
| | | A Level | edu_alevel |
| | | Degree & Above | edu_degree |
| | | No Schooling | edu_none |
| Population | Employment | Employed | emp_employed |
| | | Unemployed | emp_unemployed |
| | | Economically Inactive | emp_inactive |
| Population | Gender | Male | gen_male |
| | | Female | gen_female |
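
The variable names above share a short category prefix (`coo_`, `flo_`, `gen_`, and so on), which makes it easy to recover the category of a column from its name. The following is a minimal sketch of that idea, not code from this repository; the helper `group_by_prefix` and the hand-picked `columns` list are hypothetical.

```python
from collections import defaultdict

# A hand-picked subset of the variable names listed above (for illustration only).
columns = ["coo_firewood", "coo_gas", "flo_cement", "flo_mud", "gen_male", "gen_female"]

def group_by_prefix(names):
    """Group variable names by the text before the first underscore."""
    groups = defaultdict(list)
    for name in names:
        prefix, _, _ = name.partition("_")
        groups[prefix].append(name)
    return dict(groups)

groups = group_by_prefix(columns)
for prefix, members in groups.items():
    print(prefix, members)
```

Grouping by prefix is convenient for any per-category operation, such as turning counts into shares of a category total.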

Note: The 2011 national census datasets are only available as summary counts at the Grama Niladhari Division (GND) level. The original categorical variables, surveyed at the household level, have been converted to binary variables and aggregated for each GND. This aggregation obscures certain correlations between variables, so our results are suboptimal.
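
To illustrate why aggregation hides correlations, here is a small sketch using pandas. The household records are invented for the example (they are not census data), and `aggregate_counts` is a hypothetical helper. Two sets of households with very different fuel-to-floor associations produce exactly the same GND-level counts.

```python
import pandas as pd

# Hypothetical household-level records for one GND (not real census data).
households_a = pd.DataFrame({
    "gnd": ["G1"] * 4,
    "coo": ["gas", "gas", "firewood", "firewood"],
    "flo": ["tile", "tile", "mud", "mud"],  # fuel and floor perfectly associated
})
# Same marginal totals, but fuel and floor are now unrelated.
households_b = households_a.assign(flo=["tile", "mud", "tile", "mud"])

def aggregate_counts(df):
    """One-hot encode each categorical column and sum the indicators per GND."""
    indicators = pd.get_dummies(df, columns=["coo", "flo"], dtype=int)
    return indicators.groupby("gnd").sum()

counts_a = aggregate_counts(households_a)
counts_b = aggregate_counts(households_b)
print(counts_a)
print(counts_a.equals(counts_b))
```

Both tables contain the same counts per variable, so the household-level association between cooking fuel and floor material cannot be recovered from the GND summary.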

Methodology

We employed principal component analysis (PCA) and extracted the first principal component to use as the socioeconomic index. We strongly recommend reading Vyas and Kumaranayake (2006) for a thorough justification of this method as well as an exploration of alternatives.
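
In symbols, using notation that is ours rather than the whitepaper's: let $z_{ij}$ be the standardized value of variable $j$ in GND $i$, let $R$ be the correlation matrix of the $p$ variables, and let $w_1$ be the unit-length eigenvector of $R$ with the largest eigenvalue $\lambda_1$. The first principal component score, used as the index, is then:

```latex
\[
R\,w_1 = \lambda_1 w_1, \qquad \lVert w_1 \rVert = 1, \qquad
\text{index}_i = \sum_{j=1}^{p} w_{1j}\, z_{ij}
\]
```

The sign of $w_1$ is arbitrary, so whether a higher score means higher or lower socioeconomic status has to be checked against the weights of variables whose interpretation is clear.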

This whitepaper contains a thorough description of our process. In short, we followed this procedure:

  1. Curate the dataset to remove variables that are either redundant or non-indicative of socioeconomic status.
  2. Normalize the dataset with respect to each category within each GND.
  3. Standardize each variable.
  4. Run PCA on the standardized dataset.
  5. Extract the weights given by the first principal component.
  6. Multiply the standardized dataset by these weights.
  7. Sum the above scores for each GND to get the socioeconomic index.
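
Steps 2 to 7 can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the repository's code: the counts are made up, the function names are hypothetical, and step 2 is read here as converting counts into shares of each category's total within a GND, which may differ from the whitepaper's exact normalization.

```python
import numpy as np

# Hypothetical counts for 5 GNDs (rows); columns cover two categories.
names = ["coo_firewood", "coo_gas", "lig_nationalgrid", "lig_kerosene"]
counts = np.array([
    [80, 20, 70, 30],
    [60, 40, 85, 15],
    [30, 70, 95,  5],
    [90, 10, 50, 50],
    [45, 55, 90, 10],
], dtype=float)

def normalize_within_category(counts, names):
    """Step 2 (assumed reading): counts -> shares of each category's total per GND."""
    prefixes = np.array([n.split("_")[0] for n in names])
    shares = np.empty_like(counts)
    for prefix in np.unique(prefixes):
        cols = prefixes == prefix
        shares[:, cols] = counts[:, cols] / counts[:, cols].sum(axis=1, keepdims=True)
    return shares

def first_component_index(shares):
    """Steps 3-7: standardize, run PCA, weight by the first component, sum per GND."""
    z = (shares - shares.mean(axis=0)) / shares.std(axis=0)   # step 3
    corr = np.corrcoef(z, rowvar=False)                       # step 4
    eigenvalues, eigenvectors = np.linalg.eigh(corr)          # ascending eigenvalues
    weights = eigenvectors[:, -1]                             # step 5
    return z @ weights, weights                               # steps 6 and 7

shares = normalize_within_category(counts, names)
index, weights = first_component_index(shares)
print(dict(zip(names, weights.round(3))))
print(index.round(3))
```

The matrix product `z @ weights` performs the multiplication and the per-GND sum in one operation. The sign of the weights returned by an eigensolver is arbitrary, so the direction of the index should be checked before interpreting it.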

Results

We divided the GNDs into seven quantile groups by socioeconomic index and plotted the result as a choropleth map. Below are our results using the combined, household, and population datasets.
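
As a rough sketch of the binning step, the following assigns each score to one of seven equal-frequency groups using NumPy. The function `quantile_groups` and the random scores are hypothetical; the exact binning used for the maps is not specified here, so treat this as one plausible approach.

```python
import numpy as np

def quantile_groups(index, n_groups=7):
    """Assign each value to one of n_groups equal-frequency bins (1 = lowest)."""
    index = np.asarray(index, dtype=float)
    # Interior cut points at the 1/n, 2/n, ..., (n-1)/n quantiles.
    cuts = np.quantile(index, np.linspace(0, 1, n_groups + 1)[1:-1])
    # Count how many cut points lie strictly below each value.
    return np.searchsorted(cuts, index, side="left") + 1

# Stand-in scores; in practice these would be the per-GND index values.
rng = np.random.default_rng(0)
scores = rng.normal(size=700)
groups = quantile_groups(scores)
print(np.bincount(groups)[1:])  # number of values in each of the seven groups
```

Equal-frequency groups put roughly the same number of GNDs in each map colour, which keeps the choropleth readable even when the index distribution is skewed.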

Combined Dataset

Socioeconomic Index using the combined dataset

Household Dataset

Socioeconomic Index using the household dataset

Population Dataset

Socioeconomic Index using the population dataset

Contributors

virendias
