
cfa's Introduction

The R package faircause performs Causal Fairness Analysis and implements the methods described in the paper Causal Fairness Analysis (Plecko & Bareinboim, 2022). We refer you to the manuscript for the full theoretical details and methodology. Below we give quick installation instructions and a worked example to help you get started.

Installation

You can install faircause from this GitHub repository by using the devtools package:

devtools::install_github("dplecko/faircause")

Please note that faircause is currently at its first version, 0.0.0.9000, meaning that it has not yet been thoroughly tested. Issues and bug reports are welcome and much appreciated.

Example

We show an example of how to use the faircause package on the US Government Census 2018 dataset, collected by the American Community Survey. The dataset contains information on 204,309 employees of the US government, including demographic information Z (age, race, location, citizenship), education and work-related information W, and yearly earnings Y. The protected attribute X we consider in this case is sex (x_1 male, x_0 female).

A data scientist analyzing the Census dataset observes the following:

library(faircause)

# keep only the first 20,000 rows of the dataset
census <- head(faircause::gov_census, n = 20000L)
TV <- mean(census$salary[census$sex == "male"]) -
  mean(census$salary[census$sex == "female"])

TV
#> [1] 15053.69

In the first step, the data scientist computed the average disparity in yearly salary, as measured by the total variation (TV):

 E[Y \mid x_1] - E[Y \mid x_0] = \$15053.
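The TV measure is just a difference of group means, so it can be reproduced without the package. Below is a minimal sketch on a made-up data frame; the helper tv_measure() and the toy values are assumptions for illustration and are not part of faircause.

```r
# Hypothetical helper (not exported by faircause): difference in the mean
# outcome y between the groups x == x1 and x == x0.
tv_measure <- function(data, x, y, x0, x1) {
  mean(data[[y]][data[[x]] == x1]) - mean(data[[y]][data[[x]] == x0])
}

# Toy data, invented for illustration only.
toy <- data.frame(
  sex    = c("male", "male", "female", "female"),
  salary = c(60000, 50000, 45000, 35000)
)

# Mean male salary (55000) minus mean female salary (40000).
tv_toy <- tv_measure(toy, x = "sex", y = "salary", x0 = "female", x1 = "male")
tv_toy
#> [1] 15000
```

Applied to the census subset above with x = "sex" and y = "salary", the same helper should reproduce the TV value computed earlier, since it performs the same subtraction.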

The data scientist has read the Causal Fairness Analysis paper and now wants to understand how this observed disparity relates to the underlying causal mechanisms that generated it. To this end, he constructs the Standard Fairness Model (see Plecko & Bareinboim, Definition 4) associated with this dataset:

X <- "sex" # protected attribute
Z <- c("age", "race", "hispanic_origin", "citizenship", "nativity", 
       "economic_region") # confounders
W <- c("marital", "family_size", "children", "education_level", "english_level", 
       "hours_worked", "weeks_worked", "occupation", "industry") # mediators
Y <- "salary" # outcome
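Before passing these variable sets on, it can help to confirm that every name exists in the data and that no variable is listed twice. The following is a sketch only; check_sfm() is a hypothetical helper written for this example, not a faircause function, and the toy data frame is invented.

```r
# Hypothetical helper (not part of faircause): verify that the variable sets
# of a Standard Fairness Model exist in the data and do not overlap.
check_sfm <- function(data, X, Z, W, Y) {
  vars <- c(X, Z, W, Y)
  missing_cols <- setdiff(vars, names(data))
  if (length(missing_cols) > 0L) {
    stop("columns not found in data: ", paste(missing_cols, collapse = ", "))
  }
  if (anyDuplicated(vars) > 0L) {
    stop("a variable is listed more than once across X, Z, W, Y")
  }
  invisible(TRUE)
}

# Toy data, invented for illustration only.
toy <- data.frame(
  sex             = c("male", "female"),
  age             = c(40, 35),
  education_level = c(16, 18),
  salary          = c(50000, 45000)
)

# Passes silently: all columns exist and each is assigned to one set.
ok <- check_sfm(toy, X = "sex", Z = "age", W = "education_level", Y = "salary")

# Fails: "age" is listed both as a confounder and as a mediator.
bad <- try(check_sfm(toy, X = "sex", Z = "age", W = "age", Y = "salary"),
           silent = TRUE)
```

With the package loaded, the same check could be run as check_sfm(census, X, Z, W, Y) on the vectors defined above.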

Based on this causal structure of the variables, the data scientist now performs Causal Fairness Analysis by using the fairness_cookbook() function exported from the faircause package:

# decompose the total variation measure
set.seed(2022)
tvd <- fairness_cookbook(data = census, X = X, W = W, Z = Z, Y = Y, 
                         x0 = "female", x1 = "male")

# visualize the x-specific measures of direct, indirect, and spurious effect
autoplot(tvd, decompose = "xspec", dataset = "Census 2018")

The data scientist concludes that there is a substantial cancellation of the direct and indirect effects, namely:

  • the direct effect explains $10,300 of the observed disparity (that is, females would be paid more, had they been male in this case)
  • the indirect effect accounts for -$6,400 (cancelling out with the direct effect)
  • the spurious effect accounts for $1,000 of the observed variation

In particular, the dataset might show evidence of disparate treatment, which needs further investigation.
