ibm / inFairness


PyTorch package to train and audit ML models for Individual Fairness

Home Page: https://ibm.github.io/inFairness

License: Apache License 2.0

Python 100.00%
individual-fairness machine-learning pytorch fairness-ai trustworthy-machine-learning fairness responsible-ai

infairness's Introduction



Individual Fairness and inFairness

Intuitively, an individually fair Machine Learning (ML) model treats similar inputs similarly. Formally, the leading notion of individual fairness is metric fairness (Dwork et al., 2011); it requires:

$$ d_y (h(x_1), h(x_2)) \leq L d_x(x_1, x_2) \quad \forall \quad x_1, x_2 \in X $$

Here, $h: X \rightarrow Y$ is an ML model, where $X$ and $Y$ are the input and output spaces; $d_x$ and $d_y$ are metrics on the input and output spaces; and $L \geq 0$ is a Lipschitz constant. The inequality states that the distance between the model outputs for any two inputs $x_1$ and $x_2$ is upper-bounded by the (scaled) fair distance between those inputs. The fair metric $d_x$ encodes our intuition of which samples should be treated similarly by the ML model, so enforcing this constraint ensures that input samples considered similar by $d_x$ receive similar model outputs.
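
As a concrete illustration (a minimal sketch, not part of the library: the linear model, the plain Euclidean metrics, and the choice of $L$ below are placeholders), the inequality can be checked empirically for a pair of inputs:

import torch

# Placeholder model h: X -> Y and two inputs considered similar under the fair metric
h = torch.nn.Linear(4, 2)
x1 = torch.randn(4)
x2 = x1 + 0.01 * torch.randn(4)

# Placeholder metrics: plain Euclidean distances on the input and output spaces
d_x_val = torch.dist(x1, x2).item()
d_y_val = torch.dist(h(x1), h(x2)).item()

# Metric fairness with Lipschitz constant L requires d_y <= L * d_x
L = 1.0
print(f"d_y = {d_y_val:.4f}, L * d_x = {L * d_x_val:.4f}, satisfied: {d_y_val <= L * d_x_val}")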

inFairness is a PyTorch package that supports auditing, training, and post-processing ML models for individual fairness. At its core, the library implements the key components of the individual fairness pipeline: $d_x$, the distance in the input space; $d_y$, the distance in the output space; and the learning algorithms that optimize for the inequality above.

For an in-depth tutorial on Individual Fairness and the inFairness package, please watch this tutorial. Also, take a look at the examples folder for illustrative use-cases and try the Fairness Playground demo. For group fairness examples, see AIF360.


Installation

inFairness can be installed using pip:

pip install inFairness

Alternatively, if you wish to install the latest development version, you can install directly by cloning this repository:

git clone <git repo url>
cd inFairness
pip install -e .
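
After installing, a quick sanity check is to import one of the package's modules (a minimal sketch; distances.EuclideanDistance is one of the classes used in the example below):

from inFairness import distances

# If the import succeeds, the package is installed correctly
print(distances.EuclideanDistance)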

Features

inFairness currently supports:

  1. Learning individually fair metrics : [Docs]
  2. Training of individually fair models : [Docs]
  3. Auditing pre-trained ML models for individual fairness : [Docs]
  4. Post-processing for Individual Fairness : [Docs]
  5. Individually fair ranking : [Docs]

Contributing

We welcome contributions from the community in any form, whether it is a new fair algorithm, a new fair metric, a new use-case, or simply a report of an issue or enhancement in the package. To contribute code to the package, please follow these steps:

  1. Clone this git repository to your local system
  2. Set up your environment by installing the dependencies: pip3 install -r requirements.txt and pip3 install -r build_requirements.txt
  3. Add your code contribution to the package. Please refer to the inFairness folder for an overview of the directory structure
  4. Add appropriate unit tests in the tests folder
  5. Once you are ready to commit code, check for the following:
    1. Check coding style compliance using: flake8 inFairness/. This command lists all stylistic violations found in the code; please fix as many of them as you can
    2. Ensure all the test cases pass using: coverage run --source inFairness -m pytest tests/. All unit tests need to pass before code can be merged into the package.
  6. Finally, commit your code and raise a Pull Request.

Tutorials

The examples folder contains tutorials from different fields illustrating how to use the package.

Minimal example

First, you need to import the relevant packages

from inFairness import distances
from inFairness.fairalgo import SenSeI

The inFairness.distances module implements various distance metrics on the input and output spaces, and the inFairness.fairalgo module implements various individually fair learning algorithms, with SenSeI being one particular algorithm.

Thereafter, we instantiate the distance metrics, fit them on the training data, and set up the fair algorithm:

distance_x = distances.SVDSensitiveSubspaceDistance()
distance_y = distances.EuclideanDistance()

distance_x.fit(X_train=data, n_components=50)

# Finally instantiate the fair algorithm
fairalgo = SenSeI(network, distance_x, distance_y, lossfn, rho=1.0, eps=1e-3, lr=0.01, auditor_nsteps=100, auditor_lr=0.1)

Finally, you can train the fairalgo as you would train your standard PyTorch deep neural network:

fairalgo.train()

for epoch in range(EPOCHS):
    for x, y in train_dl:
        optimizer.zero_grad()
        result = fairalgo(x, y)
        result.loss.backward()
        optimizer.step()
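
After training, predictions come from the underlying network as usual. A minimal sketch of inference (it reuses network, fairalgo, and train_dl from the snippets above, and assumes optimizer was a standard PyTorch optimizer over the network's parameters; in practice you would evaluate on a held-out set rather than a training batch):

import torch

fairalgo.eval()

with torch.no_grad():
    # Reuse a batch from the dataloader above purely for illustration
    x_batch, _ = next(iter(train_dl))
    y_pred = network(x_batch)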

Authors


Mikhail Yurochkin

Mayank Agarwal

Aldo Pareja

Onkar Bhardwaj


infairness's Issues

Issue running the basic example

Hi,

I was trying to run the basic example provided here: https://hub.gke2.mybinder.org/user/ibm-infairness-h9o8a789/doc/workspaces/auto-6/tree/examples/adult-income-prediction/adult_income_prediction.ipynb
However, I am running into the following issue:

TypeError Traceback (most recent call last)
Input In [10], in <cell line: 8>()
5 distance_x = distances.LogisticRegSensitiveSubspace()
6 distance_y = distances.SquaredEuclideanDistance()
----> 8 distance_x.fit(X_train, protected_idxs)
9 distance_y.fit(num_dims=output_size)
11 distance_x.to(device)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/inFairness/distances/logistic_sensitive_subspace.py:77, in LogisticRegSensitiveSubspace.fit(self, data_X, data_SensitiveAttrs, protected_idxs, keep_protected_idxs, autoinfer_device)
42 """Fit Logistic Regression Sensitive Subspace distance metric
43
44 Parameters
(...)
73 on CPU.
74 """
76 if data_SensitiveAttrs is not None and protected_idxs is None:
---> 77 basis_vectors_ = self.compute_basis_vectors_data(
78 X_train=data_X, y_train=data_SensitiveAttrs
79 )
81 elif data_SensitiveAttrs is None and protected_idxs is not None:
82 basis_vectors_ = self.compute_basis_vectors_protected_idxs(
83 data_X,
84 protected_idxs=protected_idxs,
85 keep_protected_idxs=keep_protected_idxs,
86 )

File /srv/conda/envs/notebook/lib/python3.8/site-packages/inFairness/distances/logistic_sensitive_subspace.py:164, in LogisticRegSensitiveSubspace.compute_basis_vectors_data(self, X_train, y_train)
161 X_train = datautils.convert_tensor_to_numpy(X_train)
162 y_train = datautils.convert_tensor_to_numpy(y_train)
--> 164 self.assert_sensitiveattrs_binary(y_train)
166 basis_vectors_ = []
167 outdim = y_train.shape[-1]

File /srv/conda/envs/notebook/lib/python3.8/site-packages/inFairness/distances/logistic_sensitive_subspace.py:185, in LogisticRegSensitiveSubspace.assert_sensitiveattrs_binary(self, sensitive_attrs)
183 def assert_sensitiveattrs_binary(self, sensitive_attrs):
--> 185 assert validationutils.is_tensor_binary(
186 sensitive_attrs
187 ), "Sensitive attributes are required to be binary to learn the metric. Please binarize these attributes before fitting the metric."

File /srv/conda/envs/notebook/lib/python3.8/site-packages/inFairness/utils/validationutils.py:19, in is_tensor_binary(data)
5 """Checks if the data is binary (0/1) or not. Return True if it is binary data
6
7 Parameters
(...)
15 True if data is binary. False if not
16 """
18 nonbindata = (data != 0) & (data != 1)
---> 19 has_nonbin_data = True in nonbindata
20 return not has_nonbin_data

TypeError: argument of type 'bool' is not iterable
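
For context, one possible cause of this error is that (data != 0) & (data != 1) evaluates to a plain Python bool when data is not a NumPy array or tensor (for example, a scalar or a plain Python list), in which case True in nonbindata is not a valid membership test. A hedged sketch of a more defensive check, for illustration only and not the library's actual fix:

import numpy as np

def is_tensor_binary(data):
    """Return True if every entry of data is 0 or 1."""
    data = np.asarray(data)                 # accept scalars, lists, and arrays
    nonbindata = (data != 0) & (data != 1)  # elementwise "not 0 and not 1"
    return not np.any(nonbindata)           # np.any never relies on `in`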

travis builds terminating

Travis CI builds are terminating, citing: "Job has been terminated due to insufficient credits balance". Investigate migrating to GitHub Actions.
