
CARLA - Counterfactual And Recourse Library

CARLA is a Python library to benchmark counterfactual explanation and recourse models. It comes out of the box with commonly used datasets and various machine learning models, and it is designed with extensibility in mind: easily include your own counterfactual methods, new machine learning models, or other datasets. Find extensive documentation here! Our arXiv paper can be found here.

What is algorithmic recourse? As machine learning (ML) models are increasingly deployed in high-stakes applications, there has been growing interest in providing recourse to individuals adversely impacted by model predictions (e.g., below we depict the canonical recourse example of an applicant whose loan has been denied). This library provides a starting point for researchers and practitioners alike who wish to understand the inner workings of various counterfactual explanation and recourse methods, and the assumptions that went into their design.

(Figure: motivating example of recourse for a denied loan applicant)

Notebooks / Examples

  • Getting Started (notebook): Source
  • Causal Recourse (notebook): Source
  • Plotting (notebook): Source
  • Benchmarking (notebook): Source
  • Adding your own Data: Source
  • Adding your own ML-Model: Source (a minimal wrapper sketch follows below)
  • Adding your own Recourse Method: Source
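
For instance, wrapping your own classifier amounts to implementing the black-box model interface. A minimal sketch, assuming the MLModel base class and member names from the "Adding your own ML-Model" notebook (none of them verified here):

from carla import MLModel

class MySklearnModel(MLModel):
    # A minimal sketch of a custom black-box wrapper; the required members
    # (feature_input_order, backend, raw_model, predict, predict_proba)
    # are assumptions based on the extension notebook.
    def __init__(self, data, clf):
        super().__init__(data)
        self._clf = clf  # a fitted sklearn classifier

    @property
    def feature_input_order(self):
        return list(self._clf.feature_names_in_)  # sklearn >= 1.0

    @property
    def backend(self):
        return "sklearn"

    @property
    def raw_model(self):
        return self._clf

    def predict(self, x):
        return self._clf.predict(x)

    def predict_proba(self, x):
        return self._clf.predict_proba(x)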

Available Datasets

| Name                | Source |
|---------------------|--------|
| Adult               | Source |
| COMPAS              | Source |
| Give Me Some Credit | Source |
| HELOC               | Source |

Provided Machine Learning Models

| Model        | Description                                                                  | Tensorflow | Pytorch | Sklearn | XGBoost |
|--------------|------------------------------------------------------------------------------|------------|---------|---------|---------|
| ANN          | Artificial neural network with 2 hidden layers and ReLU activation function. | X          | X       |         |         |
| LR           | Linear model with no hidden layer and no activation function.                | X          | X       |         |         |
| RandomForest | Tree ensemble model.                                                         |            |         | X       |         |
| XGBoost      | Gradient boosting.                                                           |            |         |         | X       |
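
Each catalog model can be loaded for a specific ML framework. A minimal sketch, assuming the backend argument that the issues below pass to MLModelCatalog (the keyword name is an assumption; the issues pass it positionally):

from carla import OnlineCatalog, MLModelCatalog

# A minimal sketch: select the framework via the backend argument.
dataset = OnlineCatalog("adult")
model_tf = MLModelCatalog(dataset, "ann", backend="tensorflow")
model_pt = MLModelCatalog(dataset, "ann", backend="pytorch")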

Implemented Counterfactual Methods

Which ML framework a counterfactual method currently works with depends on its underlying implementation. We plan to make all recourse methods available for all ML frameworks. The latest state can be found here:

| Recourse Method                                             | Paper  | Tensorflow | Pytorch | Sklearn | XGBoost |
|-------------------------------------------------------------|--------|------------|---------|---------|---------|
| Actionable Recourse (AR)                                    | Source | X          | X       |         |         |
| Causal Recourse                                             | Source | X          | X       |         |         |
| CCHVAE                                                      | Source |            | X       |         |         |
| Contrastive Explanations Method (CEM)                       | Source | X          |         |         |         |
| Counterfactual Latent Uncertainty Explanations (CLUE)       | Source |            | X       |         |         |
| CRUDS                                                       | Source |            | X       |         |         |
| Diverse Counterfactual Explanations (DiCE)                  | Source | X          | X       |         |         |
| Feasible and Actionable Counterfactual Explanations (FACE)  | Source | X          | X       |         |         |
| FeatureTweak                                                | Source |            |         | X       | X       |
| FOCUS                                                       | Source |            |         | X       | X       |
| Growing Spheres (GS)                                        | Source | X          | X       |         |         |
| Revise                                                      | Source |            | X       |         |         |
| Wachter                                                     | Source |            | X       |         |         |

Installation

Requirements

  • python3.7
  • pip

Install via pip

pip install carla-recourse

Quickstart

from carla import OnlineCatalog, MLModelCatalog
from carla.recourse_methods import GrowingSpheres

# load a catalog dataset
data_name = "adult"
dataset = OnlineCatalog(data_name)

# load artificial neural network from catalog
model = MLModelCatalog(dataset, "ann")

# get factuals from the data to generate counterfactual examples
factuals = dataset.raw.iloc[:10]

# load a recourse model and pass black box model
gs = GrowingSpheres(model)

# generate counterfactual examples
counterfactuals = gs.get_counterfactuals(factuals)
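
Generated counterfactuals can then be scored against common metrics (see the Benchmarking notebook above). A minimal sketch, continuing the quickstart and assuming a top-level Benchmark class with a run_benchmark method, as used in that notebook:

from carla import Benchmark

# A minimal sketch: Benchmark and run_benchmark() are assumptions based on
# the Benchmarking notebook; model, gs and factuals come from the quickstart.
benchmark = Benchmark(model, gs, factuals)
results = benchmark.run_benchmark()  # one row of metrics per factual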

Contributing

Requirements

  • python3.7-venv (when not already shipped with python3.7)
  • Recommended: GNU Make

Installation

Using make:

make requirements

Using python directly or within activated virtual environment:

pip install -U pip setuptools wheel
pip install -e .

Testing

Using make:

make test

Using python directly or within activated virtual environment:

pip install -r requirements-dev.txt
python -m pytest test/*

Linting and Styling

We use pre-commit hooks within our build pipelines to enforce:

  • Python linting with flake8.
  • Python styling with black.

Install pre-commit with:

make install-dev

Using python directly or within activated virtual environment:

pip install -r requirements-dev.txt
pre-commit install

License

carla is licensed under the MIT License. See the LICENSE file for more details.

Citation

This project was accepted to NeurIPS 2021 (Datasets and Benchmarks Track). If you use this codebase, please cite:

@misc{pawelczyk2021carla,
      title={CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms},
      author={Martin Pawelczyk and Sascha Bielawski and Johannes van den Heuvel and Tobias Richter and Gjergji Kasneci},
      year={2021},
      eprint={2108.00783},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Please also cite the original authors' work.

carla's People

Contributors

ah-ansari, aredelmeier, indyfree, johanvandenheuvel, philoso-fish, voulgaris-sot


carla's Issues

yNN computation

Hi!
Just a quick clarification question about the code used to calculate yNN. I noticed that the labels of the closest neighbours (neighbour_label) in nearest_neighbours.py are compared to something called 'cf_label', which is defined as

cf_label = row[mlmodel.data.target]

If I'm not mistaken, cf_label is then the TRUE label of the test observation (not the predicted label).

This seems to differ from what is written under 4.2 "yNN" in the arXiv paper. Do you think the predicted label, rather than the TRUE label, should be compared instead?

Thanks for your help.
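
For reference, a minimal sketch of the predicted-label variant the question proposes (a simplified matching fraction, not the paper's exact yNN formula and not CARLA's implementation):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def ynn_predicted(counterfactuals, X, predict_fn, k=5):
    # Fraction of the k nearest neighbours whose *predicted* label agrees
    # with the counterfactual's predicted label; predict_fn is a
    # hypothetical callable returning class labels.
    nbrs = NearestNeighbors(n_neighbors=k).fit(X)
    neighbour_preds = np.asarray(predict_fn(X))
    cf_preds = np.asarray(predict_fn(counterfactuals))
    _, idx = nbrs.kneighbors(counterfactuals)
    return float(np.mean(neighbour_preds[idx] == cf_preds[:, None]))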

Loading ML model from catalog if directory already exists

For example, if someone first loads the tensorflow ann model into the cache at '../Users/xxx/carla/models/ann/adult/ann.h5', it is not possible to load the pytorch model as ann.pt into the same directory.

Even though the directory already exists, the if-condition
if not os.path.exists(cache_path):
does not handle this case, since the new file ann.pt is not yet contained in it. Creating the path then leads to an error.
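
A minimal sketch of a fix, assuming cache_path points at the backend-specific model file rather than at the directory (download_model is a hypothetical callable):

import os

def ensure_cached(cache_path, download_model):
    # Create the directory idempotently, then check for the *file* itself,
    # so ann.h5 and ann.pt can share the same cache directory.
    os.makedirs(os.path.dirname(cache_path), exist_ok=True)
    if not os.path.isfile(cache_path):
        download_model(cache_path)  # hypothetical fetch/train step
    return cache_path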

Refactor D2 and D3 distances

We want the functions to be immutable & deterministic, and not force the caller to use dataframes.

  • Calculate the range before calling the d2 and d3 distances and pass it as a parameter
  • Don't remove target labels within the distance function; these functions should just calculate distances between any two lists/arrays (a minimal sketch follows below)
  • Write unit tests with empty and 1-element inputs
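
A minimal sketch of such a pure distance function; the range-normalized L1 form is an assumption and may differ from CARLA's actual D2 definition:

import numpy as np

def d2_distance(a, b, feature_range):
    # Pure and deterministic: accepts any two lists/arrays plus the
    # precomputed per-feature range; no dataframes, no target handling.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    rng = np.asarray(feature_range, dtype=float)
    return float(np.sum(np.abs(a - b) / rng))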

Put CARLA on PyPI

  • Remove all print statements and replace them with logger calls
  • Update setup.py with meta information
  • Create setup.cfg that links the README (sketch below)
  • Add __version__ to the package
  • Publish the package to GitHub
  • Upload to PyPI
  • Put the last 2 steps into the CI/CD pipeline
  • Optional: Automatically increase the version

Also check out this guide: https://realpython.com/pypi-publish-python-package/
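
For the setup.cfg item above, a minimal sketch using standard setuptools keys; the attr: line assumes __version__ lives in carla/__init__.py:

[metadata]
name = carla-recourse
version = attr: carla.__version__
long_description = file: README.md
long_description_content_type = text/markdown
license = MIT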

Improve Face method

Improve the graph search method of Face, such that it can deal with continuous features.

However, the authors do not mention how to go about this either.

Questions about parameters used in CARLA paper

Thank you for all the work you put into the CARLA package. It's a great help when trying to compare different counterfactual methods! My questions pertain to the parameters/specifications used in the CARLA paper.

  1. What was the train/test split for the ann model? Do you have a way to access the train/test data in the package so that I can fit my own model with the same data?

  2. In the paper, the adult data set seems to fix the features age, sex, and race. However, in the dataset class, DataCatalog("adult").immutables gives only "age" and "sex" as immutables. Can you comment on whether there was a change since the simulations in the paper were run, or whether the default in the package is wrong?

  3. I am getting an error when running the cem/cem-vae models. When running:

mlmodel = MLModelCatalog(dataset, "ann", backend)
CEM(factuals, mlmodel, hyperparams)

where hyperparams comes from the experimental_setup.yaml file, I get the following error:

ValueError: For hidden_layer is no default value defined, please pass this key and its value in hyperparams

Changing the hyperparameters "ae_params" dictionary to:
hyperparams["ae_params"] = {'hidden_layer': [20, 10, 7], 'train_ae': True, 'epochs': 5}

does the trick. Can you let me know if this is the same hidden_layer dimension used in the paper?

  4. The give_me_some_credit data set models the response "SeriousDlqin2yrs", which is a negative response. However, the predict_negative_instances() function returns factuals as those with a predicted probability of less than 0.5 (specifically 1675 rows). I would argue that these are the positive instances, and that the negative instances are those with a probability higher than 0.5. Do you have any comment about this?

  5. Finally, are the results in Table 2 of the paper based on all factuals with a predicted probability less than 0.5 (i.e., 1675 rows for give_me_some_credit and 39954 rows for adult)?

Thanks for all of your help!
Annabelle

Add save path for experiment output

Until now, every call of run_experiment saves its results in the cache. It would be good if we could define an argparse argument for individual save paths.
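
A minimal sketch of such an argument (the flag name and wiring are hypothetical):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--save-path",
    default=None,
    help="Directory for experiment output; falls back to the cache.",
)
args = parser.parse_args()
# run_experiment(save_path=args.save_path)  # hypothetical wiring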

Use type hinting

  • Use type hints for method parameters
  • Add automatic type checks to the workflow

Loading Pytorch model

It is currently not possible to load a pytorch model which was saved in the benchmarking repository.

To load such a model it is necessary to have access to the original class, as discussed here and here.

Loading it without access to the class causes the error ModuleNotFoundError: No module named 'ML_Model'.
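
One common workaround is to save only the weights instead of the pickled module. A minimal sketch (the layer sizes are hypothetical; the architecture must be rebuilt in code before loading):

import torch
import torch.nn as nn

# Saving the state_dict avoids pickling a reference to the original
# 'ML_Model' module; only the weights are stored.
model = nn.Sequential(nn.Linear(20, 18), nn.ReLU(), nn.Linear(18, 2))
torch.save(model.state_dict(), "ann.pt")

# Loading: rebuild the same architecture, then restore the weights.
model2 = nn.Sequential(nn.Linear(20, 18), nn.ReLU(), nn.Linear(18, 2))
model2.load_state_dict(torch.load("ann.pt"))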

MLModelCatalog predict method incompatible with pipeline

The MLModelCatalog predict method has the following signature:

def predict(
    self, x: Union[np.ndarray, pd.DataFrame, torch.Tensor, tf.Tensor]
) -> Union[np.ndarray, pd.DataFrame, torch.Tensor, tf.Tensor]:

However, if the MLModelCatalog pipeline is enabled, then x is also the input for

def perform_pipeline(self, df: pd.DataFrame) -> pd.DataFrame:

That is, the predict function can accept input types that are incompatible with the possible model settings.
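
A minimal sketch of one possible guard: coerce array-like input to a DataFrame before the pipeline runs (using feature_input_order as column names is an assumption):

import numpy as np
import pandas as pd

def ensure_dataframe(x, feature_input_order):
    # perform_pipeline expects a pd.DataFrame, so wrap ndarray/tensor
    # inputs; tensors are converted via NumPy's array protocol.
    if isinstance(x, pd.DataFrame):
        return x
    return pd.DataFrame(np.asarray(x), columns=feature_input_order)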

Write AE model structure and training

Keep the model structure and training of the AE inside CARLA, and train a model if one is needed inside a recourse method. After training, the model is saved in our cache and can be loaded from there if the recourse method is called again.

The training doesn't take long, so we don't need to keep trained models in a repository.
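
A minimal sketch of the cache-or-train pattern described above (helper names are hypothetical):

import os
import torch

def load_or_train_ae(cache_path, train_fn):
    # Reuse a cached autoencoder if present; otherwise train and cache it.
    if os.path.isfile(cache_path):
        return torch.load(cache_path)
    model = train_fn()  # hypothetical training routine
    torch.save(model, cache_path)
    return model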

Add documentation

Construct a documentation page from our docstrings via Sphinx and integrate it via GitHub Pages.

Would NAMs be another interesting model to add?

We recently published Neural Additive Models (NAMs) at NeurIPS 2021, which combine the interpretability of generalized additive models with neural nets (see the architecture shown below). We also evaluated NAMs on some of the datasets here, including COMPAS and adult. Do you think they'd make a good addition to this repo too?

The source code for NAMs is open-sourced in tensorflow (official version) and pytorch!
For a quick summary of NAMs, look at this thread.

(Figure: NAM architecture)

Automate docs building process

With #81 we have documentation hosted on Read the Docs; now we want an automated build that pushes to it. According to @Philoso-Fish, this requires a GitHub webhook that can be retrieved via the admin interface.

Restructure data api/ catalog

Given the interplay between correct normalization, encoding, and feature order, which is specific to a certain, arbitrary black-box model, we need setter methods for the class properties encoded, normalized, and encoded_normalized.

This method can use the black-box model and its pipeline as input to build the required dataframes.

Fix output for CEM

The current output of CEM will not work with the benchmarking process.

  • Refactor output of CEM
  • Integrate in benchmarking process

Autoencoder training for CEM

The current implementation of CEM does not allow training an autoencoder when CEM is called.
This leads to errors when no pretrained autoencoder is available.

  • Add training inside the constructor
  • Add a training flag to the hyperparameters
  • Add tests

Inaccurate Cast in constraint_violation Check

Hi,

When I tried to use the benchmark to test constraint violations, I found an inaccurate cast in violations.py Line 34. When casting a float to an int using astype, we can get inaccurate results, which breaks the constraint_violation check. Instead, we should round before typecasting.

For example:

df_decoded_cfs[model.data.continous] = (
    df_decoded_cfs[model.data.continous].round().astype("int64")
)

Best practice for "Imputers"

What about imputers?

Do you have any recommendations or best practices for addressing them? (A minimal sketch follows below.)

  • When they need to be applied to the training data as well, before producing predictions.
  • When they need to be distinct for categorical and continuous values.
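
A minimal sketch with scikit-learn, as referenced above (data and column names are hypothetical):

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer

# Hypothetical toy data with missing values in both column types.
X_train = pd.DataFrame({
    "age": [25, None, 40],
    "workclass": ["Private", None, "State-gov"],
})

# Distinct imputers per column type, fitted on the training data only.
imputer = ColumnTransformer([
    ("cont", SimpleImputer(strategy="median"), ["age"]),
    ("cat", SimpleImputer(strategy="most_frequent"), ["workclass"]),
])
X_train_imp = imputer.fit_transform(X_train)  # fit + transform on train
# At prediction time, reuse the fitted transformer: imputer.transform(X_new)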
