
CARLA - Counterfactual And Recourse Library

CARLA is a Python library to benchmark counterfactual explanation and recourse models. It comes out of the box with commonly used datasets and various machine learning models, and it is designed with extensibility in mind: easily include your own counterfactual methods, new machine learning models, or other datasets. Find extensive documentation here! Our arXiv paper can be found here.

What is algorithmic recourse? As machine learning (ML) models are increasingly deployed in high-stakes applications, there has been growing interest in providing recourse to individuals adversely impacted by model predictions (e.g., below we depict the canonical recourse example of an applicant whose loan has been denied). This library provides a starting point for researchers and practitioners alike who wish to understand the inner workings of various counterfactual explanation and recourse methods, and the assumptions that went into their design.

(Figure: motivating example of recourse for a denied loan applicant)

Notebooks / Examples

  • Getting Started (notebook): Source
  • Causal Recourse (notebook): Source
  • Plotting (notebook): Source
  • Benchmarking (notebook): Source
  • Adding your own Data: Source
  • Adding your own ML-Model: Source (a minimal wrapper sketch follows below)
  • Adding your own Recourse Method: Source
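
For instance, wrapping your own classifier amounts to implementing the black-box model interface. A minimal sketch, assuming the MLModel base class and member names from the "Adding your own ML-Model" notebook (none of them verified here):

from carla import MLModel

class MySklearnModel(MLModel):
    # A minimal sketch of a custom black-box wrapper; the required members
    # (feature_input_order, backend, raw_model, predict, predict_proba)
    # are assumptions based on the extension notebook.
    def __init__(self, data, clf):
        super().__init__(data)
        self._clf = clf  # a fitted sklearn classifier

    @property
    def feature_input_order(self):
        return list(self._clf.feature_names_in_)  # sklearn >= 1.0

    @property
    def backend(self):
        return "sklearn"

    @property
    def raw_model(self):
        return self._clf

    def predict(self, x):
        return self._clf.predict(x)

    def predict_proba(self, x):
        return self._clf.predict_proba(x)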

Available Datasets

| Name                | Source |
|---------------------|--------|
| Adult               | Source |
| COMPAS              | Source |
| Give Me Some Credit | Source |
| HELOC               | Source |

Provided Machine Learning Models

| Model        | Description                                                                  | Tensorflow | Pytorch | Sklearn | XGBoost |
|--------------|------------------------------------------------------------------------------|------------|---------|---------|---------|
| ANN          | Artificial neural network with 2 hidden layers and ReLU activation function. | X          | X       |         |         |
| LR           | Linear model with no hidden layer and no activation function.                | X          | X       |         |         |
| RandomForest | Tree ensemble model.                                                         |            |         | X       |         |
| XGBoost      | Gradient boosting.                                                           |            |         |         | X       |
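
Each catalog model can be loaded for a specific ML framework. A minimal sketch, assuming the backend argument that the issues below pass to MLModelCatalog (the keyword name is an assumption; the issues pass it positionally):

from carla import OnlineCatalog, MLModelCatalog

# A minimal sketch: select the framework via the backend argument.
dataset = OnlineCatalog("adult")
model_tf = MLModelCatalog(dataset, "ann", backend="tensorflow")
model_pt = MLModelCatalog(dataset, "ann", backend="pytorch")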

Implemented Counterfactual Methods

Which ML framework a counterfactual method currently works with depends on its underlying implementation. We plan to make all recourse methods available for all ML frameworks. The latest state can be found here:

| Recourse Method                                             | Paper  | Tensorflow | Pytorch | Sklearn | XGBoost |
|-------------------------------------------------------------|--------|------------|---------|---------|---------|
| Actionable Recourse (AR)                                    | Source | X          | X       |         |         |
| Causal Recourse                                             | Source | X          | X       |         |         |
| CCHVAE                                                      | Source |            | X       |         |         |
| Contrastive Explanations Method (CEM)                       | Source | X          |         |         |         |
| Counterfactual Latent Uncertainty Explanations (CLUE)       | Source |            | X       |         |         |
| CRUDS                                                       | Source |            | X       |         |         |
| Diverse Counterfactual Explanations (DiCE)                  | Source | X          | X       |         |         |
| Feasible and Actionable Counterfactual Explanations (FACE)  | Source | X          | X       |         |         |
| FeatureTweak                                                | Source |            |         | X       | X       |
| FOCUS                                                       | Source |            |         | X       | X       |
| Growing Spheres (GS)                                        | Source | X          | X       |         |         |
| Revise                                                      | Source |            | X       |         |         |
| Wachter                                                     | Source |            | X       |         |         |

Installation

Requirements

  • python3.7
  • pip

Install via pip

pip install carla-recourse

Quickstart

from carla import OnlineCatalog, MLModelCatalog
from carla.recourse_methods import GrowingSpheres

# load a catalog dataset
data_name = "adult"
dataset = OnlineCatalog(data_name)

# load artificial neural network from catalog
model = MLModelCatalog(dataset, "ann")

# get factuals from the data to generate counterfactual examples
factuals = dataset.raw.iloc[:10]

# load a recourse model and pass black box model
gs = GrowingSpheres(model)

# generate counterfactual examples
counterfactuals = gs.get_counterfactuals(factuals)
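
Generated counterfactuals can then be scored against common metrics (see the Benchmarking notebook above). A minimal sketch, continuing the quickstart and assuming a top-level Benchmark class with a run_benchmark method, as used in that notebook:

from carla import Benchmark

# A minimal sketch: Benchmark and run_benchmark() are assumptions based on
# the Benchmarking notebook; model, gs and factuals come from the quickstart.
benchmark = Benchmark(model, gs, factuals)
results = benchmark.run_benchmark()  # one row of metrics per factual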

Contributing

Requirements

  • python3.7-venv (when not already shipped with python3.7)
  • Recommended: GNU Make

Installation

Using make:

make requirements

Using python directly or within activated virtual environment:

pip install -U pip setuptools wheel
pip install -e .

Testing

Using make:

make test

Using python directly or within activated virtual environment:

pip install -r requirements-dev.txt
python -m pytest test/*

Linting and Styling

We use pre-commit hooks within our build pipelines to enforce:

  • Python linting with flake8.
  • Python styling with black.

Install pre-commit with:

make install-dev

Using python directly or within activated virtual environment:

pip install -r requirements-dev.txt
pre-commit install

License

carla is licensed under the MIT License. See the LICENSE file for more details.

Citation

This project was accepted to NeurIPS 2021 (Datasets and Benchmarks Track). If you use this codebase, please cite:

@misc{pawelczyk2021carla,
      title={CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms},
      author={Martin Pawelczyk and Sascha Bielawski and Johannes van den Heuvel and Tobias Richter and Gjergji Kasneci},
      year={2021},
      eprint={2108.00783},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Please also cite the original authors' work.

carla's People

Contributors

ah-ansari, aredelmeier, indyfree, johanvandenheuvel, philoso-fish, voulgaris-sot


carla's Issues

yNN computation

Hi!
Just a quick clarification question about the code used to calculate yNN. I noticed that the labels of the closest neighbours (neighbour_label) in nearest_neighbours.py are compared to something called 'cf_label', which is defined as

cf_label = row[mlmodel.data.target]

If I'm not mistaken, cf_label is then the TRUE label of the test observation (not the predicted label).

This seems to differ from what is written under 4.2 "yNN" in the arXiv paper. Do you think the predicted label, rather than the TRUE label, should be compared instead?

Thanks for your help.
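
For reference, a minimal sketch of the predicted-label variant the question proposes (a simplified matching fraction, not the paper's exact yNN formula and not CARLA's implementation):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def ynn_predicted(counterfactuals, X, predict_fn, k=5):
    # Fraction of the k nearest neighbours whose *predicted* label agrees
    # with the counterfactual's predicted label; predict_fn is a
    # hypothetical callable returning class labels.
    nbrs = NearestNeighbors(n_neighbors=k).fit(X)
    neighbour_preds = np.asarray(predict_fn(X))
    cf_preds = np.asarray(predict_fn(counterfactuals))
    _, idx = nbrs.kneighbors(counterfactuals)
    return float(np.mean(neighbour_preds[idx] == cf_preds[:, None]))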

Loading ML model from catalog if directory already exists

For example, if someone first loads the tensorflow ann model into the cache at '../Users/xxx/carla/models/ann/adult/ann.h5', it is not possible to load the pytorch model as ann.pt into the same directory.

Even though the directory already exists, the if-condition
if not os.path.exists(cache_path):
does not handle this case, since the new file ann.pt is not yet contained in it. Creating the path then leads to an error.
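
A minimal sketch of a fix, assuming cache_path points at the backend-specific model file rather than at the directory (download_model is a hypothetical callable):

import os

def ensure_cached(cache_path, download_model):
    # Create the directory idempotently, then check for the *file* itself,
    # so ann.h5 and ann.pt can share the same cache directory.
    os.makedirs(os.path.dirname(cache_path), exist_ok=True)
    if not os.path.isfile(cache_path):
        download_model(cache_path)  # hypothetical fetch/train step
    return cache_path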

Refactor D2 and D3 distances

We want the functions to be immutable & deterministic, and not force the caller to use dataframes.

  • Calculate the range before calling the d2 and d3 distances and pass it as a parameter
  • Don't remove target labels within the distance function; these functions should just calculate distances between any two lists/arrays (a minimal sketch follows below)
  • Write unit tests with empty and 1-element inputs
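
A minimal sketch of such a pure distance function; the range-normalized L1 form is an assumption and may differ from CARLA's actual D2 definition:

import numpy as np

def d2_distance(a, b, feature_range):
    # Pure and deterministic: accepts any two lists/arrays plus the
    # precomputed per-feature range; no dataframes, no target handling.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    rng = np.asarray(feature_range, dtype=float)
    return float(np.sum(np.abs(a - b) / rng))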

Put CARLA on PyPI

  • Remove all print statements and replace them with logger calls
  • Update setup.py with meta information
  • Create setup.cfg that links the README (sketch below)
  • Add __version__ to the package
  • Publish the package to GitHub
  • Upload to PyPI
  • Put the last 2 steps into the CI/CD pipeline
  • Optional: Automatically increase the version

Also check out this guide: https://realpython.com/pypi-publish-python-package/
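
For the setup.cfg item above, a minimal sketch using standard setuptools keys; the attr: line assumes __version__ lives in carla/__init__.py:

[metadata]
name = carla-recourse
version = attr: carla.__version__
long_description = file: README.md
long_description_content_type = text/markdown
license = MIT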

Improve Face method

Improve the graph search method of Face, such that it can deal with continuous features.

However, the authors do not mention how to go about this either.

Questions about parameters used in CARLA paper

Thank you for all the work you put into the CARLA package. It's a great help when trying to compare different counterfactual methods! My questions pertain to the parameters/specifications used in the CARLA paper.

  1. What was the train/test split for the ann model? Do you have a way to access the train/test data in the package so that I can fit my own model with the same data?

  2. In the paper, the adult data set seems to fix the features age, sex, and race. However, in the dataset class, DataCatalog("adult").immutables gives only "age" and "sex" as immutables. Can you comment on whether there was a change since the simulations in the paper were run, or whether the default in the package is wrong?

  3. I am getting an error when running the cem/cem-vae models. When running:

mlmodel = MLModelCatalog(dataset, "ann", backend)
CEM(factuals, mlmodel, hyperparams)

where hyperparams comes from the experimental_setup.yaml file, I get the following error:

ValueError: For hidden_layer is no default value defined, please pass this key and its value in hyperparams

Changing the hyperparameters "ae_params" dictionary to:
hyperparams["ae_params"] = {'hidden_layer': [20, 10, 7], 'train_ae': True, 'epochs': 5}

does the trick. Can you let me know if this is the same hidden_layer dimension used in the paper?

  4. The give_me_some_credit data set models the response "SeriousDlqin2yrs", which is a negative response. However, the predict_negative_instances() function returns factuals as those with a predicted probability of less than 0.5 (specifically 1675 rows). I would argue that these are the positive instances, and that the negative instances are those with a probability higher than 0.5. Do you have any comment about this?

  5. Finally, are the results in Table 2 of the paper based on all factuals with a predicted probability less than 0.5 (i.e., 1675 rows for give_me_some_credit and 39954 rows for adult)?

Thanks for all of your help!
Annabelle

Add save path for experiment output

Until now, every call of run_experiment saves its results in the cache. It would be good if we could define an argparse argument for individual save paths.
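
A minimal sketch of such an argument (the flag name and wiring are hypothetical):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--save-path",
    default=None,
    help="Directory for experiment output; falls back to the cache.",
)
args = parser.parse_args()
# run_experiment(save_path=args.save_path)  # hypothetical wiring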

Use type hinting

  • Use type hints for method parameters
  • Add automatic type checks to the workflow

Loading Pytorch model

It is currently not possible to load a pytorch model which was saved in the benchmarking repository.

To load such a model it is necessary to have access to the original class, as discussed here and here.

Loading it without access to the class causes the error ModuleNotFoundError: No module named 'ML_Model'.
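
One common workaround is to save only the weights instead of the pickled module. A minimal sketch (the layer sizes are hypothetical; the architecture must be rebuilt in code before loading):

import torch
import torch.nn as nn

# Saving the state_dict avoids pickling a reference to the original
# 'ML_Model' module; only the weights are stored.
model = nn.Sequential(nn.Linear(20, 18), nn.ReLU(), nn.Linear(18, 2))
torch.save(model.state_dict(), "ann.pt")

# Loading: rebuild the same architecture, then restore the weights.
model2 = nn.Sequential(nn.Linear(20, 18), nn.ReLU(), nn.Linear(18, 2))
model2.load_state_dict(torch.load("ann.pt"))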

MLModelCatalog predict method incompatible with pipeline

The MLModelCatalog predict method has the following signature:

def predict(
    self, x: Union[np.ndarray, pd.DataFrame, torch.Tensor, tf.Tensor]
) -> Union[np.ndarray, pd.DataFrame, torch.Tensor, tf.Tensor]:

However, if the MLModelCatalog pipeline is enabled, then x is also the input for

def perform_pipeline(self, df: pd.DataFrame) -> pd.DataFrame:

That is, the predict function can accept input types that are incompatible with the possible model settings.
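
A minimal sketch of one possible guard: coerce array-like input to a DataFrame before the pipeline runs (using feature_input_order as column names is an assumption):

import numpy as np
import pandas as pd

def ensure_dataframe(x, feature_input_order):
    # perform_pipeline expects a pd.DataFrame, so wrap ndarray/tensor
    # inputs; tensors are converted via NumPy's array protocol.
    if isinstance(x, pd.DataFrame):
        return x
    return pd.DataFrame(np.asarray(x), columns=feature_input_order)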

Write AE model structure and training

Keep the model structure and training of the AE inside CARLA, and train a model if one is needed inside a recourse method. After training, the model is saved in our cache and can be loaded from there if the recourse method is called again.

The training doesn't take long, so we don't need to keep trained models in a repository.
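
A minimal sketch of the cache-or-train pattern described above (helper names are hypothetical):

import os
import torch

def load_or_train_ae(cache_path, train_fn):
    # Reuse a cached autoencoder if present; otherwise train and cache it.
    if os.path.isfile(cache_path):
        return torch.load(cache_path)
    model = train_fn()  # hypothetical training routine
    torch.save(model, cache_path)
    return model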

Add documentation

Construct a documentation page from our docstrings via Sphinx and integrate it via GitHub Pages.

Would NAMs be another interesting model to add?

We recently published Neural Additive Models (NAMs) at NeurIPS 2021, which combine the interpretability of generalized additive models with neural nets (see the architecture shown below). We also evaluated NAMs on some of the datasets here, including COMPAS and adult. Do you think they'd make a good addition to this repo too?

The source code for NAMs is open-sourced in tensorflow (official version) and pytorch!
For a quick summary of NAMs, look at this thread.

(Figure: NAM architecture)

Automate docs building process

With #81 we have documentation hosted on Read the Docs; now we want an automated build that pushes to it. According to @Philoso-Fish, this requires a GitHub webhook that can be retrieved via the admin interface.

Restructure data api/ catalog

Given the interplay between correct normalization, encoding, and feature order, which is specific to a certain, arbitrary black-box model, we need setter methods for the class properties encoded, normalized, and encoded_normalized.

This method can use the black-box model and its pipeline as input to build the required dataframes.

Fix output for CEM

The current output of CEM will not work with the benchmarking process.

  • Refactor output of CEM
  • Integrate in benchmarking process

Autoencoder training for CEM

The current implementation of CEM does not allow training an autoencoder when CEM is called.
This leads to errors when no pretrained autoencoder is available.

  • Add training inside the constructor
  • Add a training flag to the hyperparameters
  • Add tests

Inaccurate Cast in constraint_violation Check

Hi,

When I tried to use the benchmark to test constraint violations, I found an inaccurate cast in violations.py Line 34. When casting a float to an int using astype, we can get inaccurate results, which breaks the constraint_violation check. Instead, we should round before typecasting.

For example:

df_decoded_cfs[model.data.continous] = (
    df_decoded_cfs[model.data.continous].round().astype("int64")
)

Best practice for "Imputers"

What about imputers?

Do you have any recommendations or best practices for addressing them? (A minimal sketch follows below.)

  • When they need to be applied to the training data as well, before producing predictions.
  • When they need to be distinct for categorical and continuous values.
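
A minimal sketch with scikit-learn, as referenced above (data and column names are hypothetical):

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer

# Hypothetical toy data with missing values in both column types.
X_train = pd.DataFrame({
    "age": [25, None, 40],
    "workclass": ["Private", None, "State-gov"],
})

# Distinct imputers per column type, fitted on the training data only.
imputer = ColumnTransformer([
    ("cont", SimpleImputer(strategy="median"), ["age"]),
    ("cat", SimpleImputer(strategy="most_frequent"), ["workclass"]),
])
X_train_imp = imputer.fit_transform(X_train)  # fit + transform on train
# At prediction time, reuse the fitted transformer: imputer.transform(X_new)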
