fat-forensics / fat-forensics

Modular Python Toolbox for Fairness, Accountability and Transparency Forensics

Home Page: https://fat-forensics.org

License: BSD 3-Clause "New" or "Revised" License

Languages: Python 99.53% | Makefile 0.33% | Shell 0.15%
Topics: machine-learning, fairness, accountability, transparency, interpretability, explainability, explainable-ai, interpretable-ai

fat-forensics's Introduction

Software: Licence | GitHub Release | PyPI | Python 3.5+
Docs: Homepage
CI: GitHub Tests | GitHub Docs | Codecov
Try it: Binder
Contact: Mailing List | Gitter
Cite: BibTeX | JOSS | Zenodo

FAT Forensics: Algorithmic Fairness, Accountability and Transparency Toolbox

FAT Forensics (fatf) is a Python toolbox for evaluating fairness, accountability and transparency of predictive systems. It is built on top of SciPy and NumPy, and is distributed under the 3-Clause BSD license (new BSD).

FAT Forensics implements state-of-the-art fairness, accountability and transparency (FAT) algorithms for the three main components of any data modelling pipeline: data (raw data and features), predictive models and model predictions. We envisage two main use cases for the package, each supported by a distinct set of features: an interactive research mode aimed at researchers who may want to use it for exploratory analysis, and a deployment mode aimed at practitioners who may want to use it for monitoring the FAT aspects of a predictive system.

Please visit the project's web site https://fat-forensics.org for more details.
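For a flavour of the interactive research mode, the sketch below trains a scikit-learn classifier and explains one of its predictions with a tree-based surrogate explainer. It is an illustration only, not the canonical quick-start: the data-dictionary key 'target' and the explained_class argument are assumptions based on the issue reports further down this page, so please consult the API documentation for the exact signatures. It also requires the optional scikit-learn dependency.

import fatf.utils.data.datasets as fatf_datasets
import fatf.transparency.predictions.surrogate_explainers as fatf_surrogates
from sklearn.ensemble import RandomForestClassifier

# Load the bundled iris data and train a plain scikit-learn classifier.
iris = fatf_datasets.load_iris()
clf = RandomForestClassifier(random_state=42)
clf.fit(iris['data'], iris['target'])

# Build a tree-based surrogate explainer and explain a single data point.
explainer = fatf_surrogates.TabularBlimeyTree(iris['data'], clf)
explanation = explainer.explain_instance(iris['data'][0, :], explained_class=1)
print(explanation)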

Installation

Dependencies

FAT Forensics requires Python 3.5 or higher and the following dependencies:

Package Version
NumPy >=1.10.0
SciPy >=0.13.3

In addition, some of the modules require optional dependencies:

Package        Version     Required by (fatf module)
scikit-learn   >=0.19.2    fatf.transparency.predictions.surrogate_explainers
                           fatf.transparency.predictions.surrogate_image_explainers
                           fatf.transparency.sklearn
                           fatf.utils.data.feature_selection.sklearn
scikit-image   >=0.17.0    fatf.transparency.predictions.surrogate_image_explainers
                           fatf.utils.data.occlusion
                           fatf.utils.data.segmentation
Pillow         >=8.4.0     fatf.transparency.predictions.surrogate_image_explainers
                           fatf.utils.data.occlusion
                           fatf.utils.data.segmentation
matplotlib     >=3.0.0     fatf.vis

User Installation

The easiest way to install FAT Forensics is via pip:

pip install fat-forensics

which will install only the required dependencies. If you want to install the package together with all the auxiliary dependencies, please consider using the [all] option:

pip install fat-forensics[all]

The documentation provides more detailed installation instructions.

Changelog

See the changelog for a development history and project milestones.

Development

We welcome new contributors of all experience levels. The Development Guide has detailed information about contributing code, documentation, tests and more. Some basic development instructions are included below.

Source Code

You can check out the latest FAT Forensics source code via git with the command:

git clone https://github.com/fat-forensics/fat-forensics.git
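To work on the cloned sources it is common to install the package in editable mode; this is a standard pip workflow rather than a project-specific instruction:

cd fat-forensics
pip install -e .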

Contributing

To learn more about contributing to FAT Forensics, please see our Contributing Guide.

Testing

You can launch the test suite from the root directory of this repository with:

make test-with-code-coverage

To run the tests you will need to have version 3.9.1 of pytest installed. This package, together with other development dependencies, can also be installed with:

pip install -r requirements-dev.txt

or with:

pip install fat-forensics[dev]

See the Testing section of the Development Guide page for more information.

Please note that the make test-with-code-coverage command tests the version of the package in the local fatf directory rather than the installed one, since the pytest command is preceded by PYTHONPATH=./. If you want to test the installed version, use the command from the Makefile without the PYTHONPATH variable.

To control the randomness during the tests the Makefile sets the random seed to 42 by preceding each test command with FATF_SEED=42, which sets the environment variable responsible for that. More information about the setup of the Testing Environment is available on the development web page in the documentation.
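For reference, a roughly equivalent invocation to what the Makefile runs would look like the following (the exact pytest flags used by the Makefile may differ):

PYTHONPATH=./ FATF_SEED=42 pytest --cov=fatf fatf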

Submitting a Pull Request

Before opening a Pull Request, please have a look at the Contributing page to make sure that your code complies with our guidelines.

Help and Support

For help please have a look at our documentation web page, especially the Getting Started page.

Communication

You can reach out to us at:

  • our gitter channel for code-related development discussion; and
  • our mailing list for discussion about the project's future and the direction of the development.

More information about these communication channels can be found in our documentation, on the main page and on the develop page.

Citation

If you use FAT Forensics in a scientific publication, we would appreciate citations! Information on how to cite us is available on the citation web page in our documentation.

Acknowledgements

This project is the result of a collaborative research agreement between Thales and the University of Bristol with the initial funding provided by Thales.

fat-forensics's People

Contributors

alexhepburn, mattclifford1, rafaelpo, so-cool


fat-forensics's Issues

Error when trying to explain a class with index 0

Please check that you have already viewed existing issues and that the bug you are reporting is new.

Python version

  • 2.x
  • 3.6.x
  • 3.7.x

Package versions

  • fat-forensics: 0.1.0
  • numpy: 1.19.2
  • scipy: 1.5.2
  • scikit-learn: 0.23.2

Description

When using a surrogate explainer, the explain_instance function fails if you try to explain the class with index 0, or a class name that resolves to class index 0. The assertion below is the culprit: because 0 is falsy in Python, the bare truthiness check treats a valid class index of 0 as if no class had been selected.

assert explained_class_index, 'Explain a single class.'

Using an explained_class parameter that leads to a class index of 0 will reproduce this error.
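A minimal illustration of the failure mode and of a safer check (the is-not-None variant is a suggested fix, not the current fatf code):

explained_class_index = 0

# The current check treats a valid index of 0 as "no class selected",
# because 0 is falsy in Python:
#   assert explained_class_index, 'Explain a single class.'  # raises AssertionError

# A check that only rejects a missing index passes for index 0:
assert explained_class_index is not None, 'Explain a single class.'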

Steps to reproduce the bug

  1. Initialise any explainer
  2. Try to explain an instance, setting explained_class to 0 or to the class name at index 0 of the explainer's class_names attribute.

Logs

AssertionError: Explain a single class.

Source Code

explainer.explain_instance(explanation_point, explained_class=0)

Explaining a model whose fit function takes only one parameter fails.

Please check that you have already viewed existing issues and that the bug you are reporting is new.

Python version

  • 2.x
  • 3.6.x
  • 3.7.x

Package versions

  • fat-forensics: 0.1.0
  • numpy: 1.19.2
  • scipy: 1.5.2
  • scikit-learn: 0.23.2

Description

The surrogate explainer classes check whether the model passed in is valid, and part of this check is whether the fit method takes two parameters. When explaining a clustering algorithm with as_probabilistic set to False, the predict_proba method is not needed; however, the model validation still fails because the fit method of a clustering algorithm usually takes only one parameter.

The relevant code can be found in fatf.utils.models.validation.check_model_functionality.

Steps to reproduce the bug

  1. Load in data
  2. Train a clustering algorithm (like KMeans) using the data
  3. Try to use a surrogate explainer to explain the clusters

Logs

IncompatibleModelError: With as_predictive set to False the predictive model needs to be capable of outputting (class) predictions via a *predict* method, which takes exactly one required parameter -- data to be predicted -- and outputs a 1-dimensional array with (class) predictions.

Source Code

import fatf.utils.data.datasets as fatf_datasets
import fatf.transparency.predictions.surrogate_explainers \
    as fatf_surrogate_explainers
from sklearn.cluster import KMeans

iris_data_dict = fatf_datasets.load_iris()
kmeans = KMeans(n_clusters=3)
kmeans.fit(iris_data_dict['data']) # This fit function only takes in 1 parameter `x`
explainer = fatf_surrogate_explainers.TabularBlimeyTree(
    iris_data_dict['data'],
    kmeans,
    as_probabilistic=False)
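One possible workaround, sketched below as an illustration rather than an officially supported fatf pattern, is to wrap the clustering model in a thin adapter whose fit method exposes the two-parameter signature the validation expects, while predict simply returns the cluster assignments:

class ClusteringWrapper(object):
    """Adapts an unsupervised model to the (data, targets) fit signature."""

    def __init__(self, model):
        self._model = model

    def fit(self, X, y):
        # The targets are ignored; clustering is unsupervised.
        self._model.fit(X)

    def predict(self, X):
        return self._model.predict(X)

explainer = fatf_surrogate_explainers.TabularBlimeyTree(
    iris_data_dict['data'],
    ClusteringWrapper(kmeans),
    as_probabilistic=False)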

fatf.utils.data.occlusion.Occlusion.occlude_segments_vectorised removes a dimension for input arrays of shape (1, n)

Please check that you have already viewed existing issues and that the bug you are reporting is new.

Python version

  • 2.x
  • 3.6.x
  • 3.7.11

Package versions

  • fat-forensics: 0.1.1
  • numpy: 1.17.3
  • scipy: 1.4.1

Description

fatf.utils.data.occlusion.Occlusion.occlude_segments_vectorised does not differentiate between an input array of shape (1, n) and (n, ). This is caused by line 648.

Source Code

import fatf.utils.data.occlusion as fatf_occlusion
occluder = fatf_occlusion.Occlusion()
occluder.occlude_segments_vectorised([[1, 1, 1]])
# The output should be of shape (1, ...)
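For context, a generic NumPy illustration of the shape distinction being lost (this is not the fatf fix itself):

import numpy as np

vector = np.array([1, 1, 1])    # shape (3,): a single segment vector
batch = np.array([[1, 1, 1]])   # shape (1, 3): a batch containing one vector

# Squeezing the batch silently drops the leading dimension, which is the
# behaviour described above for a (1, n) input.
print(np.squeeze(batch).shape)                  # (3,)
print(np.atleast_2d(np.squeeze(batch)).shape)   # (1, 3)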

Relatively slow runtime to find counterfactual explanations

Hi!

I am trying to use FatF's CounterfactualExplainer on the Boston housing data set (among others) for a comparison with other methods.

I set up FatF as you can see here:
https://github.com/marcovirgolin/robust-counterfactuals/blob/dev/robust_cfe/wrappers.py#L201

If you wish to reproduce exactly what I am doing, you can install my repo (branch dev) and use this file:
https://github.com/marcovirgolin/robust-counterfactuals/blob/dev/quicktest.sh
(replacing dice-genetic with fatf)

Some further details: I am passing as ML model a random forest which includes a one-hot encoding pre-processing step whenever .predict is called.

It seems to me it can take a relatively long time (I tried 30 min) to generate counterfactuals. I read that FatF uses grid search.
I guess this means that the search space is sliced into a large number of Cartesian possibilities, and each and every one of them is tried. I guess I could only achieve fast runtime if I use a very coarse grid.
Am I right about the grid search (and thus why it could be slow), or am I missing something? Is there a max. runtime setting I am missing?

Thank you for your help
