
License: MIT License



Model Fingerprint

Introduction

A model-agnostic method to decompose predictions into linear, nonlinear and pairwise interaction effects. It can be helpful in feature selection and model interpretation.

The algorithm is based on the paper Beyond the Black Box: An Intuitive Approach to Investment Prediction with Machine Learning (Li, Turkington and Yazdani, 2020). The figures below show the results of the model fingerprint algorithm, which can break down the predictions of any machine learning model into comprehensible parts.

(Figures: decomposition of predictions into linear, nonlinear and pairwise interaction effects.)

There are three major benefits of the model fingerprint algorithm:

  • It is model-agnostic, so it can be applied on top of any machine learning model.
  • The resulting decomposition of effects is highly intuitive: we can easily see which features have greater influence than others, and in what form (linear, nonlinear, or interaction).
  • All three effects share a common unit, which is also the unit of the response variable being predicted. This makes comparisons across features even more intuitive.

Partial Dependence Function

The model fingerprint algorithm extends the partial dependence function.

Calculation of Partial Dependence Function

The partial dependence function assesses the marginal effect of a feature by following these steps:

  1. Vary the selected feature across its full range (or a representative grid of values).
  2. For each value, predict the outcome for every instance in the dataset, with the selected feature set to that value for all instances while the other features keep their original values.
  3. Average these predictions to obtain the partial dependence at that value of the chosen feature.

This partial dependence can be understood as the expected prediction of the model as a function of the feature of interest. The same process is repeated for every feature.
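The three steps above can be sketched in a few lines, assuming NumPy is available. `ToyModel` is a hypothetical stand-in for any fitted model that exposes a `predict` method:

```python
import numpy as np

class ToyModel:
    """Hypothetical stand-in for any fitted model with a predict() method."""
    def predict(self, X):
        return 2.0 * X[:, 0] + X[:, 1] ** 2

def partial_dependence(model, X, k, grid):
    """Steps 1-3: set feature k to each grid value for every instance,
    predict, and average the predictions."""
    pd_values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, k] = v                 # same value of feature k for all rows
        pd_values.append(model.predict(X_mod).mean())
    return np.array(pd_values)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
grid = np.linspace(-2, 2, 5)
pd0 = partial_dependence(ToyModel(), X, 0, grid)
```

For this toy model the partial dependence of feature 0 is linear in the grid, shifted by the average contribution of the other feature.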

Two Remarks on Partial Dependence Function

  1. The partial dependence function will vary little if the selected feature has little influence on the prediction.
  2. If the influence of a feature on the prediction is purely linear, its partial dependence plot will be a straight line. For example, for an ordinary linear regression model, the partial dependence plot of each feature is a straight line whose slope equals that feature's coefficient.
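The second remark can be checked numerically. For a purely linear model (a hypothetical y = 2·x₀ + 3·x₁ + 1 here), the partial dependence of x₀ is a straight line whose slope recovers the coefficient. A minimal sketch, assuming NumPy:

```python
import numpy as np

class LinearModel:
    """Hypothetical fitted linear model: y = 2*x0 + 3*x1 + 1."""
    coef_ = np.array([2.0, 3.0])
    intercept_ = 1.0
    def predict(self, X):
        return X @ self.coef_ + self.intercept_

def partial_dependence(model, X, k, grid):
    out = []
    for v in grid:
        Xm = X.copy()
        Xm[:, k] = v
        out.append(model.predict(Xm).mean())
    return np.array(out)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
grid = np.linspace(-3, 3, 7)
pd0 = partial_dependence(LinearModel(), X, 0, grid)
slope = np.polyfit(grid, pd0, 1)[0]   # slope of the (straight) PD line
```

The fitted slope equals the model's coefficient for x₀ exactly, up to floating-point error.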

Decomposition of Predictions

The model fingerprint algorithm decomposes the partial dependence function of each feature into a linear part and a nonlinear part by fitting a linear regression model to the partial dependence function. We denote the fitted regression line as $l_k$ and the partial dependence function as $f_k$ for feature $k$.

Linear Effect

The linear prediction effect of a feature is defined as the mean absolute deviation of the linear predictions around their average value:

$$\text{Linear Effect of } x_k = \frac{1}{N} \sum_{i=1}^{N}abs(l_{k}(x_{k, i}) - \frac{1}{N}\sum_{j=1}^{N}l_k (x_{k, j}))$$
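A minimal sketch of this computation, assuming NumPy. Here `x_k` is a grid of feature values, `f_k` is the partial dependence evaluated on that grid, and $l_k$ is obtained by an ordinary least-squares fit (an assumption of this sketch):

```python
import numpy as np

def linear_effect(x_k, f_k):
    """Fit l_k to the partial dependence by least squares, then take the
    mean absolute deviation of the linear predictions around their mean."""
    slope, intercept = np.polyfit(x_k, f_k, 1)   # l_k = slope*x + intercept
    l_k = slope * np.asarray(x_k) + intercept
    return np.mean(np.abs(l_k - l_k.mean()))

# For a perfectly linear partial dependence f_k = 3*x + 1:
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
effect = linear_effect(x, 3.0 * x + 1.0)
```

For this example the deviations |l_k − mean| are [6, 3, 0, 3, 6], so the linear effect is 3.6.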

Nonlinear Effect

The nonlinear prediction effect is defined by the mean absolute deviation of the total marginal effect around its corresponding linear effect:

$$\text{Nonlinear Effect of } x_k = \frac{1}{N} \sum_{i=1}^{N}abs(f_k(x_{k, i}) - l_k (x_{k, i}))$$
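The nonlinear effect is the residual of the same least-squares fit, in mean absolute terms. A sketch under the same assumptions as above (NumPy, OLS fit of $l_k$):

```python
import numpy as np

def nonlinear_effect(x_k, f_k):
    """Mean absolute deviation of the partial dependence around the
    fitted regression line l_k."""
    slope, intercept = np.polyfit(x_k, f_k, 1)
    l_k = slope * np.asarray(x_k) + intercept
    return np.mean(np.abs(np.asarray(f_k) - l_k))

x = np.linspace(-2, 2, 101)
lin = nonlinear_effect(x, 3.0 * x + 1.0)   # purely linear PD -> ~0
non = nonlinear_effect(x, x ** 2)          # curved PD -> positive
```

A purely linear partial dependence yields a nonlinear effect of zero; curvature that the fitted line cannot capture shows up as a positive nonlinear effect.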

Here is a plot demonstrating the partial dependence function, fitted regression line, linear component, and nonlinear component.

(Figure from Li, Y., Turkington, D. and Yazdani, A., 2020.)

Pairwise Interaction Effect

The calculation of the pairwise interaction prediction effect is similar, but this time a joint partial dependence function of both features, denoted $f_{k, l}$, is calculated. Following a similar line of thought to the H-statistic, the pairwise interaction effect is defined via the de-meaned joint partial prediction minus the two individual de-meaned partial predictions:

$$\text{Pairwise Interaction Effect of } (x_k, x_l) = \frac{1}{N^2} \sum^{N}_ {i=1} \sum^{N}_ {j=1} abs(f_{k, l} (x_{k, i}, x_{l, j}) - f_k (x_{k, i}) - f_l (x_{l, j}))$$
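The definition above can be sketched as follows, assuming NumPy; the models, grids, and de-meaning of each partial dependence are assumptions of this sketch. An additive model has zero interaction effect, while a multiplicative one does not:

```python
import numpy as np

def pairwise_interaction_effect(model, X, k, l, grid_k, grid_l):
    """De-meaned joint PD minus the two de-meaned individual PDs,
    averaged in absolute value over the joint grid."""
    def pd_one(j, grid):
        vals = []
        for v in grid:
            Xm = X.copy()
            Xm[:, j] = v
            vals.append(model.predict(Xm).mean())
        vals = np.array(vals)
        return vals - vals.mean()            # de-mean the individual PD

    f_k, f_l = pd_one(k, grid_k), pd_one(l, grid_l)
    f_kl = np.empty((len(grid_k), len(grid_l)))
    for i, vk in enumerate(grid_k):
        for j, vl in enumerate(grid_l):
            Xm = X.copy()
            Xm[:, k] = vk
            Xm[:, l] = vl
            f_kl[i, j] = model.predict(Xm).mean()
    f_kl -= f_kl.mean()                      # de-mean the joint PD
    return np.mean(np.abs(f_kl - f_k[:, None] - f_l[None, :]))

class Additive:
    def predict(self, X):
        return X[:, 0] + X[:, 1]

class Multiplicative:
    def predict(self, X):
        return X[:, 0] * X[:, 1]

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 2))
g = np.linspace(-1, 1, 5)
no_interaction = pairwise_interaction_effect(Additive(), X, 0, 1, g, g)
interaction = pairwise_interaction_effect(Multiplicative(), X, 0, 1, g, g)
```

Note the double loop over the joint grid: this is why too many feature pairs make the computation slow, as the drawbacks below point out.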

Drawbacks

  • Pairwise feature combinations must be assigned manually, and too many combinations can make computation slow.
  • Higher-order interactions between features are not captured.
  • The sign (direction) of a feature's influence is not shown.

Future TODO

  • compatibility with classifiers
