A lot of outcomes are binary. Is it possible to have SHAP values in the log-odds in for


Can we have shap value in the log-odds for binary target? about shapviz HOT 11 CLOSED

ferrenlove commented on May 18, 2024 1

Can we have shap value in the log-odds for binary target?

from shapviz.

Comments (11)

actuarial-lonewolf commented on May 18, 2024 3

I stumbled into this thread, searching for a way to display a shapviz waterfall with probabilities.
I understand the concern about the misleading element.

@madprogramer ... it would be nice if you shared your solution.

After spending a good hour on my end, I'm sharing the completed function that was hinted at last year, as it might help others in the future.

# Gross approximation to transform SHAP values from the logit scale to probabilities
library(shapviz)

special_transform <- function(shp) {
  b <- get_baseline(shp)
  S <- get_shap_values(shp)
  X <- get_feature_values(shp)

  # Predicted probability per row; plogis() is the inverse logit and
  # avoids the overflow of exp(x) / (1 + exp(x)) for large x
  p <- plogis(b + rowSums(S))

  # Transform the baseline, then rescale each row of SHAP values so that
  # it sums to p - b_new (undefined for rows where rowSums(S) is 0)
  b_new <- plogis(b)
  S_new <- S / rowSums(S) * (p - b_new)

  shapviz(S_new, X, baseline = b_new)
}
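As a quick sanity check of the rescaling idea in the function above, here is a minimal base-R sketch that applies the same arithmetic to a toy matrix. The matrix `S` and baseline `b` are made-up numbers, not output from any model; the sketch only shows that the rescaled rows add up to the predicted probability minus the transformed baseline.

```r
# Toy logit-scale SHAP matrix (made-up numbers, not from a fitted model)
S <- matrix(
  c( 0.8, -0.3, 0.5,
    -1.2,  0.4, 0.1),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, c("x1", "x2", "x3"))
)
b <- -0.5  # hypothetical baseline on the logit scale

p <- plogis(b + rowSums(S))  # predictions as probabilities
b_new <- plogis(b)           # baseline as a probability

# Rescale each row so that it sums to p - b_new
S_new <- S / rowSums(S) * (p - b_new)

# Additivity holds on the probability scale by construction
all.equal(b_new + rowSums(S_new), p)
```

Additivity is the only property this construction guarantees; it says nothing about whether the individual rescaled values are fair attributions.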


mayer79 commented on May 18, 2024

Actually, the SHAP values returned by XGBoost or LightGBM are on the logit link scale.

library(shapviz)
library(xgboost)

X <- iris[, -1]

# Binary logistic regression with XGBoost
fit <- xgb.train(
  params = list(objective = "binary:logistic"), 
  data = xgb.DMatrix(data.matrix(X), label = iris[, 1] >= 5.8), 
  nrounds = 30
)

# On logit scale
shp <- shapviz(fit, X_pred = data.matrix(X), X = X)
sv_waterfall(shp, row_id = 66)

If you rather mean to switch to probabilities, I don't think it is possible without violating at least some of the Shapley fairness axioms (linearity). Still, I think we could add a utility function that would map the "shapviz" object approximately from logit to probability space, using the approach in the blog post you provided.


ferrenlove commented on May 18, 2024

Thank you! If you could add a link function, that would be superb!
BTW, I used your script and it returned the picture below:

For E[f(x)] = 0.116 and f(x) = 2.7, I guess these are not probabilities and still need to be converted, along with the numbers highlighted in the yellow bars.


mayer79 commented on May 18, 2024

The values in the plot above are log-odds, i.e. they are on the logit scale. So I think you are interested in transforming them (somehow) via the logistic function (= inverse logit) to probabilities without badly violating SHAP properties. I will look into that soon!
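For the two totals quoted from the plot, E[f(x)] = 0.116 and f(x) = 2.7, the conversion to probabilities is straightforward, since these are plain logit-scale predictions rather than attributions. A minimal sketch using base R's `plogis()` (the variable names are only illustrative):

```r
# Values read off the waterfall plot quoted in the thread (logit scale)
baseline_logit <- 0.116  # E[f(x)]
pred_logit <- 2.7        # f(x)

# plogis() is base R's logistic function, i.e. the inverse logit
baseline_prob <- plogis(baseline_logit)
pred_prob <- plogis(pred_logit)

# Roughly 0.529 for the baseline and 0.937 for the prediction
round(c(baseline = baseline_prob, prediction = pred_prob), 3)
```

Only the baseline and the prediction convert this cleanly. The per-feature bars in between have no equally direct counterpart on the probability scale.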


mayer79 commented on May 18, 2024

I looked into the matter. According to the proposed transformation, a jump from 0.5 to 0.59 would be as large as a jump from 0.9 to 0.99 on the probability scale. I currently don't see a situation where this makes sense. Thus I won't add this transformation to "shapviz". If you still need it, simply do the transformation based on

special_transform <- function(shp) {
  b <- get_baseline(shp)
  S <- get_shap_values(shp)
  X <- get_feature_values(shp)
  b_new <- g(b, S)
  S_new <- f(b, S)
  shapviz(S_new, X, b_new)
}


ferrenlove commented on May 18, 2024

Thank you for writing this special transform function for me. However, it gives me an error: 'Error in g(b, S) : could not find function "g"'. Can you help with this?


mayer79 commented on May 18, 2024

I leave this as an exercise to you. Based on your link, it is just 2-3 lines of code. But as I said, I can't think of a situation where the proposed mapping makes much sense. On the contrary, I think it is misleading.


ferrenlove commented on May 18, 2024

Thank you! I ran the Python code to get an idea of what the SHAP output array and baseline prediction look like. I tried a few different versions and the results turned out differently. I will spend more time looking at it. Appreciate your help!


madprogramer commented on May 18, 2024

@mayer79 I know the issue is closed, but I made a logit to binary linking function some time ago.

I still understand that you don't want to add it in though :<


mayer79 commented on May 18, 2024

As far as I remember, any non-linear transform will violate the fairness axioms of Shapley, so I'd like to keep it outside the package.

Since the "shapviz" object contains the baseline b and all SHAP values S, you could write a blogpost or similar for those interested?
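The non-linearity concern can be illustrated with base R alone. In this small sketch (the starting logits 0 and 3 and the contribution of 1 are arbitrary illustrative values), the same logit-scale contribution moves the probability by very different amounts depending on where it starts, so no single probability-scale number can stand in for a logit-scale SHAP value.

```r
# Same logit-scale contribution, two different starting points
contribution <- 1

# Starting from a logit of 0 (probability 0.5)
jump_mid <- plogis(0 + contribution) - plogis(0)

# Starting from a logit of 3 (probability of about 0.95)
jump_high <- plogis(3 + contribution) - plogis(3)

# The first jump is several times larger than the second
c(jump_mid = jump_mid, jump_high = jump_high)
```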


mayer79 commented on May 18, 2024

@actuarial-lonewolf : Thanks for this input!

Comparing with Kernel SHAP (once on link scale and once on probability scale):

library(kernelshap)
library(shapviz)

fit <- glm(Treatment ~ Type + conc + uptake, data = CO2, family = binomial)
v <- c("Type", "conc", "uptake")
(link_scale <- shapviz(kernelshap(fit, X = CO2, bg_X = CO2, feature_names = v)))
(prob_scale <- shapviz(kernelshap(fit, X = CO2, bg_X = CO2, feature_names = v, type = "response")))

transformed <- special_transform(link_scale)

get_shap_values(prob_scale[1:5, ])
#           Type        conc      uptake
# [1,] 0.1945444 -0.12989845  0.34064436
# [2,] 0.1970985 -0.15394423 -0.08552033
# [3,] 0.1711899 -0.11740017 -0.23216460
# [4,] 0.1613382 -0.06415103 -0.30143362
# [5,] 0.1884975  0.02929410 -0.22343369

get_shap_values(transformed[1:5, ])
#           Type        conc     uptake
# [1,] 0.2197120 -0.19802628  0.3731001
# [2,] 0.3076373 -0.21203252 -0.1484753
# [3,] 0.2954062 -0.14487101 -0.3394145
# [4,] 0.2910848 -0.06558863 -0.4402471
# [5,] 0.3082047  0.05310590 -0.3774572
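To put a number on the gap between the two printed tables, the following base-R sketch re-enters the five rows shown above and compares them. The names `prob_vals` and `transformed_vals` are illustrative. The reading of the row-sum gap is an inference from the numbers, not something stated in the thread.

```r
feature_names <- c("Type", "conc", "uptake")

# Kernel SHAP on the probability scale (first five rows, as printed above)
prob_vals <- matrix(
  c(0.1945444, -0.12989845,  0.34064436,
    0.1970985, -0.15394423, -0.08552033,
    0.1711899, -0.11740017, -0.23216460,
    0.1613382, -0.06415103, -0.30143362,
    0.1884975,  0.02929410, -0.22343369),
  ncol = 3, byrow = TRUE, dimnames = list(NULL, feature_names)
)

# Rescaled logit-scale values from special_transform() (as printed above)
transformed_vals <- matrix(
  c(0.2197120, -0.19802628,  0.3731001,
    0.3076373, -0.21203252, -0.1484753,
    0.2954062, -0.14487101, -0.3394145,
    0.2910848, -0.06558863, -0.4402471,
    0.3082047,  0.05310590, -0.3774572),
  ncol = 3, byrow = TRUE, dimnames = list(NULL, feature_names)
)

# Largest per-feature disagreement between the two approaches
max_abs_diff <- max(abs(prob_vals - transformed_vals))
max_abs_diff

# Row sums differ by a nearly constant amount (about 0.0105)
row_sum_gap <- rowSums(prob_vals) - rowSums(transformed_vals)
round(row_sum_gap, 4)
```

The nearly constant row-sum gap would be consistent with the two objects using different baselines: the mean predicted probability in one case and the logistic transform of the mean logit in the other. The per-feature differences are much larger than that gap, so the rescaling also redistributes credit between features.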

