The interpretable-ml-book from christophm

5.3.2 Bike rentals (Classification) should be Bike rentals (Regression)

as title

Gitbook

Gitbook seems to now only have a legacy option. Did you use the editor to create this on your desktop? Can you recommend an easy way to get Gitbook going? I love your work and have linked to it from one of my websites.

Possible mistakes in lime-text-explanations example.

Hi, thanks for your efforts on the interpretation of ML methods. In Ch. 5, in the last example on lime-text-explanations, the generated image seems to have both instances labeled as spam, at least in the HTML version. Is this a minor mistake?

Significance of gbm interactions

In section 5.4.5 you say:
"It is unclear whether an interaction is significantly greater than 0. We would need to conduct a statistical test, but this test is not (yet) available in a model-agnostic version."
I know that Friedman and Popescu (2008) recommend a method to create a null distribution in order to determine the significance of interactions for a gbm model. Do you know if this method is implemented in R anywhere, even if just for gbm models?

Create better pictograms for explaining the Shapley value

They are hand drawn and I am not super happy with them.
Help to improve those is appreciated.

The location in the book: https://github.com/christophM/interpretable-ml-book/blob/master/chapters/05.6-agnostic-shapley.Rmd

List CART (decision tree) alternatives and software

I am extending all chapters with a section on software implementations and alternative algorithms (also with software implementations). The software can be any free and open source software: R, Python, Weka, ...

You can help out by posting links to papers and software implementations of decision tree algorithms as comments to this issue.

Figure text for figure 5.10 and 5.11 is the same.

I guess this text is for figure 5.10 and not figure 5.11?

FIGURE 5.11: The interaction strength for each feature with all other features for a random forest predicting the probability of cervical cancer. The number of diagnosed sexually transmitted diseases has the highest interaction effect with all other features, followed by the number of pregnancies.

Confusing difference for Figure 5.22

I guess it's because of rounding but at first I was a bit confused that the "difference" between the actual prediction (0.43) and the average prediction (0.03) is 0.41. I think it might confuse readers and could be explained in the text. If you agree I can add a sentence to explain it.
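
The rounding effect can be reproduced in a few lines. This is only a sketch: the unrounded values below are hypothetical, chosen so that they round to the numbers shown in the figure, and are not taken from the book.

```python
# Hypothetical unrounded values (assumptions, not taken from the book).
actual_prediction = 0.434
average_prediction = 0.026

# The difference is computed on the unrounded values and rounded afterwards.
difference = actual_prediction - average_prediction

print(f"actual:     {actual_prediction:.2f}")   # 0.43
print(f"average:    {average_prediction:.2f}")  # 0.03
print(f"difference: {difference:.2f}")          # 0.41, although 0.43 - 0.03 = 0.40
```

Each number is rounded separately for display, so the displayed difference need not equal the difference of the displayed numbers.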

Cross-references

Hi!

Thanks for your effort with this manuscript.

I've noticed that in many places you use "in this section" or "as can be seen here" as cross-references. These work fine if the manuscript is read as an interactive document, i.e., freely on your website. However, if read in a printed PDF, these cross-references are not particularly useful.

I'll work on a pull request to fix these issues, but I thought you should be aware of them.

Cheers,
Isak

List alternatives and software for interaction effect measures

I am extending all chapters with a section on software implementations and alternative algorithms (also with software implementations). The software can be any free and open source software: R, Python, Weka, ...

You can help out by posting links to alternatives and software implementations of interaction effect measure algorithms as comments to this issue.

Python version of scripts

Hi, Chris

Do you have a Python version of your scripts for this book?

Thank you

List alternatives and software for feature importance measures

I am extending all chapters with a section on software implementations and alternative algorithms (also with software implementations). The software can be any free and open source software: R, Python, Weka, ...

You can help out by posting links to alternatives and software implementations of feature importance measures as comments to this issue.

Make the "next page >" symbol more obvious on the Preface page

On my cell phone the next page symbol ">" is a light grey and very small. Even though I had my cell read "Desktop site" no menu or side bar was visible. I almost ignored your book since I could only see the preface.

Suggest making a regular link on the preface page to the next page.

List linear regression variants and software

I am extending all chapters with a section on software implementations and alternative algorithms (also with software implementations). The software can be any free and open source software: R, Python, Weka, ...

You can help out by posting links to papers and software implementations of the linear regression model as comments to this issue.

Reference List

As a reader I would like to have all references in one place to look them up.

I stumbled upon this when I read Chapter 2.1.
In the text it references Miller 2017, but this is not in the footnotes. Ultimately, I found it in the introduction of Chapter 2.
To prevent a search like mine, it would be nice if all references could be found in a common place.

Need more clarity on 5.1.5.

I think this paragraph needs more clarity when explaining how the effects on each feature (for this instance) affect the prediction.

Most of the confusion comes from expressions like "unusually little or much".

First and foremost, providing the instance used would give the reader a little more context.

More concepts that are not clear:

Is "Temperature (2 degrees)" the value from this instance, or does it mean a two-unit increase?
"contributes less towards the predicted value compared to the average" - what is the average? The average prediction? The mean of the distribution?
"“days_since_2011” unusually much, because this instance is from late 2011 (5 days)." - so the feature contributes a lot because the instance is from late 2011 (only 5 days?). Shouldn't it be 300 days or so?

Thanks a lot for the book, it is very well written, but I do feel that this example needs a little bit more of step-by-step explanation.
e.g.: Providing the instance, showing how it maps to the plot (calculating effects), and cross-referencing to the weight table, in order to have a full grasp of what is going on.
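
The requested step-by-step mapping could look roughly like the sketch below, which assumes the effect of a feature in a linear model is its weight times its feature value. All weights, feature values and means are hypothetical placeholders, not numbers from the book's weight table.

```python
# Hypothetical weights, instance and feature means (not from the book).
weights = {"temp": 110.0, "days_since_2011": 5.0}
instance = {"temp": 2.0, "days_since_2011": 5.0}
feature_means = {"temp": 15.0, "days_since_2011": 365.0}


def effects(weights, x):
    """Effect of each feature for one instance: weight * feature value."""
    return {name: weights[name] * x[name] for name in weights}


instance_effects = effects(weights, instance)
# For a linear model, the effect at the mean feature value equals the mean effect.
average_effects = effects(weights, feature_means)

for name in weights:
    diff = instance_effects[name] - average_effects[name]
    print(f"{name}: effect = {instance_effects[name]:.1f}, vs. average effect: {diff:+.1f}")
```

Showing the instance next to such a table would make clear whether a value like "2 degrees" is a feature value or a change in the feature.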

P.S.: It would also be really nice to have the data with the extracted features in order to reproduce the work.

Feature importance disadvantages

I think some disadvantages could be added to the feature importance chapter.

There is this paper: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-8-25

Moreover, there is the problem of correlated variables. If two variables are highly correlated, their feature importance could be massively decreased, because one variable can substitute for the other (for example when we grow a lot of trees). Here is a good blog post about it (there are surely some papers on this topic as well): https://freakonometrics.hypotheses.org/20545

Btw. you should cite Breiman, 2001, not 2011, as he already died in 2005. ;)

Model-agnostic Permutation Feature Importance

Hi Christoph, and thank you for the great book!

In the Permutation Feature Importance section you say:

"I haven’t found any paper that generalises permutation feature importance, so that it can be applied model-agnostic. Please drop me a mail if you know about model-agnostic feature importance."

But you propose a model-agnostic version of the algorithm in the same section. Did you forget to remove that couple of sentences? Or am I missing something?
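
For readers following this thread, a model-agnostic permutation importance can be sketched as below. This is a minimal illustration, not the book's or any package's implementation: the function names, the use of mean squared error, and the choice to report the difference in loss (a ratio is another option) are all assumptions made here.

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, feature_index, loss=mean_squared_error, seed=0):
    """Increase in loss after shuffling one feature column.

    `predict` is any callable that maps a feature row to a prediction,
    so nothing here depends on the type of model.
    """
    rng = random.Random(seed)
    baseline = loss(y, [predict(row) for row in X])
    column = [row[feature_index] for row in X]
    rng.shuffle(column)
    X_permuted = [
        row[:feature_index] + [value] + row[feature_index + 1:]
        for row, value in zip(X, column)
    ]
    return loss(y, [predict(row) for row in X_permuted]) - baseline


def toy_model(row):
    """Hypothetical model that only uses the first feature."""
    return 3.0 * row[0]


X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [3.0 * row[0] for row in X]

importance_used = permutation_importance(toy_model, X, y, feature_index=0)
importance_unused = permutation_importance(toy_model, X, y, feature_index=1)
print(importance_used > 0, importance_unused)
```

The feature the toy model ignores gets an importance of exactly zero, because shuffling it cannot change any prediction.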

Labels on Figure 6.9

The labels are inconsistent with the description in the preceding paragraph.

render_book error: Cannot open file '10-references.Rmd'

I'm progressing through your README file and upon executing line 34:

bookdown::render_book('', 'bookdown::gitbook')

I receive the following error:

Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") : cannot open file '10-references.Rmd': No such file or directory

I checked the 'manuscript' folder for the following file: '10-references.Rmd' but did not see it in the folder.

Thank you for your time!

Can't see whole table for 4.3.2.

Here is an image

This leads me to believe that the table is incomplete.

I'm using Chrome on a Mac. Also tested on Brave browser (firefox based I believe).
Maybe table is just too long!

Cheers

List local and global surrogate models (like LIME or tree surrogates) alternatives and software

I am extending all chapters with a section on software implementations and alternative algorithms (also with software implementations). The software can be any free and open source software: R, Python, Weka, ...

You can help out by posting links to papers and software implementations of local and global surrogate models (like LIME or tree surrogates) as comments to this issue.

Additions

Brilliant work.

Adding LIME advantages and disadvantages.

Explaining what is meant by sparse explanations - Shapley Value section.

Looking forward to your future work.

List alternatives and software for partial dependence plots

I am extending all chapters with a section on software implementations and alternative algorithms (also with software implementations). The software can be any free and open source software: R, Python, Weka, ...

You can help out by posting links to alternatives and software implementations of partial dependence plot algorithms as comments to this issue.

Explanation of how the weights are calculated in the 5.7.2.1 example

Very good book, thanks for writing it! :)

Just one issue: would you please give us some explanation of how the weights were calculated in the 5.7.2.1 example (second table, last column, after the probs)? I understand it is a distance between the generated sample text and the original sentence. Could you please give us some hints on how the distance between sentences (or sets of words) was calculated?

Sorry, this issue is addressed somewhere.
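
One simple weighting that fits this description is sketched below, under the assumption that the weight is one minus the proportion of words removed from the original sentence. The sentence and the function name are hypothetical, and the actual implementation may use a different distance.

```python
def proximity_weight(kept_mask):
    """Weight of a perturbed sample: 1 - (removed words / total words)."""
    removed = kept_mask.count(False)
    return 1 - removed / len(kept_mask)


# Hypothetical 7-word sentence, not taken from the book's data set.
sentence = "please visit my channel for more videos".split()

# Perturbed sample that drops the word "channel" (index 3).
mask = [True, True, True, False, True, True, True]
sample = [word for word, keep in zip(sentence, mask) if keep]

print(" ".join(sample))
print(round(proximity_weight(mask), 2))  # 0.86
```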

Prototypes/criticisms : add disadvantage

I fail to see the difference between the two concepts. In your example, in the top right figure with 0.316 MMD, the middle point is considered a prototype. But when you select another configuration, the central blob ends up as a criticism. Shouldn't the conclusion be to add another prototype? It appears to me that criticisms are very dependent on the prototypes selected. Isn't it dangerous for interpretability to use two opposite concepts when blobs of data can be one or the other depending on an arbitrary cut-off value?

List Shapley value alternatives and software

I am extending all chapters with a section on software implementations and alternative algorithms (also with software implementations). The software can be any free and open source software: R, Python, Weka, ...

You can help out by posting links to papers and software implementations of the Shapley value for machine learning as comments to this issue.

Create a pretty version of tree-to-rules graphic

The graphic that shows how a tree can be turned into decision rules is currently drawn by hand.
It would be great to have a pretty version of it, created on a computer.
The format can be SVG (preferred), png or jpg.
The filename is: rulefit.jpg

Add confidence interval formula to linear model chapter

Linear Regression:
A formula for the confidence interval of the estimated weights could be useful, especially to illustrate how its length depends on the number of instances.
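
For reference, the textbook confidence interval for an ordinary least squares weight is sketched below. The notation is an assumption and may not match the chapter's: $n$ instances, $p$ features, design matrix $X$ including an intercept column, and $\hat{\sigma}^2$ the estimated residual variance.

$$\hat{\beta}_j \pm t_{1-\alpha/2,\; n-p-1} \cdot SE(\hat{\beta}_j), \qquad SE(\hat{\beta}_j) = \sqrt{\hat{\sigma}^2 \left[(X^\top X)^{-1}\right]_{jj}}$$

Under the usual assumptions the diagonal entries of $(X^\top X)^{-1}$ shrink roughly like $1/n$, so the interval length decreases roughly like $1/\sqrt{n}$.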

Add GAMs to chapter Interpretable Models

Hi Christoph,

First of all a big bravo and thank you for writing such an interesting book and making it open to everyone!

I would suggest adding a section on generalized additive models. Although not very popular in the data science community, these methods are powerful and simple for balancing accuracy against interpretability.

You probably saw the Caruana talks about GAMs for healthcare; in any case, here is the link: https://www.microsoft.com/en-us/research/video/intelligible-machine-learning-models-for-healthcare/

Cheers,

Price prediction is 310.000 instead of 301.000 in Figure 5.20

The price is correct in figure text and also in the figure in the pdf version, but it is incorrect in online version.

Write a chapter about your favorite deep learning interpretability method

There are many interpretability methods for deep neural networks.
The field is very young and most methods are quite new and under development. It's hard for a single person to keep track. This issue serves as a placeholder for interpretability methods specific for deep learning.

Leave a comment if you are interested in adding a chapter. Any method that helps interpreting neural networks is interesting. A good starting point is to take a paper or some code library and explain it in simpler words, possibly with an example.

Some possible starting points:

Software:

Scientific Paper:

Leave a comment if you are interested in writing a chapter!

Decision Trees Chapter

In the example in 4.4.2 you talk about purity (in terms of the Gini index).
But you don't reference it anywhere else, which might confuse the reader.

I have some background so I know what you are talking about, but others will not!

So, either link to an explanation of the Gini index and what purity means,
or just give the intuition that purity is relative to the subsets, and that a subset is impure if it contains instances that do not "belong" to that group.
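
The intuition can be made concrete with a short sketch of the Gini impurity, one minus the sum of squared class proportions in a subset. The class labels below are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 means pure)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


pure_node = ["cancer"] * 4                                # every instance belongs to one class
mixed_node = ["cancer", "healthy", "cancer", "healthy"]   # evenly mixed subset

print(gini_impurity(pure_node))   # 0.0
print(gini_impurity(mixed_node))  # 0.5
```

A split is good when the resulting subsets have low impurity, i.e. when few instances sit in a group where they do not "belong".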

As always, keep up the good work,
Pedro

Inconsistent value in caption and chart

Hi,
Caption to this chart:

interpretable-ml-book/manuscript/05.3-agnostic-ice.Rmd

Line 91 in 062d1e2

    
           ```{r ice-cervical-centered, fig.cap=sprintf("Centered ICE plot for predicted  cancer probability by age. Lines are fixed to 0 at age %i. Compared to age %i, the predictions for most women remain unchanged until the age of 45 where the predicted probability increases.", min(cervical_subset$Age), min(cervical_subset$Age))}

seems to be ok ('Lines are fixed to 0 at age 13. Compared to age 13'), but it is inconsistent with y-axis label: 'Cancer probability difference to age 18'.
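
The mismatch is easier to see with a sketch of how a centered ICE curve is formed: the prediction at an anchor value is subtracted from the whole curve, so the caption and the axis label must name the same anchor. The grid, the predictions and the function name below are hypothetical.

```python
def center_ice_curve(grid, predictions, anchor):
    """Subtract the prediction at the anchor grid value from the whole curve."""
    anchor_prediction = predictions[grid.index(anchor)]
    return [p - anchor_prediction for p in predictions]


ages = [13, 18, 30, 45, 60]              # hypothetical grid of Age values
curve = [0.02, 0.02, 0.03, 0.05, 0.09]   # hypothetical predictions for one woman

centered_at_min = center_ice_curve(ages, curve, anchor=min(ages))  # anchor 13, as in the caption
centered_at_18 = center_ice_curve(ages, curve, anchor=18)          # anchor 18, as in the axis label

print(centered_at_min)
print(centered_at_18)
```

Deriving both the caption and the axis label from the same anchor variable would keep them in sync.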

PDF version

Hi Christoph,

lovely, thanks a lot for the book.
I might sound semi-old asking for the possibility to build a PDF version of your book, but in order to read and make notes on e.g. an iPad or other tablet this would make a lot of sense to me.

Is there any way to do so?

epub version

An ePub version could be easily supported by adding the option to the _output.yml, e.g.:

bookdown::epub_book:
  dev: svglite

An example can be seen here: https://github.com/rstudio/bookdown/blob/master/inst/examples/_output.yml
Rendering seems to work fine with standard settings.

List decision rule algorithm alternatives and software

I am extending all chapters with a section on software implementations and alternative algorithms (also with software implementations). The software can be any free and open source software: R, Python, Weka, ...

You can help out by posting links to papers and software implementations of decision rule algorithms as comments to this issue.

Unclear formulation: sensitive M

The explanation of the Monte-Carlo approximation of the Shapley value says "It is unclear how to choose a sensitive M".

I'm not entirely sure what you mean by sensitive here.

Semi-related question: couldn't the correlation problem be solved by running a PCA first? (orthogonalize to the feature of interest, then run PCA on the remaining columns).

List logistic regression alternatives and software

I am extending all chapters with a section on software implementations and alternative algorithms (also with software implementations). The software can be any free and open source software: R, Python, Weka, ...

You can help out by posting links to papers and software implementations of logistic regression algorithms as comments to this issue.

Friedman’s H-statistic formula

In https://christophm.github.io/interpretable-ml-book/interaction.html#theory-friedmans-h-statistic

The two formulas under

In mathematical terms, the H-statistic for the interaction between feature [...] proposed by Friedman and Popescu is:

should also have a ^2 (i.e. be squared) around the square brackets in the numerator, when compared to the paper, I believe.
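
With the square in place, the two statistics would read as follows. The notation is an assumption: $PD$ denotes centered partial dependence functions, $\hat{f}$ the centered prediction function, and $n$ the number of data points.

$$H^2_{jk} = \frac{\sum_{i=1}^{n}\left[PD_{jk}(x_j^{(i)}, x_k^{(i)}) - PD_j(x_j^{(i)}) - PD_k(x_k^{(i)})\right]^2}{\sum_{i=1}^{n} PD_{jk}^2(x_j^{(i)}, x_k^{(i)})}$$

$$H^2_{j} = \frac{\sum_{i=1}^{n}\left[\hat{f}(x^{(i)}) - PD_j(x_j^{(i)}) - PD_{-j}(x_{-j}^{(i)})\right]^2}{\sum_{i=1}^{n} \hat{f}^2(x^{(i)})}$$

Without the square, positive and negative deviations in the numerator could cancel, and the ratio would not be a share of variance.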

"Improve" is written twice in the last part of 6.2

Empty _book directory after build

After installing all packages and running all commands recommended in the README file, I get no error messages in the console, but there are no resulting HTML files in the _book directory.

What would you recommend looking at (for an R beginner)?

Add acknowledgements

Chapter contributors (Abi, Verena)
smaller fixes (go through PRs)
Cover (Yvonne)
Images (Shapley: Abi, flaticon; future: this Japanese website)
Funding from ZD.B
Early readers

Fix centering of ice plots

should all be centered at x = 0.

Maybe the problem is with the iml package.

Error in the RuleFit -> Guidelines -> Disadvantages

In the Disadvantages section of the RuleFit Guidelines you write:

"For example one decision rule (feature) for the bike prediction could be: “temp > 15” and another rule could be “temp > 10 & weather=‘GOOD’”. When the weather is good and the temperature is above 10 degrees, the temperature is automatically also always bigger then 15, which means in the cases where the second rule applies, the first one also always applies."

I think that you have swapped the numbers around and the rules should be:

temp > 10, and
temp > 15 & weather=‘GOOD’.

temp > 10 & weather=‘GOOD’ does not imply temp > 15, e.g. weather=‘GOOD’ & temp = 13.
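
The implication can be checked by brute force over a grid of cases. The sketch below is only an illustration; the rule functions and the grid of temperatures are hypothetical.

```python
def temp_gt_10(temp, weather):
    return temp > 10


def temp_gt_15(temp, weather):
    return temp > 15


def temp_gt_10_and_good(temp, weather):
    return temp > 10 and weather == "GOOD"


def temp_gt_15_and_good(temp, weather):
    return temp > 15 and weather == "GOOD"


def implies(rule_a, rule_b, cases):
    """True if rule_b holds in every case where rule_a holds."""
    return all(rule_b(*case) for case in cases if rule_a(*case))


cases = [(temp, weather) for temp in range(0, 31) for weather in ("GOOD", "BAD")]

# Rules as written in the book: the claimed implication fails, e.g. for temp = 13.
print(implies(temp_gt_10_and_good, temp_gt_15, cases))  # False
# Rules with the numbers swapped: the second rule implies the first.
print(implies(temp_gt_15_and_good, temp_gt_10, cases))  # True
```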

Some pointers:

Leave a comment if you are interested in writing a chapter about a tree ensemble-specific method!

Add subset (m) to notation in Shapley value chapter

$x^m_{∗+j}$ instead of $x^{∗+j}$
and $x^m_{∗-j}$ instead of $x^{∗-j}$
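
To show where the index $m$ enters, here is a minimal sketch of a Monte Carlo Shapley estimate in the style of sampling-based approximation. It is not the book's code: the function name, arguments and toy model are assumptions. In each of the `M` iterations a new pair of instances, with and without feature `j` taken from `x`, is built.

```python
import random


def shapley_monte_carlo(predict, x, data, j, M, seed=0):
    """Average of f(x_plus) - f(x_minus) over M random draws."""
    rng = random.Random(seed)
    p = len(x)
    total = 0.0
    for _ in range(M):
        z = rng.choice(data)             # random instance from the data
        order = list(range(p))
        rng.shuffle(order)               # random feature order
        before_j = set(order[:order.index(j)])
        # x_plus: features before j in the order and j itself come from x, the rest from z.
        x_plus = [x[k] if (k in before_j or k == j) else z[k] for k in range(p)]
        # x_minus: the same instance, but feature j is also taken from z.
        x_minus = [x[k] if k in before_j else z[k] for k in range(p)]
        total += predict(x_plus) - predict(x_minus)
    return total / M


def linear_model(row):
    """Hypothetical model used only for illustration."""
    return 2.0 * row[0] + 3.0 * row[1]


data = [[1.0, 0.0], [1.0, 4.0]]   # feature 0 is constant at 1.0 in this toy data
x = [3.0, 5.0]

phi_0 = shapley_monte_carlo(linear_model, x, data, j=0, M=50)
print(phi_0)  # 4.0, i.e. 2.0 * (3.0 - 1.0)
```

For a linear model each draw contributes the weight times the difference between `x[j]` and `z[j]`, which makes the toy result easy to verify by hand.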

Write a paragraph on how Shapley fares on image datasets.

Write a paragraph on how Shapley fares on images. See page 196 of Image Analysis and Recognition: 12th International Conference, ICIAR 2015 for an example.
