interpretml / interpret

Fit interpretable models. Explain blackbox machine learning.

Home Page: https://interpret.ml/docs

License: MIT License

Batchfile 0.20% Shell 2.15% C++ 59.94% Python 33.65% CSS 0.45% JavaScript 0.28% Jupyter Notebook 0.27% C 1.27% R 0.79% Makefile 0.13% SCSS 0.04% Cuda 0.52% Dockerfile 0.02% TypeScript 0.29%
machine-learning interpretability gradient-boosting blackbox scikit-learn xai interpretml interpretable-machine-learning interpretable-ai transparency

interpret's Issues

Feature importance with interactions

Thank you for giving a detailed explanation on feature importance on this page (#12). My question is what if the EBM model has interactions (say, I use the argument interactions = 10 when training the EBM model ExplainableBoostingRegressor, so it has 10 pairwise interactions).

In particular, for a given feature, if some interactions containing it are selected, would those be involved in the calculation of "how each train data point is scored by that feature"? Would this affect the global interpretation of the model, and/or the local interpretation?

Thank you.
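
One way to picture the question: in an additive model with pairwise terms, each selected interaction can be treated as its own term with its own per-sample score, separate from the main effects of the features it contains. The sketch below assumes that convention. The term names, shape functions, and rows are invented for illustration and are not taken from the library. It computes a per-term importance as the mean absolute score over the training rows.

```python
from statistics import mean

# Toy additive model: two main-effect terms and one pairwise interaction term.
# All three shape functions are hypothetical.
terms = {
    "age": lambda row: 0.5 if row["age"] > 40 else -0.5,
    "hours": lambda row: 0.02 * (row["hours"] - 40),
    "age x hours": lambda row: 0.3 if row["age"] > 40 and row["hours"] > 45 else 0.0,
}

def term_importances(rows):
    """Mean absolute score of each term over the given rows."""
    return {name: mean(abs(f(row)) for row in rows) for name, f in terms.items()}

def local_scores(row):
    """Per-term contributions for one row; their sum (plus an intercept) is the raw score."""
    return {name: f(row) for name, f in terms.items()}

rows = [
    {"age": 30, "hours": 40},
    {"age": 50, "hours": 50},
    {"age": 45, "hours": 38},
    {"age": 25, "hours": 60},
]
importances = term_importances(rows)
```

Under this assumption, the interaction gets its own entry in both the global and the local view, and the main-effect scores for "age" and "hours" are not changed by it.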

EBM inside LightGBM?

Hi!

Maybe this is a silly question (mostly because codebases are obviously different), but what do you think about integrating this library with LightGBM, since it is also a top-notch gradient boosting library by Microsoft?

Now that the project is small, this should be easier than trying to do the same later. In the long term, maintenance costs should go down IMHO...

BTW great job with this library. I absolutely love your approach.

scoring on the continuous feature looks unreal

This graphic shows the scoring on a continuous variable. Unlike a spline regression, the score is very volatile and may be overfitting local observations. I tried to tweak the fitting parameters to get something smoother, but without success. What do you recommend?

graphic

amazing work

(ok not an issue)
really :) had to properly register it
huge congrats

worked without issues on windows also

requirements.txt doesn't have all the requirements

The README says:

pip install numpy scipy pyscaffold
pip install -U interpret

I am confused why the project requirements.txt (or equivalent) doesn't just also have numpy, scipy, pyscaffold, if they are .. well, required.

If they're already in the environment, they won't be installed, so that's harmless.

I see in setup.py they are listed separately, with the comment:

  # NOTE: Numpy here is a workaround to skope-rules' dependencies.

so I assume you have your reasons?

Doing some treatment for missing values

Can we add some random values, or treat missing data as "null"? That would avoid an error during data exploration with:
hist = ClassHistogram().explain_data(X_train, y_train, name = 'TrainData')
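
Until something like this is built in, missing values can be filled before the data reaches explain_data. The following is a minimal sketch in plain Python, not the library's behavior. The helper names and the "null" sentinel are assumptions. Numeric columns are filled with their median, and every other column with the sentinel label.

```python
import math
from statistics import median

def is_missing(value):
    """True for None and float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def fill_column(values, sentinel="null"):
    """Fill a numeric column with its median, any other column with a sentinel label."""
    present = [v for v in values if not is_missing(v)]
    numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    fill = median(present) if numeric else sentinel
    return [fill if is_missing(v) else v for v in values]
```

For example, fill_column([1.0, float("nan"), 3.0]) gives [1.0, 2.0, 3.0], and fill_column(["a", None, "b"]) gives ["a", "null", "b"].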

Show function while on the remote cloud

Sorry if I make any grammatical errors (I'm not a native speaker).

I've read this issue: #17

I think my problem is very similar; it might be some kind of server issue.

I have a virtual environment on an Amazon AWS EC2 instance. I started Jupyter Notebook there and connect to it from my local machine via SSH.

But when I use the show function (like the one in "Explaining Blackbox Classifiers.ipynb"), the plot shows "127.0.0.1 refused to connect".

Is there any way to still plot the interactive version in the cloud Jupyter?
I can successfully plot an interactive one in a local Jupyter, but on the cloud it just fails.
Thank you for your attention!

"Error loading dependencies" in show() method

Hi guys, thanks for this great contribution.

Each time I use the method 'show()' I get the following error:
"Error loading dependencies"

Examples:

ebm_global = ebm.explain_global(name='EBM')
show(ebm_global)  # "Error loading dependencies"

I'm using Python 3.7

Thanks in advance,

Nelson

Question on Calibration

I'm testing the classifier algorithm on a dataset (unfortunately confidential) with a binary target (70,000 rows and about 40 predictors) and seeing that while the rank ordering is competitive with other tree based methods, the predictions seem poorly calibrated - even on the training data itself. The prediction is always lower than the actual. I am wondering if there might be a cause based on the algorithm that could be tuned or if this has been seen in development?

The model is trained using the default settings (I have tweaked multiple parameters and not found any impact)

ebm = ExplainableBoostingClassifier(n_estimators=16,interactions=0,n_jobs=10)
ebm.fit(X,y)

The prediction is made on the training data.

p=ebm.predict_proba(X)[:,1]
print(np.mean(p))  # THIS IS 0.023

I rank the predictions into deciles (10% bins) and plot the actual target rate and the mean prediction probability for each decile. The rank order is good, AUC is high (this is the training data of course) but we underpredict systematically. The red horizontal line is the overall mean of the training data which is significantly higher than the mean prediction noted above (0.1 versus 0.02)

image
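
The decile comparison described above can be reproduced without any plotting library. This is a minimal sketch, assuming y_true holds 0/1 labels and y_prob the predicted probabilities; the function name is hypothetical.

```python
def calibration_by_bin(y_true, y_prob, n_bins=10):
    """Sort by predicted probability, split into n_bins near-equal groups,
    and return (mean predicted probability, observed positive rate) per group."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    table = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer samples than bins
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_pred, obs_rate))
    return table
```

A model that underpredicts systematically, as in the report above, shows a mean predicted probability below the observed rate in every row of the table.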

Using interpretML in a python script

Hi!
I was wondering if it's possible to use the library in a Python script. I tried this for SHAP plots, expecting interpretML to produce a link to a dashboard:

shap = ShapKernel(predict_fn=bbc.predict, data=background_val, feature_names=feature_names)
shap_local = shap.explain_local(most_probable.drop(['prediction','probabiltiy','target'],axis=1), most_probable.target, name='SHAP')
show([shap_local])

The last line produces:
{'text/html': '<!-- http://127.0.0.1:7974/140466213906184/ -->\n<a href="http://127.0.0.1:7974/140466213906184/" target="_new">Open in new window</a><iframe src="http://127.0.0.1:7974/140466213906184/" width=100% height=800 frameBorder="0"></iframe>'} as in jupyter notebook.

However, when I open the link it says this site can't be reached.
Is there a way to open dashboards from the console, or is the library meant to be used exclusively in Jupyter?
Thanks!

Required by libstdc++.so.6

Found Error:
libstdc++.so.6: version CXXABI_1.3.8

My libstdc++.so.6 does not provide CXXABI_1.3.8; it only has the versions below:

CXXABI_1.3
CXXABI_1.3.1
CXXABI_1.3.2
CXXABI_1.3.3
CXXABI_1.3.4
CXXABI_1.3.5
CXXABI_1.3.6
CXXABI_1.3.7
CXXABI_TM_1

This happens when I try the basic example:

 from interpret.glassbox import ExplainableBoostingClassifier
 ebm = ExplainableBoostingClassifier()
 ebm.fit(X_train, y_train)

My Server:
Centos7.5 ( Aliyun, Qcloud both tried)
gcc/g++ 4.8.5
libstdc++.so.6.0.19

Solution:
Copy a libstdc++.so.6.0.24 from somewhere and make libstdc++.so.6 a symlink to it.

Of course this is not a problem with interpretML, but it is still trouble for a newbie.
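
To check which CXXABI versions a given libstdc++ provides before installing, the version tags can be read straight out of the shared object. This is a sketch of that check in Python. It scans the file for CXXABI_ strings, which is an assumption about how the tags are stored rather than a proper ELF parse, and the library path is system-specific and must be supplied by the caller.

```python
import re

def cxxabi_versions(blob: bytes):
    """Return the sorted, de-duplicated CXXABI version tags found in a binary blob."""
    tags = set(re.findall(rb"CXXABI_[A-Z0-9_.]+", blob))
    return sorted(tag.decode("ascii") for tag in tags)

def cxxabi_versions_in_file(path):
    """Read a shared object from disk and list its CXXABI tags."""
    with open(path, "rb") as fh:
        return cxxabi_versions(fh.read())
```

If "CXXABI_1.3.8" is not in the returned list, the error above is to be expected on that machine.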

Dashboard API

Hi all,

Thanks so much for your work on this package!

Here I'm requesting a Dashboard API -- maybe just a more transparent interface with the Dash backend so I can customize how the predictions and explanations are served. For example, I want to serve/access individual local explanations as .png's remotely by using some row ID. If this can be done easily on my end, please excuse my bothering you all!

Feature importance using Permutation

Hi,
At work, we use the "Permutation Importance" method to inspect feature importance.
We use the awesome library eli5 for that.

Would it be possible to include a version of that in this library?
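
For reference, the technique itself is small. The sketch below is a generic, model-agnostic version in plain Python, not eli5's or this library's implementation. It assumes predict scores one row (a list) at a time and that a higher metric value is better; all names are hypothetical.

```python
import random

def permutation_importance(predict, X, y, metric, n_repeats=5, seed=0):
    """Average drop in the metric when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = metric(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            score = metric(y, [predict(row) for row in X_perm])
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return importances

def accuracy(y_true, y_pred):
    """Fraction of matching labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

A column the model never reads gets an importance of exactly zero, because shuffling it cannot change any prediction.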

Example of an interaction term from GA2M (a.k.a. EBM)?

GAMs are non-linear terms per feature, combined in a linear way.
GA2Ms also include pairwise interactions, chosen in a heuristically efficient way with FAST.

If I use an explainable boosting classifier/regressor, how can I tell whether it considered interaction terms?

Can you document an example where interaction terms are used, including graphs?

Thanks.
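
For readers new to the terminology, the description above corresponds to the usual GA2M form. The symbols are notation chosen here, not taken from the issue: g is the link function, f_j are the per-feature shape functions, and S is the set of selected feature pairs.

```latex
g\big(\mathbb{E}[y]\big) = \beta_0 + \sum_{j} f_j(x_j) + \sum_{(i,j) \in \mathcal{S}} f_{ij}(x_i, x_j)
```

A plain GAM is the special case where S is empty, so a model with no selected pairs has no interaction terms to plot.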

interpretation of local feature importances

Hi,
thank you for this nice package.

It would be helpful if the examples were extended with some documentation on how the visualized results are computed.

For instance, what puzzles me is the computation of the local feature importance values in relation to the predicted value considering the Explainable Boosting Classifier.

How do they relate to each other? Is there a relation between them (as the model is a GAM I would expect as a first guess some linear relation, but it doesn't add up)?

My apologies if the question reveals some ignorance on my side to standard literature.

Kind regards,

Tomas

##########
EDIT below:

By trial and error, I believe I found the solution: the probability score given by the Explainable Boosting Classifier is computed as:

import numpy as np

def compute_p(rs):
    # Logistic function: maps the summed raw score to a probability.
    return 1./(1. + np.exp(-rs))

t = np.sum(ebm_local._internal_obj['specific'][0]['scores']) + ebm_local._internal_obj['specific'][0]['extra']['scores']
p = compute_p(t)
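
A small worked example of the same relation, using made-up numbers rather than values from a real model: the per-term scores add up in raw-score space, and only the sum is passed through the logistic function. That is why the local values do not add up to the predicted probability directly.

```python
import math

def compute_p(raw_score):
    """Logistic link: map an additive raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-raw_score))

# Hypothetical per-term scores and intercept for one sample.
term_scores = [0.8, -0.3, 0.1]
intercept = -1.2

raw = sum(term_scores) + intercept  # additive in raw-score space
p = compute_p(raw)                  # below 0.5 because raw is negative
```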

[BUG] Unexpected Error Message in the notebook

When running the Interpretable Classification Methods notebook, if explain_global or explain_local is called on ExplainableBoostingClassifier without fitting the model first, the NotFittedError is not raised. Instead an AttributeError is raised.

Similarly, when running the Interpretable Regression Methods notebook, if explain_global or explain_local is called on ExplainableBoostingRegressor without fitting the model first, the NotFittedError is not raised. Instead an AttributeError is raised.
interpret_error
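
The expected behavior could look like the sketch below. It uses a toy class and a stand-in exception so that it runs without scikit-learn. The attribute name has_fitted_ and the class itself are hypothetical and say nothing about how the library tracks its fitted state.

```python
class NotFittedError(ValueError, AttributeError):
    """Stand-in for sklearn.exceptions.NotFittedError, which has the same base classes."""

class ToyModel:
    def fit(self, X, y):
        # Mark the model as fitted; a real model would learn from X and y here.
        self.has_fitted_ = True
        return self

    def explain_global(self):
        # Fail early with a clear message instead of an AttributeError later on.
        if not getattr(self, "has_fitted_", False):
            raise NotFittedError("Call fit before explain_global.")
        return "explanation"
```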

Bump dash version

dash 1.0 was released on June 20, 2019 and 1.1 on August 5, 2019.

The currently supported dash==0.39 is from March 5, 2019.

Range-based instead of exact dependencies would help when building environments where other packages than interpret are installed.

NaNs cause hanging issue while training

Hey there, thank you for this awesome project!

At first I thought this was an issue with joblib but found out it had to do with NaNs in my dataframe.

Problem: when I try and train using NaNs the job basically just hangs and is pegging only one core. It seems like a better case would be to return an error or something because I had no idea it was just hanging since #7 isn't implemented yet.

image

I am running Fedora 30 with python 3.7.3.

Perhaps a simple check for NaNs in the array would be useful as an error to prevent this from happening? Lemme know how I can help and thank you! Great speech at Strata btw

Thank you

-Matt
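
A check along the lines suggested above could be as small as this. It is a sketch in plain Python over rows of values; the function name is hypothetical, and a real implementation would more likely use a vectorized NumPy check.

```python
import math

def assert_no_nan(rows):
    """Raise ValueError at the first NaN instead of letting training hang."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if isinstance(value, float) and math.isnan(value):
                raise ValueError(f"NaN found at row {r}, column {c}")
```

Calling it on the training data just before fit turns the silent hang into an immediate error that points at the offending cell.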

OSError: exception: access violation reading 0x000002ABA0141000

Hi dear InterpretML Team,

I'm having this issue: OSError: exception: access violation reading 0x000002ABA0141000. This issue is with n_jobs=-1, when I set n_jobs=1 I got the same issue but access violation reading 0x0000026108C46000. I have no idea how to fix it. Seems that it comes from joblib! Here is what I'm trying to do on Jupyter notebook:

ebm = ExplainableBoostingClassifier(n_jobs=-1)
ebm.fit(x_train, y_train)
preds_interpret = ebm.predict_proba(x_test)

And here the traceback:

Traceback (most recent call last):
  File "C:\anaconda\lib\site-packages\joblib\externals\loky\process_executor.py", line 418, in _process_worker
    r = call_item()
  File "C:\anaconda\lib\site-packages\joblib\externals\loky\process_executor.py", line 272, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "C:\anaconda\lib\site-packages\joblib\_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "C:\anaconda\lib\site-packages\joblib\parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "C:\anaconda\lib\site-packages\joblib\parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "C:\anaconda\lib\site-packages\interpret\glassbox\ebm\ebm.py", line 789, in train_model
    return estimator.fit(X, y)
  File "C:\anaconda\lib\site-packages\interpret\glassbox\ebm\ebm.py", line 386, in fit
    validation_scores=None,
  File "C:\anaconda\lib\site-packages\interpret\glassbox\ebm\internal.py", line 376, in __init__
    self._initialize_training_classification()
  File "C:\anaconda\lib\site-packages\interpret\glassbox\ebm\internal.py", line 461, in _initialize_training_classification
    self.num_inner_bags,
OSError: exception: access violation reading 0x000002ABA0141000

how to graph without spinning a local web-server

Is this API available yet? We can't seem to plot it locally with plotly offline.

Hi @dfrankow, thanks for the issue! We're just about to introduce a few new API changes that should make this easier in our next release. One, we'll let you specify a port in the show method, so that you can pick your own port that you know is open. Second, we'll introduce a new function that doesn't spin up the local web-server, and directly uses plotly to visualize it. For now, here are a few notes:

visualize() does return a plotly object, and you can use plotly.offline so that you don't need an api key. And yes, if you pass in a key to visualize() , you can get a specific graph back out!

If you run this code at the top of your notebook:

from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)

you can then use "iplot(plotly_figure)" in your notebook to get a direct plotly graph. We'll have a nicer API around this soon!

Originally posted by @interpret-ml in #1 (comment)

"127.0.0.1 refused to connect" error after re-opening Jupyter notebook

Thanks for this library!

I have been able to run everything as-is by following the README.md. However, when I re-open the Jupyter notebook I have been using, instead of the images created with the show(hist) and show(ebm_perf) commands I get the following error: 127.0.0.1 refused to connect.

I have read the other issues, and a similar problem came up in #14, but I couldn't figure out how to fix it myself. The same happens if I download the Jupyter notebook in .html format and open it after closing the Jupyter notebook connection.

Can anyone shed some light on it?

Could not find open port

I am getting the error "RuntimeError: Could not find open port" when I use the show command. Is there a way to set the IP address and port number used to run the dashboard? Thanks

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv(
    "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data",
    header=None)
df.columns = [
    "Age", "WorkClass", "fnlwgt", "Education", "EducationNum",
    "MaritalStatus", "Occupation", "Relationship", "Race", "Gender",
    "CapitalGain", "CapitalLoss", "HoursPerWeek", "NativeCountry", "Income"
]
train_cols = df.columns[0:-1]
label = df.columns[-1]
X = df[train_cols]
y = df[label].apply(lambda x: 0 if x == " <=50K" else 1) #Turning response into 0 and 1

seed = 1
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=seed)

from interpret import show
from interpret.data import ClassHistogram

hist = ClassHistogram().explain_data(X_train, y_train, name = 'Train Data')
show(hist)


RuntimeError Traceback (most recent call last)
in
3
4 hist = ClassHistogram().explain_data(X_train, y_train, name = 'Train Data')
----> 5 show(hist)

~/anaconda/envs/fastai/lib/python3.7/site-packages/interpret/visual/interactive.py in show(explanation, share_tables)
120 except Exception as e: # pragma: no cover
121 log.error(e, exc_info=True)
--> 122 raise e
123
124 return None

~/anaconda/envs/fastai/lib/python3.7/site-packages/interpret/visual/interactive.py in show(explanation, share_tables)
110 # Initialize server if needed
111 if this.app_runner is None: # pragma: no cover
--> 112 init_show_server(this.app_addr)
113
114 # Register

~/anaconda/envs/fastai/lib/python3.7/site-packages/interpret/visual/interactive.py in init_show_server(addr, base_url, use_relative_links)
83 log.debug("Create app runner at {0}".format(addr))
84 this.app_runner = AppRunner(
---> 85 addr, base_url=base_url, use_relative_links=use_relative_links
86 )
87 this.app_runner.start()

~/anaconda/envs/fastai/lib/python3.7/site-packages/interpret/visual/dashboard.py in __init__(self, addr, base_url, use_relative_links)
58 msg = "Could not find open port"
59 log.error(msg)
---> 60 raise RuntimeError(msg)
61 else:
62 self.ip = addr[0]

RuntimeError: Could not find open port
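
To find out whether the machine has any free local port at all, the operating system can be asked directly. This is a minimal standard-library sketch. The function name is hypothetical, and whether the port it returns can be handed to the dashboard depends on the installed version of the library.

```python
import socket

def find_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port on host by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]
```

The port is released when the function returns, so another process could claim it before it is reused. If even this call fails, the problem is likely with the environment (firewall, container, or sandbox restrictions) rather than with the library.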

EBM parameter search with Ray Tune

Hi!

I'm using Tune to search for optimal set of hyperparameters for different models. It works without issues for CatBoost and Keras models, however, I haven't been able to successfully run it with EBM on more than one CPU, let alone on GPU (it seems that EBM can't be run on GPU at all). When I haven't explicitly set n_jobs=1, I get this warning:

UserWarning: Loky-backed parallel loops cannot be nested below threads, setting n_jobs=1

Do you know whether I'm missing something, or is it not possible to run EBM with Tune on multiple CPUs?

Code to reproduce

import ray
from ray import tune

ray.init(ignore_reinit_error=True)

import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier

def train_ebm(config, reporter):
    n_samples = 100000
    x_train = np.random.rand(n_samples, 3)  # Random training data with 3 features
    y_train = np.random.randint(0, 2, n_samples)  # Random binary labels
    
    model = ExplainableBoostingClassifier(
        learning_rate=config['lr'], 
        n_jobs=12
    )
    
    model.fit(x_train, y_train)
    
    reporter()

ebm_experiment = tune.Experiment(
    name='ebm_test', 
    run=train_ebm, 
    num_samples=1,
    resources_per_trial={
        'gpu': 0,
        'cpu': 12
    },
    config={
        'lr': 0.01,
    }
)

trials = tune.run_experiments(ebm_experiment)

Versions
ray: 0.7.3
interpret: 0.1.15

Bug in Notebook for reproducing table

In the UCI heart disease dataset used by the load_heart_dataset function in the notebook, the label is in the second-to-last column, not the last column. The last column in the dataset corresponds to a location variable (and takes values such as 'Hungary', 'VA', etc.).

How to get categorical variables in the graph?

I am opening an issue because I am trying to reproduce the results obtained in this graph:

image

I would like to show the categorical features as text, as in the image. However, when I try it myself, I get an error unless I convert the categorical values to numeric, so I cannot reproduce the graph. It shows the numbers the variable was encoded with instead:

image

Do you know why this is happening and what I should do to get the text? Thanks

Weights on data

Hi!

First thanks for your amazing work.
Would it be possible to take into account weights on the data samples, similar to what is done in a few models of sklearn?
And a last unrelated question: will you consider the option of adding alternatives for fitting GAMs (other than boosting, for instance using splines)?

Thanks,

Use graphs in a Jupyter notebook?

Thanks for this library.

I'm following along with the README.md and got to:

from interpret import show

ebm_global = ebm.explain_global()
show(ebm_global)

When I run that in my Jupyter notebook I get: RuntimeError: Could not find open port.

Maybe it's trying to run a web server from a notebook?

Can I just make the individual graphs in the notebook? How?

I see functions in interpret.visual.plot, but I'm having a bit of trouble finding the right objects to pass to it.

Install Fails - Greenlet

I am trying to pip install interpret and currently the following error appears to be stopping the install. Any ideas for getting around this and proceeding?

I am using Ubuntu 16.04
Cannot uninstall 'greenlet'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

Plotting Questions

All-
I fit a model : ebm.fit(X,y). Then I am able to see overall feature importance running:

import plotly 
plotly.tools.set_credentials_file(username='XXX', api_key='123')
import plotly.plotly as plotly_py
ebm_global = ebm.explain_global()
plotly_py.iplot(ebm_global.visualize())

Question 1: Is there a way to see more than the default top features?

Then if we pass an index into visualize, we get the shape of the effect of that feature:

plotly_py.iplot(ebm_global.visualize(0))

I can see local explanations:

ebm_local = ebm.explain_local(X,y)
plotly_py.iplot(ebm_local.visualize(20))

Question 2: Is it possible to get this data back without plotting it?

Question 3:

How does the above differ from show()? I have been unable to get this to work but it looks like the only difference is a drop down selector?

Installation failed on Windows 10

Installation failed on Windows 10
It is a shame Microsoft cannot develop a package that can be installed on Windows.

error message is

  Found existing installation: greenlet 0.4.12
Cannot uninstall 'greenlet'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

all installation history

Microsoft Windows [Version 10.0.17763.557]
(c) 2018 Microsoft Corporation. All rights reserved.

C:\Users\sndr\Downloads>pip install numpy scipy pyscaffold
Requirement already satisfied: numpy in c:\users\sndr\anaconda3\lib\site-packages (1.15.4)
Requirement already satisfied: scipy in c:\users\sndr\anaconda3\lib\site-packages (1.0.0)
Collecting pyscaffold
  Downloading https://files.pythonhosted.org/packages/d3/3f/0ce77998683cb7967ba7d98b114b8a6a954a731b812f455dee57f1636853/PyScaffold-3.1-py3-none-any.whl (163kB)
    100% |████████████████████████████████| 174kB 75kB/s
Requirement already satisfied: setuptools>=38.3 in c:\users\sndr\anaconda3\lib\site-packages (from pyscaffold) (40.8.0)
Installing collected packages: pyscaffold
Successfully installed pyscaffold-3.1

C:\Users\sndr\Downloads>pip install -U interpret
Collecting interpret
  Downloading https://files.pythonhosted.org/packages/d8/0c/3b4b55e69dad95131126ffb3eaa7a8b2f43e7796775aa5dd8123531fab8a/interpret-0.1.9-py3-none-any.whl (4.1MB)
    100% |████████████████████████████████| 4.1MB 3.3MB/s
Collecting pytest>=4.3.0 (from interpret)
  Downloading https://files.pythonhosted.org/packages/b3/eb/df264c0b1ff4aaf263375dc09aabd9093364f66060be9b26f3a2c166d558/pytest-4.6.3-py2.py3-none-any.whl (229kB)
    100% |████████████████████████████████| 235kB 6.0MB/s
Collecting gevent>=1.4.0 (from interpret)
  Downloading https://files.pythonhosted.org/packages/51/97/2e1e8aa7ea27171c3e249480d382e78b49ab4cead5dafb2124d2a1b58a83/gevent-1.4.0-cp36-cp36m-win_amd64.whl (3.0MB)
    100% |████████████████████████████████| 3.0MB 3.8MB/s
Collecting skope-rules>=1.0.0 (from interpret)
  Downloading https://files.pythonhosted.org/packages/56/b0/b56fb8d186f35089a469dc788c32ac99cf0276eae567736325b179b71db0/skope-rules-1.0.0.tar.gz (2.0MB)
    100% |████████████████████████████████| 2.0MB 3.4MB/s
Collecting dash==0.39.0 (from interpret)
  Downloading https://files.pythonhosted.org/packages/38/c0/353ba9f56f171389f0b4985f0481805219fc1921d651586c51345b89c1ea/dash-0.39.0.tar.gz (40kB)
    100% |████████████████████████████████| 40kB 2.6MB/s
Requirement already satisfied, skipping upgrade: nbconvert>=5.4.1 in c:\users\sndr\anaconda3\lib\site-packages (from interpret) (5.4.1)
Collecting dash-table-experiments==0.6.0 (from interpret)
  Downloading https://files.pythonhosted.org/packages/6f/4a/e201fe7419a250c35635fb0b81f3cba8cf19ed4e3663fda6cd08e7bd0655/dash_table_experiments-0.6.0.tar.gz (738kB)
    100% |████████████████████████████████| 747kB 4.5MB/s
Collecting dash-renderer==0.20.0 (from interpret)
  Downloading https://files.pythonhosted.org/packages/c4/dd/f686321d054bb1e145d3a7d1f6600516de535b0d597bcf7701dbb96b1262/dash_renderer-0.20.0.tar.gz (920kB)
    100% |████████████████████████████████| 921kB 5.4MB/s
Collecting SALib>=1.3.3 (from interpret)
  Downloading https://files.pythonhosted.org/packages/12/8b/14f6c0f0a12b29d5e1766e7a585269cd6ec9728a63886c161a6eddb4e7fa/SALib-1.3.7.tar.gz (854kB)
    100% |████████████████████████████████| 860kB 3.8MB/s
Collecting lime>=0.1.1.33 (from interpret)
  Downloading https://files.pythonhosted.org/packages/07/20/a4a59ed562610e19fea333da48bb5fab978a72acbe8e831930f444cd69c9/lime-0.1.1.34.tar.gz (272kB)
    100% |████████████████████████████████| 276kB 3.8MB/s
Collecting shap>=0.28.5 (from interpret)
  Downloading https://files.pythonhosted.org/packages/5d/34/4a3e429f969cc69ab4e910154360adab3f56cdde02a42f12e170625e71e1/shap-0.29.1-cp36-cp36m-win_amd64.whl (258kB)
    100% |████████████████████████████████| 266kB 2.8MB/s
Collecting ipython>=7.4.0 (from interpret)
  Downloading https://files.pythonhosted.org/packages/a9/2e/41dce4ed129057e05a555a7f9629aa2d5f81fdcd4d16568bc24b75a1d2c9/ipython-7.5.0-py3-none-any.whl (770kB)
    100% |████████████████████████████████| 778kB 6.5MB/s
Collecting dash-core-components==0.44.0 (from interpret)
  Downloading https://files.pythonhosted.org/packages/07/8b/e7193b60288f62c6c40da7d3fdbd01ccdc6752dbf25e9ef60912a5948938/dash_core_components-0.44.0.tar.gz (4.2MB)
    100% |████████████████████████████████| 4.2MB 3.8MB/s
Collecting scipy>=1.2.1 (from interpret)
  Downloading https://files.pythonhosted.org/packages/9e/fd/9a995b7fc18c6c17ce570b3cfdabffbd2718e4f1830e94777c4fd66e1179/scipy-1.3.0-cp36-cp36m-win_amd64.whl (30.5MB)
    100% |████████████████████████████████| 30.5MB 1.1MB/s
Collecting psutil>=5.6.2 (from interpret)
  Downloading https://files.pythonhosted.org/packages/86/91/f15a3aae2af13f008ed95e02292d1a2e84615ff42b7203357c1c0bbe0651/psutil-5.6.3-cp36-cp36m-win_amd64.whl (234kB)
    100% |████████████████████████████████| 235kB 3.5MB/s
Requirement already satisfied, skipping upgrade: pandas>=0.24.0 in c:\users\sndr\anaconda3\lib\site-packages (from interpret) (0.24.2)
Requirement already satisfied, skipping upgrade: ipykernel>=5.1.0 in c:\users\sndr\anaconda3\lib\site-packages (from interpret) (5.1.0)
Collecting dash-html-components==0.14.0 (from interpret)
  Downloading https://files.pythonhosted.org/packages/08/1f/943c0f90d957fdff6c5968ea80694b2959d0b0ec959be17a1478e3c97e5a/dash_html_components-0.14.0.tar.gz (537kB)
    100% |████████████████████████████████| 542kB 5.9MB/s
Collecting scikit-learn>=0.20.0 (from interpret)
  Downloading https://files.pythonhosted.org/packages/a9/bc/18663f6d75838b73353ba49fabd631347e68470ec9e623d7b3f3ccd4f426/scikit_learn-0.21.2-cp36-cp36m-win_amd64.whl (5.9MB)
    100% |████████████████████████████████| 5.9MB 3.4MB/s
Collecting plotly>=3.8.1 (from interpret)
  Downloading https://files.pythonhosted.org/packages/ff/75/3982bac5076d0ce6d23103c03840fcaec90c533409f9d82c19f54512a38a/plotly-3.10.0-py2.py3-none-any.whl (41.5MB)
    100% |████████████████████████████████| 41.5MB 390kB/s
Requirement already satisfied, skipping upgrade: joblib>=0.12.5 in c:\users\sndr\anaconda3\lib\site-packages (from interpret) (0.13.1)
Collecting dash-cytoscape==0.1.1 (from interpret)
  Downloading https://files.pythonhosted.org/packages/aa/93/d9db22331dcad4a055631372816bf4544a1a1a852fb2fa3a2905c6682198/dash_cytoscape-0.1.1.tar.gz (3.4MB)
    100% |████████████████████████████████| 3.4MB 3.2MB/s
Collecting pytest-runner>=4.4 (from interpret)
  Downloading https://files.pythonhosted.org/packages/f8/31/f291d04843523406f242e63b5b90f7b204a756169b4250ff213e10326deb/pytest_runner-5.1-py2.py3-none-any.whl
Requirement already satisfied, skipping upgrade: numpy>=1.15.1 in c:\users\sndr\anaconda3\lib\site-packages (from interpret) (1.15.4)
Requirement already satisfied, skipping upgrade: six>=1.10.0 in c:\users\sndr\anaconda3\lib\site-packages (from pytest>=4.3.0->interpret) (1.11.0)
Requirement already satisfied, skipping upgrade: py>=1.5.0 in c:\users\sndr\anaconda3\lib\site-packages (from pytest>=4.3.0->interpret) (1.5.2)
Requirement already satisfied, skipping upgrade: colorama; sys_platform == "win32" in c:\users\sndr\anaconda3\lib\site-packages (from pytest>=4.3.0->interpret) (0.3.9)
Collecting pluggy<1.0,>=0.12 (from pytest>=4.3.0->interpret)
  Downloading https://files.pythonhosted.org/packages/06/ee/de89e0582276e3551df3110088bf20844de2b0e7df2748406876cc78e021/pluggy-0.12.0-py2.py3-none-any.whl
Collecting importlib-metadata>=0.12 (from pytest>=4.3.0->interpret)
  Downloading https://files.pythonhosted.org/packages/bd/23/dce4879ec58acf3959580bfe769926ed8198727250c5e395e6785c764a02/importlib_metadata-0.18-py2.py3-none-any.whl
Requirement already satisfied, skipping upgrade: attrs>=17.4.0 in c:\users\sndr\anaconda3\lib\site-packages (from pytest>=4.3.0->interpret) (17.4.0)
Requirement already satisfied, skipping upgrade: packaging in c:\users\sndr\anaconda3\lib\site-packages (from pytest>=4.3.0->interpret) (16.8)
Requirement already satisfied, skipping upgrade: wcwidth in c:\users\sndr\anaconda3\lib\site-packages (from pytest>=4.3.0->interpret) (0.1.7)
Collecting atomicwrites>=1.0 (from pytest>=4.3.0->interpret)
  Downloading https://files.pythonhosted.org/packages/52/90/6155aa926f43f2b2a22b01be7241be3bfd1ceaf7d0b3267213e8127d41f4/atomicwrites-1.3.0-py2.py3-none-any.whl
Requirement already satisfied, skipping upgrade: more-itertools>=4.0.0; python_version > "2.7" in c:\users\sndr\anaconda3\lib\site-packages (from pytest>=4.3.0->interpret) (6.0.0)
Collecting greenlet>=0.4.14; platform_python_implementation == "CPython" (from gevent>=1.4.0->interpret)
  Downloading https://files.pythonhosted.org/packages/a9/a3/2a7a15c2dc23f764eaed46d41e081659aadf45570b4170156dde1c76d4f7/greenlet-0.4.15-cp36-cp36m-win_amd64.whl
Collecting cffi>=1.11.5; sys_platform == "win32" and platform_python_implementation == "CPython" (from gevent>=1.4.0->interpret)
  Downloading https://files.pythonhosted.org/packages/f1/b5/ca3583cbf7975f53b030be773caeabd4e19bac467714e525eaff447a8ac8/cffi-1.12.3-cp36-cp36m-win_amd64.whl (171kB)
    100% |████████████████████████████████| 174kB 2.1MB/s
Requirement already satisfied, skipping upgrade: Flask>=0.12 in c:\users\sndr\anaconda3\lib\site-packages (from dash==0.39.0->interpret) (0.12.2)
Collecting flask-compress (from dash==0.39.0->interpret)
  Downloading https://files.pythonhosted.org/packages/0e/2a/378bd072928f6d92fd8c417d66b00c757dc361c0405a46a0134de6fd323d/Flask-Compress-1.4.0.tar.gz
Collecting dash-table==3.6.0 (from dash==0.39.0->interpret)
  Downloading https://files.pythonhosted.org/packages/a3/3a/eae584bb7eccdf93d2931c4ebf43e55937cf22d51ad63551241fc83d68fc/dash_table-3.6.0.tar.gz (468kB)
    100% |████████████████████████████████| 471kB 5.1MB/s
Requirement already satisfied, skipping upgrade: mistune>=0.8.1 in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (0.8.3)
Requirement already satisfied, skipping upgrade: jinja2 in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (2.10)
Requirement already satisfied, skipping upgrade: pygments in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (2.2.0)
Requirement already satisfied, skipping upgrade: traitlets>=4.2 in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (4.3.2)
Requirement already satisfied, skipping upgrade: jupyter_core in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (4.4.0)
Requirement already satisfied, skipping upgrade: nbformat>=4.4 in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (4.4.0)
Requirement already satisfied, skipping upgrade: entrypoints>=0.2.2 in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (0.2.3)
Requirement already satisfied, skipping upgrade: bleach in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (3.1.0)
Requirement already satisfied, skipping upgrade: pandocfilters>=1.4.1 in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (1.4.2)
Requirement already satisfied, skipping upgrade: testpath in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (0.3.1)
Requirement already satisfied, skipping upgrade: defusedxml in c:\users\sndr\anaconda3\lib\site-packages (from nbconvert>=5.4.1->interpret) (0.5.0)
Requirement already satisfied, skipping upgrade: matplotlib in c:\users\sndr\anaconda3\lib\site-packages (from SALib>=1.3.3->interpret) (2.2.2)
Requirement already satisfied, skipping upgrade: scikit-image>=0.12 in c:\users\sndr\anaconda3\lib\site-packages (from lime>=0.1.1.33->interpret) (0.13.1)
Requirement already satisfied, skipping upgrade: tqdm in c:\users\sndr\anaconda3\lib\site-packages (from shap>=0.28.5->interpret) (4.26.0)
Collecting prompt-toolkit<2.1.0,>=2.0.0 (from ipython>=7.4.0->interpret)
  Downloading https://files.pythonhosted.org/packages/f7/a7/9b1dd14ef45345f186ef69d175bdd2491c40ab1dfa4b2b3e4352df719ed7/prompt_toolkit-2.0.9-py3-none-any.whl (337kB)
    100% |████████████████████████████████| 337kB 6.8MB/s
Requirement already satisfied, skipping upgrade: pickleshare in c:\users\sndr\anaconda3\lib\site-packages (from ipython>=7.4.0->interpret) (0.7.4)
Requirement already satisfied, skipping upgrade: jedi>=0.10 in c:\users\sndr\anaconda3\lib\site-packages (from ipython>=7.4.0->interpret) (0.11.1)
Requirement already satisfied, skipping upgrade: setuptools>=18.5 in c:\users\sndr\anaconda3\lib\site-packages (from ipython>=7.4.0->interpret) (40.8.0)
Requirement already satisfied, skipping upgrade: decorator in c:\users\sndr\anaconda3\lib\site-packages (from ipython>=7.4.0->interpret) (4.2.1)
Collecting backcall (from ipython>=7.4.0->interpret)
  Downloading https://files.pythonhosted.org/packages/84/71/c8ca4f5bb1e08401b916c68003acf0a0655df935d74d93bf3f3364b310e0/backcall-0.1.0.tar.gz
Requirement already satisfied, skipping upgrade: pytz>=2011k in c:\users\sndr\anaconda3\lib\site-packages (from pandas>=0.24.0->interpret) (2018.9)
Requirement already satisfied, skipping upgrade: python-dateutil>=2.5.0 in c:\users\sndr\anaconda3\lib\site-packages (from pandas>=0.24.0->interpret) (2.6.1)
Requirement already satisfied, skipping upgrade: jupyter-client in c:\users\sndr\anaconda3\lib\site-packages (from ipykernel>=5.1.0->interpret) (5.2.4)
Requirement already satisfied, skipping upgrade: tornado>=4.2 in c:\users\sndr\anaconda3\lib\site-packages (from ipykernel>=5.1.0->interpret) (4.5.3)
Requirement already satisfied, skipping upgrade: requests in c:\users\sndr\anaconda3\lib\site-packages (from plotly>=3.8.1->interpret) (2.18.4)
Collecting retrying>=1.3.3 (from plotly>=3.8.1->interpret)
  Downloading https://files.pythonhosted.org/packages/44/ef/beae4b4ef80902f22e3af073397f079c96969c69b2c7d52a57ea9ae61c9d/retrying-1.3.3.tar.gz
Collecting zipp>=0.5 (from importlib-metadata>=0.12->pytest>=4.3.0->interpret)
  Downloading https://files.pythonhosted.org/packages/a0/0f/9bf71d438d2e9d5fd0e4569ea4d1a2b6f5a524c234c6d221b494298bb4d1/zipp-0.5.1-py2.py3-none-any.whl
Requirement already satisfied, skipping upgrade: pyparsing in c:\users\sndr\anaconda3\lib\site-packages (from packaging->pytest>=4.3.0->interpret) (2.2.0)
Requirement already satisfied, skipping upgrade: pycparser in c:\users\sndr\anaconda3\lib\site-packages (from cffi>=1.11.5; sys_platform == "win32" and platform_python_implementation == "CPython"->gevent>=1.4.0->interpret) (2.18)
Requirement already satisfied, skipping upgrade: Werkzeug>=0.7 in c:\users\sndr\anaconda3\lib\site-packages (from Flask>=0.12->dash==0.39.0->interpret) (0.14.1)
Requirement already satisfied, skipping upgrade: itsdangerous>=0.21 in c:\users\sndr\anaconda3\lib\site-packages (from Flask>=0.12->dash==0.39.0->interpret) (0.24)
Requirement already satisfied, skipping upgrade: click>=2.0 in c:\users\sndr\anaconda3\lib\site-packages (from Flask>=0.12->dash==0.39.0->interpret) (6.7)
Requirement already satisfied, skipping upgrade: MarkupSafe>=0.23 in c:\users\sndr\anaconda3\lib\site-packages (from jinja2->nbconvert>=5.4.1->interpret) (1.0)
Requirement already satisfied, skipping upgrade: ipython_genutils in c:\users\sndr\anaconda3\lib\site-packages (from traitlets>=4.2->nbconvert>=5.4.1->interpret) (0.2.0)
Requirement already satisfied, skipping upgrade: jsonschema!=2.5.0,>=2.4 in c:\users\sndr\anaconda3\lib\site-packages (from nbformat>=4.4->nbconvert>=5.4.1->interpret) (2.6.0)
Requirement already satisfied, skipping upgrade: webencodings in c:\users\sndr\anaconda3\lib\site-packages (from bleach->nbconvert>=5.4.1->interpret) (0.5.1)
Requirement already satisfied, skipping upgrade: cycler>=0.10 in c:\users\sndr\anaconda3\lib\site-packages (from matplotlib->SALib>=1.3.3->interpret) (0.10.0)
Requirement already satisfied, skipping upgrade: kiwisolver>=1.0.1 in c:\users\sndr\anaconda3\lib\site-packages (from matplotlib->SALib>=1.3.3->interpret) (1.0.1)
Requirement already satisfied, skipping upgrade: networkx>=1.8 in c:\users\sndr\anaconda3\lib\site-packages (from scikit-image>=0.12->lime>=0.1.1.33->interpret) (1.11)
Requirement already satisfied, skipping upgrade: pillow>=2.1.0 in c:\users\sndr\anaconda3\lib\site-packages (from scikit-image>=0.12->lime>=0.1.1.33->interpret) (5.0.0)
Requirement already satisfied, skipping upgrade: PyWavelets>=0.4.0 in c:\users\sndr\anaconda3\lib\site-packages (from scikit-image>=0.12->lime>=0.1.1.33->interpret) (0.5.2)
Requirement already satisfied, skipping upgrade: parso==0.1.* in c:\users\sndr\anaconda3\lib\site-packages (from jedi>=0.10->ipython>=7.4.0->interpret) (0.1.1)
Requirement already satisfied, skipping upgrade: pyzmq>=13 in c:\users\sndr\anaconda3\lib\site-packages (from jupyter-client->ipykernel>=5.1.0->interpret) (16.0.3)
Requirement already satisfied, skipping upgrade: chardet<3.1.0,>=3.0.2 in c:\users\sndr\anaconda3\lib\site-packages (from requests->plotly>=3.8.1->interpret) (3.0.4)
Requirement already satisfied, skipping upgrade: idna<2.7,>=2.5 in c:\users\sndr\anaconda3\lib\site-packages (from requests->plotly>=3.8.1->interpret) (2.6)
Requirement already satisfied, skipping upgrade: urllib3<1.23,>=1.21.1 in c:\users\sndr\anaconda3\lib\site-packages (from requests->plotly>=3.8.1->interpret) (1.22)
Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in c:\users\sndr\anaconda3\lib\site-packages (from requests->plotly>=3.8.1->interpret) (2019.3.9)
Building wheels for collected packages: skope-rules, dash, dash-table-experiments, dash-renderer, SALib, lime, dash-core-components, dash-html-components, dash-cytoscape, flask-compress, dash-table, backcall, retrying
  Building wheel for skope-rules (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\3e\8d\56\464f328ff3200c785626967ee39a6b2efc455469dab615f03e
  Building wheel for dash (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\fb\75\e5\278d80ca56f3c1d623565079cacf3db4e672948d34311e0c91
  Building wheel for dash-table-experiments (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\17\46\7c\936c2a123c17673d9f46ecc74e1692a118673009bc92c192ae
  Building wheel for dash-renderer (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\6f\33\33\6473598a2a280dcfe8507b020b66da25dafe063fff31bb28f6
  Building wheel for SALib (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\73\94\42\b5160b20f13581c0e7e4d9bc0afa77828900296f8bca82bafe
  Building wheel for lime (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\2f\8e\c1\c1cddd9cf8fbae812904fa5c84ef571e782891288d309d04c8
  Building wheel for dash-core-components (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\83\ac\bb\68cefc4f1e6ec359183f3d198cadbec07193b1e3087256a5a2
  Building wheel for dash-html-components (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\72\e5\cd\a82fd0f01affb14d3f3f19a19407f32a1845825603a7f9664b
  Building wheel for dash-cytoscape (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\32\bd\34\4c0a61c252c4bcee42ab4943e66e7c2d1f7809de90d4caf070
  Building wheel for flask-compress (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\96\32\88\a1f6d9dd3c29570ab3a8acc0d556b3b20abcf3c623c868ce0a
  Building wheel for dash-table (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\b9\7e\8a\1249b5961f59668eba0471800e618c47b4219f77e2887536bd
  Building wheel for backcall (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\98\b0\dd\29e28ff615af3dda4c67cab719dd51357597eabff926976b45
  Building wheel for retrying (setup.py) ... done
  Stored in directory: C:\Users\sndr\AppData\Local\pip\Cache\wheels\d7\a9\33\acc7b709e2a35caa7d4cae442f6fe6fbf2c43f80823d46460c
Successfully built skope-rules dash dash-table-experiments dash-renderer SALib lime dash-core-components dash-html-components dash-cytoscape flask-compress dash-table backcall retrying
jupyter-console 5.2.0 has requirement prompt_toolkit<2.0.0,>=1.0.0, but you'll have prompt-toolkit 2.0.9 which is incompatible.
datashader 0.6.9 has requirement dask[complete]>=0.18.0, but you'll have dask 0.16.1 which is incompatible.
Installing collected packages: zipp, importlib-metadata, pluggy, atomicwrites, pytest, greenlet, cffi, gevent, scipy, scikit-learn, skope-rules, flask-compress, retrying, plotly, dash-renderer, dash-core-components, dash-html-components, dash-table, dash, dash-table-experiments, SALib, lime, prompt-toolkit, backcall, ipython, shap, psutil, dash-cytoscape, pytest-runner, interpret
  Found existing installation: pluggy 0.6.0
    Uninstalling pluggy-0.6.0:
      Successfully uninstalled pluggy-0.6.0
  Found existing installation: pytest 3.3.2
    Uninstalling pytest-3.3.2:
      Successfully uninstalled pytest-3.3.2
  Found existing installation: greenlet 0.4.12
Cannot uninstall 'greenlet'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

C:\Users\sndr\Downloads>

Segmentation faults since interpret 0.1.11

Hi, thank you for your great work.

I just tried updating from 0.1.10 to 0.1.11 using pip and I am now getting segmentation faults.
The issue does not seem to come from a memory limit (I have a 10 GB free-memory margin when using 0.1.10).
I was not able to diagnose it further.

Platform: x86_64, Linux, Ubuntu 16

Code extract (works fine with 0.1.10):

_ebm = ExplainableBoostingRegressor(
    n_estimators=16,
    learning_rate=0.1,
    early_stopping_run_length=10,
    data_n_episodes=200,
    n_jobs=1,  # same error with n_jobs=-2
    feature_names=_categorical_features + _numerical_features,
    feature_types=['categorical'] * len(_categorical_features) + ['continuous'] * len(_numerical_features))

_ebm.fit(X, y)  # X: 108,728 x 39 numpy array

GDB trace:

# Segmentation fault (core dumped):

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0x2) at malloc.c:2951
2951    malloc.c: No such file or directory.
(gdb) bt
#0  __GI___libc_free (mem=0x2) at malloc.c:2951
#1  0x00007ffed376128c in DataSetAttributeCombination::~DataSetAttributeCombination() ()
   from /home/ubuntu/miniconda3/envs/env/lib/python3.7/site-packages/interpret/glassbox/ebm/../../lib/lib_ebmcore_linux_x64.so
#2  0x00007ffed377508d in FreeTraining ()
   from /home/ubuntu/miniconda3/envs/env/lib/python3.7/site-packages/interpret/glassbox/ebm/../../lib/lib_ebmcore_linux_x64.so

Python logs tail (lots of INFO and ERROR messages since this release!):

[...]
INFO: Entered GetBestModel: ebmTraining=0x555e1f55ad90, indexAttributeCombination=38
INFO: Exited GetBestModel 0x555e24b7e6f0
INFO: Deallocation start
INFO: Entered FreeTraining: ebmTraining=0x555e1f55ad90
INFO: Entered ~EbmTrainingState
INFO: ~EbmTrainingState identified as regression type
INFO: Entered ~CachedTrainingThreadResources
INFO: Exited ~CachedTrainingThreadResources
INFO: Entered SamplingWithReplacement::FreeSamplingSets
INFO: Entered ~SamplingWithReplacement
INFO: Exited ~SamplingWithReplacement
INFO: Exited SamplingWithReplacement::FreeSamplingSets
INFO: Entered ~DataSetAttributeCombination

Segmentation fault (core dumped)

(Full log here: ebm_error_log.txt).
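
One low-cost way to narrow down a crash like this, offered as a minimal sketch: Python's standard-library `faulthandler` module can dump the Python-level stack when the process receives SIGSEGV, which shows which Python call was active when the native library crashed. The helper name `enable_crash_log` and the log file name are hypothetical and not part of interpret. This does not fix the fault; it only records where it happened.

```python
import faulthandler


def enable_crash_log(path):
    """Write Python tracebacks to `path` if the process gets a fatal signal.

    `enable_crash_log` is a hypothetical helper name, not part of interpret.
    The returned file must stay open for as long as faulthandler is enabled.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Usage (assumed file name): call this before fitting the model, e.g.
#   crash_log = enable_crash_log("ebm_crash_traceback.txt")
#   _ebm.fit(X, y)
```

The resulting file complements the GDB trace: GDB shows the native frames, while this shows the Python frames that led into them.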

pip install Error (pyscaffold)

Python 3.6
Hello, when I run

pip install  -U interpret

I get the following error:
Could not find suitable distribution for Requirement.parse('pyscaffold<3.1a0,>=3.0a0')

So I downgraded the pyscaffold package to the required version:

pip install pyscaffold==3.0.0

Then I tried to install again and got:

pyscaffold.exceptions.OldSetuptools: Your setuptools version is too old (<30.3.0). Use `pip install -U setuptools` to upgrade.

The problem is that my setuptools version is already 41.2.0.

Thanks for your help.
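
The two numbers in this report can be compared directly, as a rough sketch. 41.2.0 is well above the 30.3.0 minimum named in the error, which suggests (as an assumption, not a confirmed diagnosis) that the build step is seeing a different, older setuptools than the one reported here. The helper `version_tuple` is hypothetical and only handles plain dotted numeric versions.

```python
def version_tuple(version):
    """Turn a plain dotted version such as '41.2.0' into (41, 2, 0).

    Hypothetical helper for illustration; it does not handle pre-release
    tags such as '3.1a0'.
    """
    return tuple(int(part) for part in version.split("."))


installed = "41.2.0"  # version reported in this issue
required = "30.3.0"   # minimum named in the pyscaffold error

# Tuples compare element by element, so this is a numeric comparison.
print(version_tuple(installed) >= version_tuple(required))  # prints True
```

To see which setuptools a given interpreter actually imports, `python -c "import setuptools; print(setuptools.__version__, setuptools.__file__)"` can be run with the same interpreter that runs pip.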

Build Error

I'm trying to build a local version of the demo using VS 2017 and get an error during the build:

Severity: Error
Code: MSB8020
Description: The build tools for v142 (Platform Toolset = 'v142') cannot be found. To build using the v142 build tools, please install v142 build tools. Alternatively, you may upgrade to the current Visual Studio tools by selecting the Project menu or right-click the solution, and then selecting "Retarget solution".
Project: ebmcore
File: C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\VC\VCTargets\Microsoft.Cpp.Platform.targets
Line: 57

I tried "Retarget solution", with no success.

Issue in barplot

Hi,
In the categorical feature plots (bar plots) from ebm.explain_global(), the x-axis values seem wrong to me. For categorical features an extra nan appears in the 'names' list of the 'specific' data. This causes a length mismatch between 'names' and 'scores', so the last categorical value is dropped from the plot.
There are no null values in my data.
For example:
{'density': {'names': [nan, 'abc', 'efg'], 'scores': [66486, 118521]}, 'lower_bounds': [-2.246691779337516, 1.1945678479167394], 'names': [nan, 'abc', 'efg'], 'scores': [-2.136401943158369, 1.1984443228864698], 'type': 'univariate', 'upper_bounds': [-2.026112106979222, 1.2023207978562003]}
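
The mismatch can be checked by hand with a short sketch. It uses the values from the dictionary above and assumes that, for a categorical feature, 'names' and 'scores' are meant to have the same length. `drop_nan_names` is a hypothetical helper for inspection only, not a fix inside interpret, and whether the remaining names line up with the scores in this order is also an assumption.

```python
import math

# Values copied from the example dictionary in this report.
names = [float("nan"), "abc", "efg"]
scores = [-2.136401943158369, 1.1984443228864698]


def is_nan(value):
    """True only for a float NaN; strings and other values are not NaN."""
    return isinstance(value, float) and math.isnan(value)


def drop_nan_names(names):
    """Return the category names with NaN entries removed."""
    return [name for name in names if not is_nan(name)]


print(len(names), len(scores))  # prints: 3 2
cleaned = drop_nan_names(names)
print(list(zip(cleaned, scores)))
```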

How to capture the URL being generated each time?

Dear Interpret Team,

How can I capture the URL that is generated for the dashboard? I need to capture this URL and integrate it with a third-party UI. If you can suggest a way to overcome this technical hurdle, it would be of great help.

Rakesh
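
Another report on this page calls `set_show_addr(('127.0.0.1', 7001))`, imports a `get_show_addr` function alongside it, and then opens `http://127.0.0.1:7001/`. Assuming the address is available as a `(host, port)` tuple of that shape, the base URL can be assembled as in the sketch below. `dashboard_base_url` is a hypothetical helper; the path of an individual explanation under that base address is not shown anywhere in these reports, so it is left out.

```python
def dashboard_base_url(addr, scheme="http"):
    """Build a base URL from a (host, port) tuple.

    Hypothetical helper; it assumes the dashboard address has the same
    shape as the tuple passed to set_show_addr elsewhere on this page.
    """
    host, port = addr
    return f"{scheme}://{host}:{port}/"


print(dashboard_base_url(("127.0.0.1", 7001)))  # prints http://127.0.0.1:7001/
```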

TerminatedWorkerError

I got this error on a machine with 72 cores and 137 GB of RAM. The same data runs fine with XGBoost.

I tried reducing n_jobs and n_estimators, but got the same error.

---------------------------------------------------------------------------
TerminatedWorkerError                     Traceback (most recent call last)
<ipython-input-46-777da5fd101f> in <module>
      1 import time
      2 before = time.time()
----> 3 ebm.fit(X_train.values, y_train)
      4 total = (time.time()-before)/60
      5 print(f"spent %.2f minutes" % total)

~/.local/lib/python3.6/site-packages/interpret/glassbox/ebm/ebm.py in fit(self, X, y)
    783         )
    784 
--> 785         estimators = provider.parallel(train_model, train_model_args_iter)
    786 
    787         if isinstance(self.interactions, int) and self.interactions > 0:

~/.local/lib/python3.6/site-packages/interpret/utils/distributed.py in parallel(self, compute_fn, compute_args_iter)
     16     def parallel(self, compute_fn, compute_args_iter):
     17         results = Parallel(n_jobs=self.n_jobs)(
---> 18             delayed(compute_fn)(*args) for args in compute_args_iter
     19         )
     20         # NOTE: Force gc, as Python does not free native memory easy.

~/.local/lib/python3.6/site-packages/joblib/parallel.py in __call__(self, iterable)
    932 
    933             with self._backend.retrieval_context():
--> 934                 self.retrieve()
    935             # Make sure that we get a last message telling us we are done
    936             elapsed_time = time.time() - self._start_time

~/.local/lib/python3.6/site-packages/joblib/parallel.py in retrieve(self)
    831             try:
    832                 if getattr(self._backend, 'supports_timeout', False):
--> 833                     self._output.extend(job.get(timeout=self.timeout))
    834                 else:
    835                     self._output.extend(job.get())

~/.local/lib/python3.6/site-packages/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
    519         AsyncResults.get from multiprocessing."""
    520         try:
--> 521             return future.result(timeout=timeout)
    522         except LokyTimeoutError:
    523             raise TimeoutError()

/usr/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
    430                 raise CancelledError()
    431             elif self._state == FINISHED:
--> 432                 return self.__get_result()
    433             else:
    434                 raise TimeoutError()

/usr/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker. The exit codes of the workers are {SIGSEGV(-11)}
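
The exit code in the last line can be decoded, as a small sketch: by POSIX convention a negative exit code -N means the process was killed by signal N, so -11 is SIGSEGV. That points to a crash in native code rather than the operating system killing the worker for memory use, which typically shows up as SIGKILL (-9). `describe_exit_code` is a hypothetical helper name.

```python
import signal


def describe_exit_code(exit_code):
    """Name the signal behind a negative worker exit code, e.g. -11.

    Hypothetical helper; follows the convention that -N means
    'terminated by signal N'.
    """
    if exit_code >= 0:
        return f"exited normally with status {exit_code}"
    return f"killed by {signal.Signals(-exit_code).name}"


print(describe_exit_code(-11))
```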

Some details:

~$ python3 --version
Python 3.6.7
~$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.2 LTS
Release:	18.04
Codename:	bionic
~$ pip3 freeze
asn1crypto==0.24.0
atomicwrites==1.3.0
attrs==19.1.0
Automat==0.6.0
backcall==0.1.0
bleach==3.1.0
blinker==1.4
certifi==2019.3.9
chardet==3.0.4
Click==7.0
cloud-init==18.4
cloudpickle==0.8.1
colorama==0.3.7
command-not-found==0.3
configobj==5.0.6
constantly==15.1.0
cryptography==2.1.4
cycler==0.10.0
dash==0.39.0
dash-core-components==0.44.0
dash-cytoscape==0.1.1
dash-html-components==0.14.0
dash-renderer==0.20.0
dash-table==3.6.0
dash-table-experiments==0.6.0
dask==1.2.0
decorator==4.4.0
defusedxml==0.6.0
distro-info==0.18
entrypoints==0.3
Flask==1.0.2
Flask-Compress==1.4.0
gevent==1.4.0
greenlet==0.4.15
hibagent==1.0.1
httplib2==0.9.2
hyperlink==17.3.1
hypothesis==4.23.5
idna==2.8
imageio==2.5.0
incremental==16.10.1
interpret==0.1.1
ipykernel==5.1.1
ipython==7.5.0
ipython-genutils==0.2.0
ipywidgets==7.4.2
itsdangerous==1.1.0
jedi==0.13.3
Jinja2==2.10.1
joblib==0.13.2
jsonpatch==1.16
jsonpointer==1.10
jsonschema==3.0.1
jupyter==1.0.0
jupyter-client==5.2.4
jupyter-console==6.0.0
jupyter-core==4.4.0
keyring==10.6.0
keyrings.alt==3.0
kiwisolver==1.1.0
language-selector==0.1
lime==0.1.1.34
locket==0.2.0
MarkupSafe==1.1.1
matplotlib==3.0.3
mistune==0.8.4
more-itertools==7.0.0
nbconvert==5.5.0
nbformat==4.4.0
netifaces==0.10.4
networkx==2.3
notebook==5.7.8
numpy==1.16.3
oauthlib==2.0.6
PAM==0.4.2
pandas==0.24.2
pandocfilters==1.4.2
parso==0.4.0
partd==0.3.10
pexpect==4.7.0
pickleshare==0.7.5
Pillow==6.0.0
plotly==3.9.0
pluggy==0.11.0
prometheus-client==0.6.0
prompt-toolkit==2.0.9
ptyprocess==0.6.0
py==1.8.0
pyasn1==0.4.2
pyasn1-modules==0.2.1
pycrypto==2.6.1
Pygments==2.4.0
pygobject==3.26.1
PyJWT==1.5.3
pyOpenSSL==17.5.0
pyparsing==2.4.0
pyrsistent==0.15.2
pyserial==3.4
pytest==4.5.0
pytest-runner==4.4
python-apt==1.6.3+ubuntu1
python-dateutil==2.8.0
python-debian==0.1.32
pytz==2019.1
PyWavelets==1.0.3
pyxdg==0.25
PyYAML==3.12
pyzmq==18.0.1
qtconsole==4.4.3
requests==2.22.0
requests-unixsocket==0.1.5
retrying==1.3.3
SALib==1.3.4
scikit-image==0.15.0
scikit-learn==0.21.1
scipy==1.2.1
SecretStorage==2.3.1
Send2Trash==1.5.0
service-identity==16.0.0
shap==0.28.5
six==1.12.0
sklearn==0.0
skope-rules==1.0.0
ssh-import-id==5.7
systemd-python==234
terminado==0.8.2
testpath==0.4.2
toolz==0.9.0
tornado==6.0.2
tqdm==4.32.1
traitlets==4.3.2
Twisted==17.9.0
ufw==0.35
unattended-upgrades==0.1
urllib3==1.25.2
wcwidth==0.1.7
webencodings==0.5.1
Werkzeug==0.15.4
widgetsnbextension==3.4.2
xgboost==0.82
zope.interface==4.3.2

Custom validation set

Hi,
Would it be possible to add an option to specify the validation data in the EBM fit() function, instead of letting sklearn split it off automatically?

Dependency on Matplotlib 2.1.0 in Linux

Hi,

I'm trying to install on a Linux Docker image. There seems to be a dependency on an older version of matplotlib (2.1.0), which fails to build, while the latest version, matplotlib 3.0.3, installs fine. Could you please upgrade the dependency, or else suggest a workaround?

ERROR: Complete output from command python setup.py egg_info:
ERROR: IMPORTANT WARNING:
pkg-config is not installed.
matplotlib may not be able to find some of its dependencies
============================================================================
Edit setup.cfg to change the build options

BUILDING MATPLOTLIB
            matplotlib: yes [2.1.0]
                python: yes [3.7.3 | packaged by conda-forge | (default, Mar
                        27 2019, 23:01:00)  [GCC 7.3.0]]
              platform: yes [linux]

REQUIRED DEPENDENCIES AND EXTENSIONS
                 numpy: yes [version 1.16.3]
                   six: yes [using six version 1.12.0]
              dateutil: yes [using dateutil version 2.8.0]
backports.functools_lru_cache: yes [Not required]
          subprocess32: yes [Not required]
                  pytz: yes [using pytz version 2019.1]
                cycler: yes [using cycler version 0.10.0]
               tornado: yes [using tornado version 6.0.2]
             pyparsing: yes [using pyparsing version 2.4.0]
                libagg: yes [pkg-config information for 'libagg' could not
                        be found. Using local copy.]
              freetype: no  [The C/C++ header for freetype2 (ft2build.h)
                        could not be found.  You may need to install the
                        development package.]
                   png: no  [pkg-config information for 'libpng' could not
                        be found.]
                 qhull: yes [pkg-config information for 'libqhull' could not
                        be found. Using local copy.]

OPTIONAL SUBPACKAGES
           sample_data: yes [installing]
              toolkits: yes [installing]
                 tests: no  [skipping due to configuration]
        toolkits_tests: no  [skipping due to configuration]

OPTIONAL BACKEND EXTENSIONS
                macosx: no  [Mac OS-X only]
                qt5agg: no  [PySide2 not found; PyQt5 not found]
                qt4agg: no  [PySide not found; PyQt4 not found]
               gtk3agg: no  [Requires pygobject to be installed.]
             gtk3cairo: no  [Requires cairocffi or pycairo to be installed.]
                gtkagg: no  [Requires pygtk]
                 tkagg: yes [installing; run-time loading from Python Tcl /
                        Tk]
                 wxagg: no  [requires wxPython]
                   gtk: no  [Requires pygtk]
                   agg: yes [installing]
                 cairo: no  [cairocffi or pycairo not found]
             windowing: no  [Microsoft Windows only]

OPTIONAL LATEX DEPENDENCIES
                dvipng: no
           ghostscript: no
                 latex: yes [version 3.14159265]
               pdftops: no

OPTIONAL PACKAGE DATA
                  dlls: no  [skipping due to configuration]

============================================================================
                        * The following required packages can not be built:
                        * freetype, png * Try installing freetype with `apt-
                        * get install libfreetype6-dev` * Try installing png
                        * with `apt-get install libpng12-dev`
----------------------------------------

ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-tvjuo8p6/matplotlib/

"GLIBC_2.14" error

I get a "GLIBC_2.14" error when I run the code below. Is there a way to solve this without root access, inside my conda env?

My OS:
Distributor ID: CentOS
Description: CentOS release 6.6 (Final)
Release: 6.6

from interpret.glassbox import ExplainableBoostingClassifier, LogisticRegression, ClassificationTree, DecisionListClassifier

ebm = ExplainableBoostingClassifier(random_state=seed)
ebm.fit(X_train, y_train)

/lib64/libc.so.6: version `GLIBC_2.14' not found


OSError Traceback (most recent call last)
in
----> 1 from interpret.glassbox import ExplainableBoostingClassifier, LogisticRegression, ClassificationTree, DecisionListClassifier
2
3 ebm = ExplainableBoostingClassifier(random_state=seed)
4 ebm.fit(X_train, y_train) #Works on dataframes and numpy arrays

~/anaconda/envs/fastai/lib/python3.7/site-packages/interpret/glassbox/__init__.py in
5 from .linear import LogisticRegression, LinearRegression # noqa: F401
6 from .skoperules import DecisionListClassifier # noqa: F401
----> 7 from .ebm.ebm import ExplainableBoostingClassifier # noqa: F401
8 from .ebm.ebm import ExplainableBoostingRegressor # noqa: F401

~/anaconda/envs/fastai/lib/python3.7/site-packages/interpret/glassbox/ebm/ebm.py in
5 from ...utils import perf_dict
6 from .utils import EBMUtils
----> 7 from .internal import NativeEBM
8 from ...utils import unify_data, autogen_schema
9 from ...api.base import ExplainerMixin

~/anaconda/envs/fastai/lib/python3.7/site-packages/interpret/glassbox/ebm/internal.py in
63
64
---> 65 Lib = load_library(debug=False)
66
67 # C-level interface

~/anaconda/envs/fastai/lib/python3.7/site-packages/interpret/glassbox/ebm/internal.py in load_library(debug)
59 is_debug = debug
60
---> 61 lib = ct.cdll.LoadLibrary(get_ebm_lib_path(debug=is_debug))
62 return lib
63

~/anaconda/envs/fastai/lib/python3.7/ctypes/__init__.py in LoadLibrary(self, name)
432
433 def LoadLibrary(self, name):
--> 434 return self._dlltype(name)
435
436 cdll = LibraryLoader(CDLL)

~/anaconda/envs/fastai/lib/python3.7/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
354
355 if handle is None:
--> 356 self._handle = _dlopen(self._name, mode)
357 else:
358 self._handle = handle

OSError: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/narjunan/anaconda/envs/fastai/lib/python3.7/site-packages/interpret/glassbox/ebm/../../lib/ebmcore_linux_x64.so)
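
The error says the prebuilt library needs glibc 2.14 or newer, while the system's /lib64/libc.so.6 is older. As a minimal sketch, the glibc version that the running interpreter reports can be checked with the standard-library `platform.libc_ver()`. The helper names `version_tuple` and `glibc_is_at_least` are hypothetical. This only confirms the mismatch; it does not work around it.

```python
import platform


def version_tuple(version):
    """Turn a dotted version such as '2.14' into (2, 14)."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def glibc_is_at_least(found, required="2.14"):
    """Compare dotted glibc versions; returns None if `found` is empty."""
    if not found:
        return None  # not a glibc system, or version could not be detected
    return version_tuple(found) >= version_tuple(required)


# libc_ver() returns (library name, version), or ('', '') if it cannot tell.
lib_name, lib_version = platform.libc_ver()
print(lib_name, lib_version, glibc_is_at_least(lib_version))
```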

Run by jupyter

This is not an issue but a note, hopefully helpful for newcomers :)

I ran this on Win10 + Anaconda Jupyter + Python 3.7.

Some trouble:
I could not find and install interpret through Anaconda, so I installed it with pip and added the library location to the path in code:


import sys
# Raw string so the backslashes in the Windows path are not treated as escapes.
sys.path.append(r"D:\software\python\python37\Lib\site-packages")  # my python3.7 location

import numpy as np
x_train = np.random.random((100, 20))          # 100 samples, 20 features
y_train = np.random.randint(2, size=(100, 1))  # random binary labels

from interpret.glassbox import ExplainableBoostingClassifier
ebm = ExplainableBoostingClassifier()
ebm.fit(x_train, y_train)
ebm_global = ebm.explain_global()

from interpret import show
from interpret import set_show_addr, get_show_addr
set_show_addr(('127.0.0.1', 7001))  # Will run on 127.0.0.1 at port 7001
show(ebm_global)

I got the result in Jupyter. When I tried visiting "http://127.0.0.1:7001/" in a browser, the page showed "Internal Server Error" and Jupyter printed the following (not a problem, I'm just wondering what happened):

Traceback (most recent call last):
  File "D:\hw\software\Anaconda3\lib\site-packages\gevent\pywsgi.py", line 976, in handle_one_response
    self.run_application()
  File "D:\hw\software\Anaconda3\lib\site-packages\gevent\pywsgi.py", line 923, in run_application
    self.result = self.application(self.environ, self.start_response)
  File "D:\software\python\python37\Lib\site-packages\interpret\visual\dashboard.py", line 187, in __call__
    app = self.pool[ctx_id]
KeyError: 'favicon.ico'
2019-05-20T10:43:12Z {'REMOTE_ADDR': '127.0.0.1', 'REMOTE_PORT': '59185', 'HTTP_HOST': '127.0.0.1:7001', (hidden keys: 23)} failed with KeyError
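
A plausible reading of this traceback, stated as an assumption: browsers automatically request /favicon.ico, the dashboard appears to treat the first path segment as a context id, and the lookup `self.pool[ctx_id]` raises KeyError for it, which the server reports as "Internal Server Error". The sketch below shows how a WSGI dispatcher of that general shape could answer 404 for unknown ids instead. `ContextDispatcher` is a hypothetical class for illustration, not interpret's dashboard code.

```python
class ContextDispatcher:
    """Toy WSGI app that routes /<ctx_id>/... to a per-context sub-app.

    Hypothetical illustration only; this is not interpret's dashboard code.
    """

    def __init__(self):
        self.pool = {}  # ctx_id -> WSGI application

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "/")
        # First path segment, e.g. '/favicon.ico' -> 'favicon.ico'.
        ctx_id = path.lstrip("/").split("/", 1)[0]
        app = self.pool.get(ctx_id)  # .get avoids KeyError for unknown ids
        if app is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Unknown context"]
        return app(environ, start_response)
```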
