pair-code / what-if-tool
Source code/webpage/demos for the What-If Tool
Home Page: https://pair-code.github.io/what-if-tool
License: Apache License 2.0
Hi,
I was trying to run the toy CelebA model using TF Serving and I couldn't connect to the model.
Here is the command:
docker run -p 8500:8500 --mount type=bind,source='/Users/user/Downloads/wit_testing',target=/models/wit_testing -e MODEL_NAME=wit_testing -it tensorflow/serving
I ran TensorBoard on localhost:6006 and then I configured the WIT tool as follows:
I serialized the model to a ProtoBuf as instructed and serialized the data to a TFRecords file, but:
The photos on the right do not display the way they do in the web demo; there are just dots representing the datapoints.
Whenever I try to run an inference call to the model, I get a bad request error (500). The error I am getting:
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Expects arg[0] to be float but string is provided"
debug_error_string = "{"created":"@1560888632.009402000","description":"Error received from peer ipv6:[::1]:8500","file":"src/core/lib/surface/call.cc","file_line":1041,"grpc_message":"Expects arg[0] to be float but string is provided","grpc_status":3}"
Is there some convention in naming and paths here that I am missing? Or am I doing something else completely wrong?
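For anyone hitting the same thing: the "Expects arg[0] to be float but string is provided" message suggests the served signature expects float tensors, while WIT sends serialized tf.Example strings. A minimal sketch of exporting with a parsing input receiver so the signature accepts serialized Examples; classifier and feature_spec stand in for the demo's trained estimator and its parsing spec, and this is an assumption rather than a confirmed fix:
import tensorflow as tf

# `classifier` is assumed to be the trained estimator from the demo and
# `feature_spec` its tf.Example parsing spec; both are placeholders.
serving_input_fn = (
    tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec))
classifier.export_saved_model('/Users/user/Downloads/wit_testing', serving_input_fn)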
I worked in notebook mode and was successfully able to project attribution scores (using the Shapley algorithm) onto the WIT dashboard. Due to a bigger data size, I then tried the visualization in TensorBoard mode. The instructions on the documentation page mention only two requirements: 1. an ML model in TF Serving format and 2. a TFRecord file of the example dataset. There isn't any mention of generating or uploading attribution values (generated by Google Integrated Gradients or SHAP) in TensorBoard mode. Please suggest whether it's possible to add attribution values in TensorBoard mode, or whether I am missing something.
Age Demo from WIT, from here: https://colab.research.google.com/github/pair-code/what-if-tool/blob/master/WIT_Age_Regression.ipynb
Current behavior:
On executing the command WitWidget(config_builder, height=tool_height_in_px),
I encountered the below error above the WIT extension dashboard:
`Cannot set tensorflow.serving.Regression.value to array([21.733267], dtype=float32): array([21.733267], dtype=float32) has type <class 'numpy.ndarray'>, but expected one of: numbers.Real`
Problem: Performance tab has no output in WIT dashboard
Browser: Chrome
Expected Output:
Performance tab: should work with all features of graphs and plots
Please see what the issue is. I have tried to debug the WitWidget function but have been unable to overcome this error.
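One plausible workaround, assuming the prediction comes back as a one-element numpy array: convert each prediction to a plain Python float before it reaches the Regression proto. The helper name here is hypothetical:
import numpy as np

def to_scalar_predictions(preds):
  # tensorflow.serving.Regression.value must be a Python number
  # (numbers.Real), not a numpy array.
  return [float(np.asarray(p).item()) for p in preds]

print(to_scalar_predictions([np.array([21.733267], dtype=np.float32)]))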
Thanks.
It would be nice to have a mode where one can click on a feature in the datapoint editor, edit it, and have that edit take effect for ALL datapoints, not just a single one.
Need to think about proper UI for that experience.
Hi, I encounter an error while executing jupyter nbextension install --py --symlink --sys-prefix witwidget:
Traceback (most recent call last):
  File "/usr/bin/jupyter-nbextension", line 11, in <module>
    load_entry_point('notebook==5.2.2', 'console_scripts', 'jupyter-nbextension')()
  File "/usr/lib/python3/dist-packages/jupyter_core/application.py", line 266, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/usr/lib/python3/dist-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/usr/lib/python3/dist-packages/notebook/nbextensions.py", line 988, in start
    super(NBExtensionApp, self).start()
  File "/usr/lib/python3/dist-packages/jupyter_core/application.py", line 255, in start
    self.subapp.start()
  File "/usr/lib/python3/dist-packages/notebook/nbextensions.py", line 716, in start
    self.install_extensions()
  File "/usr/lib/python3/dist-packages/notebook/nbextensions.py", line 695, in install_extensions
    **kwargs
  File "/usr/lib/python3/dist-packages/notebook/nbextensions.py", line 211, in install_nbextension_python
    m, nbexts = _get_nbextension_metadata(module)
  File "/usr/lib/python3/dist-packages/notebook/nbextensions.py", line 1122, in _get_nbextension_metadata
    m = import_item(module)
  File "/usr/lib/python3/dist-packages/traitlets/utils/importstring.py", line 42, in import_item
    return __import__(parts[0])
  File "/home/linin/.local/lib/python3.6/site-packages/witwidget/__init__.py", line 15, in <module>
    from witwidget.notebook.visualization import *
  File "/home/linin/.local/lib/python3.6/site-packages/witwidget/notebook/visualization.py", line 27, in <module>
    from witwidget.notebook.jupyter.wit import *  # pylint: disable=wildcard-import,g-import-not-at-top
  File "/home/linin/.local/lib/python3.6/site-packages/witwidget/notebook/jupyter/wit.py", line 25, in <module>
    from witwidget.notebook import base
  File "/home/linin/.local/lib/python3.6/site-packages/witwidget/notebook/base.py", line 26, in <module>
    from six import ensure_str
ImportError: cannot import name 'ensure_str'
But I can import ensure_str in both my Python 2 and Python 3, so where could it go wrong?
Thanks a lot.
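A likely cause, assuming the kernel picks up an old copy of six: ensure_str was added in six 1.12, so older installs raise exactly this ImportError. Upgrading six in the environment that runs Jupyter (e.g. pip install --upgrade six) may fix it. A quick check of what the kernel actually sees:
import six

# ensure_str exists only in six >= 1.12; print the version and file path to
# confirm which copy the Jupyter process is importing.
print(six.__version__, six.__file__)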
In the partial dependence plot view, there is a sort by variation button to sort features by how much their partial dependence plots vary (total Y axis distance traveled across the chart). Also, if comparing two models, each feature is ranked by its largest Y axis distance traveled across the two models' PD plots for that feature.
This information should be displayed in an information popup next to the sort button, like we have with other non-obvious buttons/controls.
I am working on several different tools that use the What-If tool for interactive visuals. It seems like I keep running into the same error where I see 'Loading Widget...' and no visual.
Currently, I am working in the JupyterLab Environment of Google Cloud's AI Platform.
I did the following installations:
pip install ipywidgets
jupyter nbextension enable --py widgetsnbextension
conda install -c conda-forge nodejs
jupyter nbextension install --py --user witwidget
jupyter nbextension enable witwidget --user --py
jupyter labextension install @jupyter-widgets/jupyterlab-manager
And jupyter labextension list shows the following:
JupyterLab v1.2.16
Known labextensions:
app dir: /opt/conda/share/jupyter/lab
@jupyter-widgets/jupyterlab-manager v1.1.0 enabled OK
@jupyterlab/celltags v0.2.0 enabled OK
@jupyterlab/git v0.10.1 enabled OK
js v0.1.0 enabled OK
jupyter-matplotlib v0.7.2 enabled OK
jupyterlab-plotly v4.8.1 enabled OK
nbdime-jupyterlab v1.0.0 enabled OK
plotlywidget v4.8.1 enabled OK
wit-widget v1.6.0 enabled OK
It would be helpful if you had any insight on why this problem might be occurring!
Hi James/Team,
Could I have a notebook file (.ipynb) for multiclass classification, as I am working on the same case?
Thanks in advance !!
The witwidget does not seem to work in JupyterLab 2.x versions. Whenever I use witwidget with JupyterLab 2.x, I get an error even before running WitWidget.
But when I use JupyterLab 1.x, everything works fine. I guess the witwidget has not been ported to JupyterLab 2.x. The JupyterLab docs contain an extension migration guide, which may help in updating the widget for JupyterLab 2.x.
Could you start tagging the releases here on Github?
Add a control for users to send feedback about the tool to the WIT team.
Investigate how to best do this. Could just be a bug/feedback button that links to a new github issue as the simplest approach.
hi, thanks for sharing all your awesome work! 👍
I was exploring the UCI dataset on the web demo while reading the paper, and it looks to me like there might be a bug in how the UI state updates to color which elements of the counterfactual are different. Alternatively, I might just be misunderstanding the UX :)
I'm expecting that when I look at a data point, the attributes of the counterfactual that are different will be shown in green, like the "occupation" and "relationship" values here:
To reproduce:
1. Enable showing counterfactuals.
2. Notice that "occupation" and "relationship" are highlighted in green, which is in line with what I'd expect since they're different:
3. Click on the highest "<50k" data point, colored blue and highlighted here:
4. Check out the counterfactual.
It looks like some attributes that are the same are highlighted in green, which is not what I would expect. In this screenshot, I'd expect "occupation" to be green but "relationship" to be standard black text.
Note that the highlighting behavior is different if you clear the selection and then click directly on the data point from step 4. That shows these attributes highlighted, as I'd expected:
So this might be a UX misunderstanding, and maybe I'm not understanding how the counterfactual computation is supposed to interact with the selection. But since the behavior differs depending on the order of these actions, I suspect it's a UI bug in updating in response to state changes. I poked around a bit, and it seemed like this might be where the syncing happens between selection interactions, changing the values, and rendering the color.
Thanks! Let me know if there's anything else I can provide that's helpful for debugging.
As per package.json, this project is using bazel 0.23.2.
Line 35 in a4ada74
However, the WORKSPACE file requires bazel 0.26.1.
Line 19 in a4ada74
I tried yarn add @bazel/[email protected], and the build can start but always fails at some bazel rules package, like error loading package 'node_modules/@schematics/update/node_modules/rxjs/src/operators': Unable to find package for @build_bazel_rules_typescript//:defs.bzl: The repository '@build_bazel_rules_typescript' could not be resolved.
Or ERROR: error loading package '': in .../org_tensorflow_tensorboard/third_party/workspace.bzl: in .../npm_bazel_typescript/index.bzl: in .../npm_bazel_typescript/internal/ts_repositories.bzl: Unable to load file '@build_bazel_rules_nodejs//:index.bzl': file doesn't exist
(when I tried to upgrade the @bazel/typescript package to latest).
What are the correct versions of bazel, bazel rules, etc., to use?
I can successfully run the demo script COMPAS Recidivism Classifier in the Jupyter notebook on my local machine. But nothing shows up at the end of the notebook. I assume the visualization interface should show up after I run WitWidget(config_builder, height=tool_height_in_px).
And I have installed all the extensions at the beginning of the Jupyter notebook (Python 3):
! jupyter nbextension install --py --symlink --sys-prefix witwidget
! jupyter nbextension enable --py --sys-prefix witwidget
When a TF serving model returns a failure instead of a prediction, pass the failure string to the front-end for display instead of the generic http error we see now.
Similar to issue #37, I would like to use WIT with a TFX pipeline. I am trying this out with the Iris/Native Keras example from TFX (https://github.com/tensorflow/tfx/blob/master/tfx/examples/iris/iris_pipeline_native_keras.py). I have tried both set_custom_predict_fn and set_estimator_and_feature_spec. Both allow me to load WIT, but the Predict button cannot be used. In the set_custom_predict_fn case, WIT gives me the error "AttributeError("'list' object has no attribute 'SerializeToString'",)". In the set_estimator_and_feature_spec case, WIT gives me the error "AttributeError("'str' object has no attribute 'predict'",)".
Here's the code in a Colab: https://colab.research.google.com/drive/1tfUZ4MLT2Ynj8LNeghOUnL7iBstEQTgv
Which is the correct way to use the WitConfigBuilder with a TFX model, and how do I correct the error?
Thanks!
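For reference, a minimal sketch of one way to wire a TFX-trained Keras SavedModel into WIT via set_custom_predict_fn. The model path, FEATURE_NAMES, and the assumption that the function receives tf.train.Example protos are all placeholders rather than the confirmed answer:
import tensorflow as tf
from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

FEATURE_NAMES = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
model = tf.keras.models.load_model('serving_model/iris')  # hypothetical path

def custom_predict(examples):
  # Convert the tf.train.Example protos into the dense float matrix the
  # Keras model expects.
  rows = [[ex.features.feature[name].float_list.value[0]
           for name in FEATURE_NAMES] for ex in examples]
  return model.predict(tf.constant(rows))

# `examples` below is the list of tf.train.Example protos to visualize.
config_builder = (WitConfigBuilder(examples)
                  .set_custom_predict_fn(custom_predict))
WitWidget(config_builder, height=600)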
Hi There,
I am new to using the What-If Tool. I would like to use it to see whether my ML model is fair; I already have a trained XGBoost model (a saved booster object). How can I use this model with the What-If Tool, and is this even possible? I notice in the user-modifiable notebook WIT-from scratch.ipynb that you use classifier = tf.estimator.LinearClassifier.
What if I already have a saved model of the form I mentioned above (XGBoost)? Can I still use the What-If Tool? I am concerned that TensorFlow does not support these booster-object-type models. Any help would be appreciated! Thanks!
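For what it's worth, WIT's set_custom_predict_fn accepts any Python function, so a non-TF model can sit behind it. A minimal sketch with an XGBoost booster, where the model path, the feature handling, and the binary-classification setup are assumptions:
import numpy as np
import xgboost as xgb
from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

booster = xgb.Booster()
booster.load_model('model.bst')  # hypothetical path to the saved booster

def custom_predict(examples):
  # `examples` are assumed to arrive as lists of numeric feature values.
  preds = booster.predict(xgb.DMatrix(np.array(examples)))
  # Binary classification in WIT expects [P(class 0), P(class 1)] per example.
  return np.stack([1 - preds, preds], axis=1)

# `examples` and `feature_names` are placeholders for your data.
config_builder = (WitConfigBuilder(examples, feature_names)
                  .set_custom_predict_fn(custom_predict))
WitWidget(config_builder, height=600)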
Hi James and Team,
config_builder = (WitConfigBuilder(test_examples.tolist(), X_test.columns.tolist() + ["total_time"])
.set_custom_predict_fn(adjust_prediction)
.set_target_feature('total_time')
.set_model_type('regression'))
WitWidget(config_builder, height=800)
Can we return HTML or something else from the above code that we can render on a frontend? I want to use this in one of my applications.
Just like in Plotly, where we can return the HTML or open the plot in a new web page.
For both classification and regression models, we now have a global attributions table in the performance tab, if the model returns attributions along with predictions.
We need to appropriately style and position this table for both model types and for one or two models.
it says party :)
I'm trying to add some functionality of my own to the what-if-tool dashboard. I followed https://github.com/PAIR-code/what-if-tool/blob/master/DEVELOPMENT.md to set up and rebuild the package.
Because I just use a Jupyter notebook to play around with the what-if-tool, I simply followed:
a. rm -rf /tmp/wit-pip (if it already exists)
b. bazel run witwidget/pip_package:build_pip_package
c. Install the package
For use in Jupyter notebooks, install and enable the locally-built pip package per the instructions in the README, but instead use pip install <pathToBuiltPipPackageWhlFile>, then launch the jupyter notebook kernel.
But after I run step b, I don't know where <pathToBuiltPipPackageWhlFile> is. I can't find any files ending with .whl in the folder. Maybe I missed something; I'd appreciate any help.
There is also a WARNING when I run step b. Not sure if it causes the problem:
WARNING: Download from https://mirror.bazel.build/repo1.maven.org/maven2/com/google/javascript/closure-compiler-unshaded/v20190909/closure-compiler-unshaded-v20190909.jar failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
For classification models, we display a PR curve in the performance tab. We should calculate the area under this curve and display it above the curve, in the title for the chart.
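For reference, a minimal sketch of the calculation with scikit-learn (one reasonable way to compute it, not necessarily what the front-end would use; the labels and scores are hypothetical):
from sklearn.metrics import auc, precision_recall_curve

y_true = [0, 1, 1, 0, 1]             # hypothetical labels
y_score = [0.2, 0.8, 0.6, 0.3, 0.9]  # hypothetical model scores

precision, recall, _ = precision_recall_curve(y_true, y_score)
print(f'PR AUC: {auc(recall, precision):.3f}')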
When comparing two models, we could calculate rank correlation (at least for binary classification and regression models). Rank correlation is a number indicating how closely the two models' scores agree in their ordering of the test examples.
Need to think about where this info would go though. Would be valuable to calculate on slices as well, when user is slicing in performance tab.
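For concreteness, a minimal sketch of one such coefficient (Spearman's rho via scipy; the scores are hypothetical and the choice of coefficient is open):
from scipy.stats import spearmanr

# Hypothetical prediction scores from two models over the same examples,
# in the same order.
scores_a = [0.91, 0.40, 0.77, 0.12, 0.55]
scores_b = [0.88, 0.35, 0.80, 0.20, 0.47]

# rho is +1 when the two models rank the examples identically.
rho, _ = spearmanr(scores_a, scores_b)
print(f'Rank correlation: {rho:.3f}')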
I have read the code of the two repos, but I found that they are not exactly the same. What's the difference between them? For example, which one will be updated more frequently, which one will have a newer version, etc.?
I'm representing TF SIG Build, a TensorFlow special interest group dedicated to building and testing TF in open source. Our last meeting surfaced confusion from community members involved in packaging TF for other environments (e.g. Gentoo, Anaconda) about tensorboard-plugin-wit, which I think could be resolved with these two asks:
1. Document the tensorflow -> tensorboard -> tensorboard-plugin-wit dependency, which currently points to an empty PyPI page. Why does tensorboard depend on it? (e.g. "it was once part of core tensorboard but was moved to a plugin")
2. The 1.6.0post* patch releases lack a matching tag in this repo. For packagers, a tag for each release means they can rebuild the package in the necessary configuration for their platform, and it helps verify that the package on PyPI really matches up with the code.
These would help a lot!
hello! When doing development, what's a good workflow? I'll share what I discovered in the hope that it helps other folks new to the repo, or perhaps they can help me understand better ways to approach this. The TL;DR is that compilation_level = "BUNDLE" seems useful for development :)
I started with the web demo for the smiling classifier, and it seems that the way the build works, changes to the filesystem aren't detected; you have to kill the process and then run the full vulcanize process on each change. This takes about a minute on my laptop, which is what prompted me to look into this.
In the Bazel output I see:
$ bazel run wit_dashboard/demo:imagedemoserver
INFO: Analyzed target //wit_dashboard/demo:imagedemoserver (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
INFO: From Vulcanizing /wit-dashboard/image_index.html:
...
The vulcanizing step takes about a full minute to rebuild after changing just the text in a line of logging. To understand what it's doing, I read through the BUILD files and the related .bzl definitions over in TensorBoard. I noticed that there are references to other libraries in https://github.com/PAIR-code/what-if-tool/blob/master/wit_dashboard/demo/wit-image-demo.html#L19, and figured maybe that's why the build is taking so long. Removing those cut the build time down to ~45 seconds.
Removing wit-dashboard cuts the build down to ~15 seconds. Removing everything but the Polymer bits brings the build down to ~2 seconds.
Stepping back, I see that wit-dashboard includes dependencies in the BUILD file from Polymer, Facets, and TensorBoard (as well as WIT components). If I comment out the WIT dependencies from the BUILD file and from the <link /> tags in wit-dashboard.html, this still takes ~40 seconds to build. So it seems like most of the build time, even when just changing text in a console.log statement, comes from either re-compiling dependencies or from the whole-program optimization the Vulcanize task performs (or maybe that the Closure compiler performs on its behalf).
I tried copying the vulcanize.bzl from TensorBoard into the WIT folder so I could look at it and understand what it's doing. In the process, I noticed some params in the BUILD task that ultimately does the vulcanizing:
tensorboard_html_binary(
name = "imagedemoserver",
testonly = True, # Keeps JavaScript somewhat readable
compile = True, # Run Closure Compiler
input_path = "/wit-dashboard/image_index.html",
output_path = "/wit-dashboard/image_demo.html",
deps = [":imagedemo"],
)
Changing compile = False cuts the build to 2 seconds! But it doesn't work, because somewhere in the project there are goog.require-style dependencies.
Changing the compilation_level helps, though! I found these options in the Closure compiler, and luckily the build task in TensorBoard that calls Closure passes them right along. This gets things working again and down to ~20 seconds. The Closure Bazel defs say to use WHITESPACE_ONLY but warn that it disables type checking (https://github.com/bazelbuild/rules_closure/blob/4925e6228e89e3b051a89ff57b8a033fa3fb9544/README.md#arguments-2). This helps (~10 seconds) but breaks the app. The Closure docs don't mention BUNDLE, but you can see it in the source:
public enum CompilationLevel {
/** BUNDLE Simply orders and concatenates files to the output. */
BUNDLE,
Using this takes about half the time to build compared to SIMPLE_OPTIMIZATIONS.
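For reference, my local edit looks roughly like this, assuming tensorboard_html_binary passes compilation_level through to the Closure compiler (a development-only tweak, not a proposed default):
tensorboard_html_binary(
    name = "imagedemoserver",
    testonly = True,
    compile = True,
    compilation_level = "BUNDLE",  # default is ADVANCED; BUNDLE skips optimization
    input_path = "/wit-dashboard/image_index.html",
    output_path = "/wit-dashboard/image_demo.html",
    deps = [":imagedemo"],
)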
In the end, this is the impact on my sweet 2012 MacBook Pro:
# master, after just changing a logging string
$ bazel run wit_dashboard/demo:imagedemoserver
INFO: Elapsed time: 53.611s, Critical Path: 52.66s
# set compilation_level = "BUNDLE" instead of default ("ADVANCED")
$ bazel run wit_dashboard/demo:imagedemoserver
INFO: Elapsed time: 17.940s, Critical Path: 17.45s
So, I'll do this locally now, but would also love to learn if there are better ways to do this :)
Alternately, I also poked around to see if there was a way to update these calls to listen to a command-line arg or env variable passed through bazel run. I skimmed the Bazel docs and issues and saw things like aspects and bazelrc, but nothing seemed fast and direct. I suppose this could be done in TensorBoard in the tensorboard_html_binary task. But I also discovered that there are WORKSPACE and workspace.bzl tasks here, so maybe that could be a place to add a layer of indirection: the project could call into a tensorboard_html_binary_wrapper that reads some env switch, building for production by default, but running the Closure compiler without the slower advanced optimizations if you do bazel run whatev --be-faster. If doing something like that is helpful I can try, but attempting changes to the build setup is always dicey :)
If that's too much of a pain I can just add a note to https://github.com/PAIR-code/what-if-tool/blob/master/DEVELOPMENT.md to help folks discover how to speed up local builds. Thanks!
EDIT: Also noticed that a while back @stephanwlee was thinking about this upstream in tensorflow/tensorboard#1599, and some other open issues reference related things about advanced compilation mode in dev (e.g., tensorflow/tensorboard#2687).
Hi James and Team,
Is it possible to use my own data but still have the same front end?
There is one point that is now gone: 'Group unaware', which the training suggests clicking. What happened? Thanks.
WIT currently reads only the first 50 datapoints to generate candidates for categorical features used in "Partial Dependence Plots". This can be too restrictive. It should read more data to get a more complete list of categories and choose the most frequent ones for the plots.
The following post is not exactly an error with WIT, but I'm having issues with the output from Google explain, which acts as input for the WIT tool. Please help if possible.
I have a 3d input keras model which trains successfully.
Model: "model"
Layer (type) Output Shape Param #
input_1 (InputLayer) [(None, 5, 1815)] 0
bidirectional (Bidirectional (None, 5, 64) 473088
bidirectional_1 (Bidirection (None, 5, 64) 24832
output (TimeDistributed) (None, 5, 25) 1625
Total params: 499,545
Trainable params: 499,545
Non-trainable params: 0
After that, the estimator is defined and the serving function is created as:
keras_estimator = tf.keras.estimator.model_to_estimator(keras_model=model, model_dir='export')
serving_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
{'input_1': model.input}
)
export_path = keras_estimator.export_saved_model(
'gs://' + BUCKET_NAME + '/explanations',
serving_input_receiver_fn=serving_fn
).decode('utf-8')
print(export_path)
The explanation metadata is defined and copied to the required destination as below:
explanation_metadata = {
"inputs": {
"data": {
"input_tensor_name": "input_1:0",
"input_baselines": [np.mean(data_X, axis=0).tolist()],
"encoding": "bag_of_features",
"index_feature_mapping": feature_X.tolist()
}
},
"outputs": {
"duration": {
"output_tensor_name": "output/Reshape_1:0"
}
},
"framework": "tensorflow"
}
with open('explanation_metadata.json', 'w') as output_file:
json.dump(explanation_metadata, output_file)
!gsutil cp explanation_metadata.json $export_path
After that, the model is created and the version is defined as:
!gcloud ai-platform models create $MODEL --enable-logging --regions=us-central1
explain_method = 'integrated-gradients'
!gcloud beta ai-platform versions create $VERSION \
--model $MODEL \
--origin $export_path \
--runtime-version 1.15 \
--framework TENSORFLOW \
--python-version 3.7 \
--machine-type n1-standard-4 \
--explanation-method $explain_method \
--num-integral-steps 25
Everything works fine until this step, but now when I create and send the explain request as:
prediction_json = {'input_1': data_X[:5].tolist()}
with open('diag-data.json', 'w') as outfile:
json.dump(prediction_json, outfile)
!gcloud beta ai-platform explain --model $MODEL --json-instances='diag-data.json'
I get the following error
{
"error": "Explainability failed with exception: <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.INVALID_ARGUMENT\n\tdetails = "transpose expects a vector of size 4. But input(1) is a vector of size 3\n\t [[{{node bidirectional/forward_lstm_1/transpose}}]]"\n\tdebug_error_string = "{"created":"@1586068796.692241013","description":"Error received from peer ipv4:10.7.252.78:8500","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"transpose expects a vector of size 4. But input(1) is a vector of size 3\n\t [[{{node bidirectional/forward_lstm_1/transpose}}]]","grpc_status":3}"\n>"
}
I tried altering the input shape, but nothing worked. Then, to verify the format, I tried the gcloud predict command, which initially did not work but did work after reshaping the input as:
prediction_json = {'input_1': data_X[:5].reshape(-1,1815).tolist()}
with open('diag-data.json', 'w') as outfile:
json.dump(prediction_json, outfile)
!gcloud beta ai-platform predict --model $MODEL --json-instances='diag-data.json'
I'm at a dead end now with !gcloud beta ai-platform explain --model $MODEL --json-instances='diag-data.json' and am looking for much-needed help from the SO community.
Also, for ease of experimenting, the notebook can be accessed from google_explain_notebook.
Thanks for sharing! This is awesome, and super cool to see tools that let people now do explorations like in https://research.google.com/bigpicture/attacking-discrimination-in-ml/ with their own models or plain CSVs :)
In reading the UI, and in talking through this with other people about what's going on in the fairness optimizations, I found myself marking up screenshots to explain what was going on, like this:
I thought it might be a helpful improvement to make these connections more explicit and obvious, rather than having to parse the text definitions and map them to the UI and the data points on the right. The screenshot above isn't a UI proposal, but I could sketch some options if you're interested in brainstorming. It's particularly hard to see what's being compared when you slice by more than one dimension and the confusion matrix isn't visible, so it would be interesting to see if there's a way to make this visible across, say, four slices. If there are other ways to look at this, that'd be awesome to learn about too! There's a lot of information and conceptual density here, so laying it out while staying simple seems like a great design challenge, but also super hard :)
Relatedly, if I'm understanding right, for some of the choices the actual metric being optimized isn't visible anywhere at all (putting aside the cost ratio altogether for now). So for single threshold, as an example, I believe the number being optimized is the overall accuracy, the aggregation of these two numbers weighted by the count of examples:
So in this case I'm trying to see how much the overall accuracy goes down when trying different optimization strategies that all bring it down as they trade off other goals (e.g., equal opportunity). These questions may just come from me exploring the impact of different parameters to build intuition, but the kind of question I'm trying to ask is "how much worse is the metric that equal opportunity optimizes for overall, when I choose demographic parity?" and "how much worse is the metric for equal opportunity for each slice when I choose demographic parity?" Not sure if I'm making any sense, but essentially I'm trying to compare how one optimization choice impacts the other optimization choices' metrics.
Thanks!
Hi,
we really like using the What-If Tool. In the last few days we noticed that the split of the data between the datapoint editor and the performance & fairness tabs isn't performed in the same way. As an example, we binned the data of the UCI census income dataset by age into 10 bins. The number of data points in each bin can differ between the datapoint editor and the performance & fairness tabs (see figure).
For us, it would be extremely helpful if the data in, e.g., the first bin of the datapoint editor were exactly the same as in the first bin of the performance & fairness tab.
Best,
Timo
In the notebook https://colab.research.google.com/github/pair-code/what-if-tool/blob/master/WIT_COMPAS.ipynb#scrollTo=VZ-rK11X5arK
The 8th cell mentions:
"But, the FP rate is MUCH higher for African Americans and the FN rate is MUCH lower for caucasians"
Even though the data shows :
AA FN: 15.4%
Caucasian FN: 25.2%
AA FP: 19.5%
Caucasian FP: 9.7%
Is this a typo or are we missing something here?
Hey there,
I've used the WIT in the past and am now coming back to it for a new project. I'm trying to use the Jupyter integrated widget with inference done through a tensorflow/serving docker container (2.1.0), with the model exported via tf.model.save.
I'm getting a fairly unhelpful error whenever I try to run inference (when it starts up and when I click that Infer button):
TypeError('None has type NoneType, but expected one of: bytes, unicode',)
The configuration I've set up is as follows:
wit_config = (
witwidget.WitConfigBuilder(pred_df_ex)
.set_inference_address('<host_redacted>:8500')
.set_model_name('fts_test')
.set_uses_predict_api(True)
.set_predict_output_tensor('outputs')
.set_model_type('classification')
#.set_predict_input_tensor('inputs')
.set_target_feature('label')
.set_label_vocab(['No','Yes'])
)
I'm able to rule out the possibility that it's not getting a response from the server: if I mess with the configuration to make it intentionally broken (e.g. if I uncomment that input tensor line), I get an error that could only have come from the server.
<_Rendezvous of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "input size does not match signature: 1!=11 len({inputs}) != len({<feature_names_redacted>}). Sent extra: {inputs}. Missing but required: {<feature_names_redacted>}."
debug_error_string = "{"created":"@1585001946.586897426","description":"Error received from peer ipv4:<ipaddr_redacted>:8500","file":"src/core/lib/surface/call.cc","file_line":1052,"grpc_message":"input size does not match signature: 1!=11 len({inputs}) != len({<feature_names_redacted>}). Sent extra: {inputs}. Missing but required: {<feature_names_redacted>}.","grpc_status":3}"
>
I've verified that I can talk to the host no problem from this machine as well.
Honestly, I'm not sure whether or not that's an accurate assessment, however -- what I'd need is the full stack trace of that error. Any help would be appreciated.
I started from the demo income classification, using the linear regressor. I am using my own dataset to predict insurance payments with about 12 features. My dataset has over 30,000 points; the What-If Tool disconnects when I set the number of data points to 20,000. WIT works up to about 15,000 num_datapoints.
This is not a timeout problem; the code runs for under 5 minutes. Is there a limit on the number of data points that WitConfigBuilder can handle in Colab?
A response would be much appreciated.
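A possible stopgap while this is investigated, assuming the data is in a pandas DataFrame (the path and the 15,000 cutoff are placeholders based on the numbers above): down-sample before building the config.
import pandas as pd
from witwidget.notebook.visualization import WitConfigBuilder

df = pd.read_csv('insurance.csv')  # hypothetical path to the dataset

# Down-sample to a size the Colab widget has handled reliably so far.
sample = df.sample(n=15000, random_state=0)
config_builder = WitConfigBuilder(sample.values.tolist(),
                                  sample.columns.tolist())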
This is an awesome demo! I spent some time exploring the performance & fairness tabs, and then digging into facets dive, and individual examples. It's really interesting, thanks for putting together such an accessible demo 👍
In the process I found a bunch of data points that were labeled differently than I would have expected. I figured these were labeling errors in the CelebA set, but it came up often enough that it led me to investigate further and try to see how many data points appeared to be mislabeled, when compared to my own personal oracle-like labeling truth :)
To debug, I downloaded the CelebA dataset and then started looking at individual data points, assuming the Datapoint ID in WIT would correspond to the image ID and filename in CelebA. This doesn't seem to be the case, though, and so I can't figure out how to verify further.
You can pick any number as an example (Datapoint ID 1 is what I started with when trying to compare to CelebA). But for one full example: in Facets within WIT, I noticed a datapoint labeled "Sideburns" in a way I didn't expect when looking at the image myself, so I clicked in to see the image and the datapoint ID:
But then, checking the CelebA set, this is what I see for image 000038.jpg:
The data in CelebA for 38 in the list_attr_celeba.csv file is also different from the data in WIT for Datapoint ID 38:
Since there are only 250 examples in the WIT tool, I'm wondering if the full dataset is being sampled for the demo, with the Datapoint ID values mapped to 0-249 and the reference back to the original dataset lost? That's just a guess, though. I see there's a bunch of data in https://github.com/PAIR-code/what-if-tool/tree/master/data/images but I'm not sure how to debug further.
Thanks for sharing this work!
Good morning, thank you so much for this amazing piece of work. When I saw the presentation, I immediately thought of our leaders and how they might benefit from the understanding that this process brings. Due to the nature of the data that we are working with here at the Ministry of Education, we would never be able to use this tool effectively because of the breach in confidentiality. Is there a way to use this tool on a local machine and have it display the results just as it would have done in the cloud?
Thank you again for your time and support.
Best Regards,
Andrei.
We are not able to convert our custom dataset into the correct format that the What-If Tool requires.
The sample data that we are trying to convert is below.
"ID","age","workclass","fnlwgt","education","education-num","marital-status","occupation","relationship","race","sex","capital-gain","capital-loss","hours-per-week","native-country","result"
19122,42," Federal-gov",178470," HS-grad",9," Divorced"," Adm-clerical"," Not-in-family"," White"," Female",0,0,40," United-States"," <=50K"
20798,49," Federal-gov",115784," Some-college",10," Married-civ-spouse"," Craft-repair"," Husband"," White"," Male",0,0,40," United-States"," <=50K"
32472,34," Private",30673," Masters",14," Married-civ-spouse"," Prof-specialty"," Husband"," White"," Male",0,0,55," United-States"," <=50K"
21476,29," Private",157612," Bachelors",13," Never-married"," Prof-specialty"," Not-in-family"," White"," Female",14344,0,40," United-States"," >50K"
24836,30," Private",175931," HS-grad",9," Married-civ-spouse"," Craft-repair"," Husband"," White"," Male",0,0,40," United-States"," <=50K"
5285,31," Self-emp-inc",236415," Some-college",10," Married-civ-spouse"," Adm-clerical"," Wife"," White"," Female",0,0,20," United-States"," >50K"
This is the JSON format that I currently have:
[{ "ID": 19122, "age": 42, "capital-gain": 0, "capital-loss": 0, "education": " HS-grad", "education-num": 9, "fnlwgt": 178470, "hours-per-week": 40, "marital-status": " Divorced", "native-country": " United-States", "occupation": " Adm-clerical", "race": " White", "relationship": " Not-in-family", "result": " <=50K", "sex": " Female", "workclass": " Federal-gov" }]
What do we need to do to generate the JSON in the format required by the What-If Tool?
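WIT consumes tf.train.Example protos rather than plain JSON, so one approach is to convert each CSV row directly. A minimal sketch, assuming the sample above lives at a hypothetical path data.csv:
import numpy as np
import pandas as pd
import tensorflow as tf

df = pd.read_csv('data.csv')  # hypothetical path to the sample CSV above

def row_to_example(row):
  ex = tf.train.Example()
  for name, value in row.items():
    if np.issubdtype(type(value), np.integer):
      ex.features.feature[name].int64_list.value.append(int(value))
    elif np.issubdtype(type(value), np.floating):
      ex.features.feature[name].float_list.value.append(float(value))
    else:
      # Strings (note the leading spaces in the CSV) become bytes features.
      ex.features.feature[name].bytes_list.value.append(
          str(value).strip().encode('utf-8'))
  return ex

examples = [row_to_example(row) for _, row in df.iterrows()]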
Hi,
I am new to using TensorFlow and WIT, and I do not even know if I should be posting this here, but I am trying to replicate the COMPAS demo using TensorFlow Serving on Docker and I get the following error:
Request for model inference failed: RequestNetworkError: RequestNetworkError: 500 at /data/plugin/whatif/infer?inference_address.
I am using the following docker command:
docker run -p 8500:8500 --mount type=bind,source="C:\Users\arancha.abad\Importar_modelos\versiones",target=/models/saved_model -e MODEL_NAME=saved_model -t tensorflow/serving
It seems to work properly, but when I open WIT in TensorBoard, the only thing I can see is everything related to the .tfrecord file. I can see the datapoints and edit them, and I can also go to Features and see every histogram, but I can't run inference, and when WIT is opened, the error described above is displayed.
I am using tensorflow 2.2 (rc) and tensorboard 2.1.1, and this is the way I export the COMPAS model:
serving_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
export_path = classifier.export_saved_model(export_path, serving_input_fn)
I get the saved_model.pb and the variables folder. If I use saved_model_cli to show the model, I get the following:
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['classification']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: input_example_tensor:0
The given SavedModel SignatureDef contains the following output(s):
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 2)
name: head/Tile:0
outputs['scores'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 2)
name: head/predictions/probabilities:0
Method name is: tensorflow/serving/classify
signature_def['predict']:
The given SavedModel SignatureDef contains the following input(s):
inputs['examples'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: input_example_tensor:0
The given SavedModel SignatureDef contains the following output(s):
outputs['all_class_ids'] tensor_info:
dtype: DT_INT32
shape: (-1, 2)
name: head/predictions/Tile:0
outputs['all_classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 2)
name: head/predictions/Tile_1:0
outputs['class_ids'] tensor_info:
dtype: DT_INT64
shape: (-1, 1)
name: head/predictions/ExpandDims:0
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 1)
name: head/predictions/str_classes:0
outputs['logistic'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: head/predictions/logistic:0
outputs['logits'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: linear/linear_model/linear/linear_model/linear/linear_model/weighted_sum:0
outputs['probabilities'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 2)
name: head/predictions/probabilities:0
Method name is: tensorflow/serving/predict
signature_def['regression']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: input_example_tensor:0
The given SavedModel SignatureDef contains the following output(s):
outputs['outputs'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: head/predictions/logistic:0
Method name is: tensorflow/serving/regress
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: input_example_tensor:0
The given SavedModel SignatureDef contains the following output(s):
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 2)
name: head/Tile:0
outputs['scores'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 2)
name: head/predictions/probabilities:0
Method name is: tensorflow/serving/classify
The model has the signatures predict, classification, regression, and serving_default, so everything seems to be fine. Right now I don't know what else I should do to make it work; maybe my mistake is in the way I create the serving_input_fn or something else, so any help would be appreciated.
Thank you for your help!
I tried to use dill to save the resulting witwidget template, but it doesn't work. Is there any way to save the result and load it back later for further analysis? Thanks
How can I create a web page, similar to the UCI Census demo, with custom data and predictions (without specifying a TF model)?
I've looked into the wit_dashboard, but it is not clear to me what I should modify and how I should specify my data.
https://colab.research.google.com/github/pair-code/what-if-tool/blob/master/WIT_COMPAS.ipynb
In this notebook, running the "Invoke What-If Tool for test data and the trained models" cell gives the following.
MessageErrorTraceback (most recent call last)
<ipython-input-8-18dbcd24366f> in <module>()
10 config_builder = WitConfigBuilder(examples[0:num_datapoints]).set_estimator_and_feature_spec(
11 classifier, feature_spec)
---> 12 WitWidget(config_builder, height=tool_height_in_px)
3 frames
/usr/local/lib/python2.7/dist-packages/witwidget/notebook/colab/wit.pyc in __init__(self, config_builder, height, delay_rendering)
238
239 if not delay_rendering:
--> 240 self.render()
241
242 # Increment the static instance WitWidget index counter
/usr/local/lib/python2.7/dist-packages/witwidget/notebook/colab/wit.pyc in render(self)
252 # Send the provided config and examples to JS
253 output.eval_js("""configCallback({config})""".format(
--> 254 config=json.dumps(self.config)))
255 output.eval_js("""updateExamplesCallback({examples})""".format(
256 examples=json.dumps(self.examples)))
/usr/local/lib/python2.7/dist-packages/google/colab/output/_js.pyc in eval_js(script, ignore_result)
37 if ignore_result:
38 return
---> 39 return _message.read_reply_from_input(request_id)
40
41
/usr/local/lib/python2.7/dist-packages/google/colab/_message.pyc in read_reply_from_input(message_id, timeout_sec)
104 reply.get('colab_msg_id') == message_id):
105 if 'error' in reply:
--> 106 raise MessageError(reply['error'])
107 return reply.get('data', None)
108
MessageError: ReferenceError: configCallback is not defined
Currently WIT sends all features to the front-end for all examples. If the examples contain image features, this means we can't load a ton of examples for that model.
Instead, for large features like images, don't send them to the front-end immediately; only send the image feature when an example is clicked for viewing in the datapoint editor.
If I can, are there any instructions? I didn't find anything about this in the repo.
Thanks for sharing this awesome work! 👍
This is what the console outputs:
So I'm guessing it's something about the Bazel config, and that it needs to include a polyfill or some other way to load Polymer code. From poking around a bit, I think it's in the config of this build command (to use the smiling dataset as an example): https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/interactive_inference/tf_interactive_inference_dashboard/demo/BUILD#L69
From searching around, I didn't find much info on building Polymer code with Bazel outside the googleplex. Reading that build file, it looks like it's pulling in what I'd expect in https://github.com/tensorflow/tensorboard/blob/master/tensorboard/components/tf_imports/BUILD#L16, and it looks like it pulls in external Polymer artifacts in https://github.com/tensorflow/tensorboard/blob/d3a6cfd6eb5c0fff4a405b23c5361875adf908f0/third_party/polymer.bzl#L1379. Those must be working right, since the Facets demos work fine in other browsers, so I'm guessing it's something about the specific BUILD task within tf_interactive_inference_dashboard/demo/, but I'm not sure what.
Thanks! :)
I am new to TensorFlow, and I'm quite confused about the format WIT expects from the TFRecords to feed the model.
Basically, I have a multi-input Keras model with the following signature:
The given SavedModel SignatureDef contains the following input(s):
inputs['byte_entropy'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 256)
name: serving_default_byte_entropy:0
inputs['data_directories'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 30)
name: serving_default_data_directories:0
inputs['exports'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 128)
name: serving_default_exports:0
inputs['general'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 10)
name: serving_default_general:0
inputs['header'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 62)
name: serving_default_header:0
inputs['histogram'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 256)
name: serving_default_histogram:0
inputs['imports'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1280)
name: serving_default_imports:0
inputs['section'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 255)
name: serving_default_section:0
inputs['strings'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 104)
name: serving_default_strings:0
The given SavedModel SignatureDef contains the following output(s):
outputs['final_output'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict
And I have some TFRecords (records of Examples) with the same features:
feature={'histogram': FixedLenFeature(shape=[256], dtype=tf.float32, default_value=None),
'byte_entropy': FixedLenFeature(shape=[256], dtype=tf.float32, default_value=None),
'strings': FixedLenFeature(shape=[104], dtype=tf.float32, default_value=None),
'general': FixedLenFeature(shape=[10], dtype=tf.float32, default_value=None),
'header': FixedLenFeature(shape=[62], dtype=tf.float32, default_value=None),
'section': FixedLenFeature(shape=[255], dtype=tf.float32, default_value=None),
'imports': FixedLenFeature(shape=[1280], dtype=tf.float32, default_value=None),
'exports': FixedLenFeature(shape=[128], dtype=tf.float32, default_value=None),
'data_directories': FixedLenFeature(shape=[30], dtype=tf.float32, default_value=None),
'final_output': FixedLenFeature(shape=(), dtype=tf.float32, default_value=None)}
When trying to show it as a regression, I get an invalid argument error:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "input size does not match signature: 1!=9 len({byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}) != len({byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}). Sent extra: {byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}. Missing but required: {byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}."
debug_error_string = "{"created":"@1588279350.370000000","description":"Error received from peer ipv6:[::1]:8500","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"input size does not match signature: 1!=9 len({byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}) != len({byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}). Sent extra: {byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}. Missing but required: {byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}.","grpc_status":3}"
Where or how should I specify which TFRecord feature goes to which input in the model?
And if you would be so kind, do you know how I might alter the model prediction to make it suitable for classification? At the moment the prediction is a number from zero to one, and to suit WIT it should be an array with two probabilities. I know this is not a question specifically about WIT.
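On the last point, a minimal sketch of wrapping a single sigmoid output into the two-class form WIT expects for classification (e.g. inside a custom predict function; the helper name is hypothetical):
import numpy as np

def to_two_class(sigmoid_scores):
  # Turn P(class 1) into [P(class 0), P(class 1)] per example.
  p1 = np.asarray(sigmoid_scores, dtype=float).reshape(-1)
  return np.stack([1.0 - p1, p1], axis=1)

print(to_two_class([0.2, 0.9]))  # [[0.8 0.2], [0.1 0.9]]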
Hi
I trained a model with tfx and it was exported as saved_model.pb.
Now, I want to reload it and visualize it using WIT.
How can I do this?
I couldn't find a way to do it, since when reloading the model:
imported = tf.saved_model.load(export_dir=trained_model_path)
I get an object of type
<tensorflow.python.training.tracking.tracking.AutoTrackable at 0x7f3d71e456a0>
instead of an estimator.
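One workaround sketch, assuming the SavedModel kept a serving_default signature that takes serialized tf.Examples (the signature's input argument name and output key vary by export, so both are assumptions here):
import tensorflow as tf
from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

imported = tf.saved_model.load(export_dir=trained_model_path)
infer = imported.signatures['serving_default']

def custom_predict(examples):
  # Serialize each tf.train.Example and call the signature directly.
  serialized = tf.constant([ex.SerializeToString() for ex in examples])
  outputs = infer(examples=serialized)  # input arg name depends on the export
  return next(iter(outputs.values())).numpy()

# `examples` is the list of tf.train.Example protos to visualize.
config_builder = (WitConfigBuilder(examples)
                  .set_custom_predict_fn(custom_predict))
WitWidget(config_builder, height=600)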
Thanks
In the performance tab, it would be nice to have a button to calculate (in the background) the slices with the largest performance disparities and surface those for the user to explore.
Currently users have to check slices one by one to look at their performance disparities.
This needs to be done efficiently, as with intersectional slices it becomes quadratic in scale.
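To make the scale concrete, a rough sketch of the brute-force version (a pandas groupby over every feature pair; the data here is hypothetical):
import itertools
import pandas as pd

# Hypothetical per-example evaluation results.
df = pd.DataFrame({'sex': ['F', 'M', 'F', 'M'],
                   'country': ['US', 'US', 'CA', 'CA'],
                   'correct': [1, 0, 1, 1]})

# Accuracy per intersectional slice for every feature pair: the number of
# pairs grows quadratically with the number of features.
for a, b in itertools.combinations(['sex', 'country'], 2):
    acc = df.groupby([a, b])['correct'].mean()
    print(acc.sort_values().head())  # worst slices first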