
trusted-ai / aix360

1.6K stars, 57 watchers, 305 forks, 365.35 MB

Interpretability and explainability of data and machine learning models

Home Page: https://aix360.res.ibm.com/

License: Apache License 2.0

Python 99.74% R 0.22% Dockerfile 0.03% Shell 0.01%
explainable-ai explainable-ml trusted-ai trusted-ml machine-learning deep-learning codait artificial-intelligence explainabil xai

aix360's People

Contributors

animeshsingh, asm582, cclauss, dennislwei, fabianlim, floidgilbert, gaborpelesz, gdequeiroz, gganapavarapu, imgbotapp, ishapuri, jamescodella, kant, karthikeyansh, kmyusk, marleen1, michaelhind, monindersingh, pronics2004, rahulnair23ibm, rluss, sadhamanus, swag2198, tomcli, vijay-arya


aix360's Issues

Metrics do not work with Keras models

Certain models, such as Keras sequential models, have predict methods that behave like predict_proba, i.e., they return predicted class probabilities rather than predicted classes as the predict method of scikit-learn classifiers does. The current metrics implementation uses the predict method to get the predicted class, which raises an error for Keras models.
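A minimal wrapper sketch (an illustration only, not part of aix360; the class name is hypothetical) that adapts a Keras model so its predict returns class labels, which is what the metrics expect:

import numpy as np

class KerasLabelWrapper:
    """Make a Keras model's predict() return class labels, not probabilities."""
    def __init__(self, model):
        self.model = model

    def predict(self, X):
        proba = self.model.predict(X)    # shape (n_samples, n_classes)
        return np.argmax(proba, axis=1)  # class index per sample

    def predict_proba(self, X):
        return self.model.predict(X)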

Protodash Timeseries

In the HELOC example you mentioned that ProtoDash can be applied to time series. How would one go about doing that? Do you perhaps have an example, or can you offer some advice?

conda-forge release?

Thanks for this. I'm looking forward to checking it out.

Is there a conda-forge release? If not, is there any plan for one?

BRCG train fails in copied "Credit Approval Tutorial" code

Hi there,

I copied the code verbatim (no modifications at all) from the BRCG part of the "Credit Approval Tutorial" and ran into errors. I'm quite sure the dataset was loaded correctly, as I also trained a scikit-learn decision tree classifier on it, with no problems, in the same notebook.

Can someone help me with this issue? Am I missing something or is it an internal problem?

Thanks in advance!

Here is the code and the output.
It was run on Google Colab, with pandas 1.1.2 and the latest aix360 release, which is 0.2.0.

Copied code

import warnings
warnings.filterwarnings('ignore')

# Load FICO HELOC data with special values converted to np.nan
from aix360.datasets.heloc_dataset import HELOCDataset, nan_preprocessing
data = HELOCDataset(custom_preprocessing=nan_preprocessing).data()
# Separate target variable
y = data.pop('RiskPerformance')

# Split data into training and test sets using fixed random seed
from sklearn.model_selection import train_test_split
dfTrain, dfTest, yTrain, yTest = train_test_split(data, y, random_state=0, stratify=y)
dfTrain.head().transpose()

# Binarize data and also return standardized ordinal features
from aix360.algorithms.rbm import FeatureBinarizer
fb = FeatureBinarizer(negations=True, returnOrd=True)
dfTrain, dfTrainStd = fb.fit_transform(dfTrain)
dfTest, dfTestStd = fb.transform(dfTest)
dfTrain['ExternalRiskEstimate'].head()

# Instantiate BRCG with small complexity penalty and large beam search width
from aix360.algorithms.rbm import BooleanRuleCG
br = BooleanRuleCG(lambda0=1e-3, lambda1=1e-3, CNF=True)

# Train, print, and evaluate model
br.fit(dfTrain, yTrain)
from sklearn.metrics import accuracy_score
print('Training accuracy:', accuracy_score(yTrain, br.predict(dfTrain)))
print('Test accuracy:', accuracy_score(yTest, br.predict(dfTest)))
print('Predict Y=0 if ANY of the following rules are satisfied, otherwise Y=1:')
print(br.explain()['rules'])

Output

Learning CNF rule with complexity parameters lambda0=0.001, lambda1=0.001
Initial LP solved
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in __setitem__(self, key, value)
   1001         try:
-> 1002             self._set_with_engine(key, value)
   1003         except (KeyError, ValueError):

/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in _set_with_engine(self, key, value)
   1032         # fails with AttributeError for IntervalIndex
-> 1033         loc = self.index._engine.get_loc(key)
   1034         validate_numeric_casting(self.dtype, value)

pandas/_libs/index.pyx in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc()

KeyError: 'ExternalRiskEstimate'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-98-8d81fbd6c0e1> in <module>()
     26 
     27 # Train, print, and evaluate model
---> 28 br.fit(dfTrain, yTrain)
     29 from sklearn.metrics import accuracy_score
     30 print('Training accuracy:', accuracy_score(yTrain, br.predict(dfTrain)))

/usr/local/lib/python3.6/dist-packages/aix360/algorithms/rbm/boolean_rule_cg.py in fit(self, X, y)
    118         UB = min(UB.min(), 0)
    119         v, zNew, Anew = beam_search(r, X, self.lambda0, self.lambda1,
--> 120                                     K=self.K, UB=UB, D=self.D, B=self.B, eps=self.eps)
    121 
    122         while (v < -self.eps).any() and (self.it < self.iterMax):

/usr/local/lib/python3.6/dist-packages/aix360/algorithms/rbm/beam_search.py in beam_search(r, X, lambda0, lambda1, K, UB, D, B, wLB, eps, stopEarly)
    285             if i[1] == '<=':
    286                 thresh = Xp[i[0]].columns.get_level_values(1).to_series().replace('NaN', np.nan)
--> 287                 colKeep[i[0]] = (Xp[i[0]].columns.get_level_values(0) == '>') & (thresh < i[2])
    288             elif i[1] == '>':
    289                 thresh = Xp[i[0]].columns.get_level_values(1).to_series().replace('NaN', np.nan)

/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in __setitem__(self, key, value)
   1008             else:
   1009                 # GH#12862 adding an new key to the Series
-> 1010                 self.loc[key] = value
   1011 
   1012         except TypeError as e:

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    668 
    669         iloc = self if self.name == "iloc" else self.obj.iloc
--> 670         iloc._setitem_with_indexer(indexer, value)
    671 
    672     def _validate_key(self, key, axis: int):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
   1790                 # setting for extensionarrays that store dicts. Need to decide
   1791                 # if it's worth supporting that.
-> 1792                 value = self._align_series(indexer, Series(value))
   1793 
   1794             elif isinstance(value, ABCDataFrame):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _align_series(self, indexer, ser, multiindex_indexer)
   1909             # series, so need to broadcast (see GH5206)
   1910             if sum_aligners == self.ndim and all(is_sequence(_) for _ in indexer):
-> 1911                 ser = ser.reindex(obj.axes[0][indexer[0]], copy=True)._values
   1912 
   1913                 # single indexer

/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in reindex(self, index, **kwargs)
   4397     )
   4398     def reindex(self, index=None, **kwargs):
-> 4399         return super().reindex(index=index, **kwargs)
   4400 
   4401     def drop(

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in reindex(self, *args, **kwargs)
   4457         # perform the reindex on the axes
   4458         return self._reindex_axes(
-> 4459             axes, level, limit, tolerance, method, fill_value, copy
   4460         ).__finalize__(self, method="reindex")
   4461 

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
   4480                 fill_value=fill_value,
   4481                 copy=copy,
-> 4482                 allow_dups=False,
   4483             )
   4484 

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
   4525                 fill_value=fill_value,
   4526                 allow_dups=allow_dups,
-> 4527                 copy=copy,
   4528             )
   4529             # If we've made a copy once, no need to make another one

/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy, consolidate)
   1274         # some axes don't allow reindexing with dups
   1275         if not allow_dups:
-> 1276             self.axes[axis]._can_reindex(indexer)
   1277 
   1278         if axis >= self.ndim:

/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in _can_reindex(self, indexer)
   3283         # trying to reindex on an axis with duplicates
   3284         if not self.is_unique and len(indexer):
-> 3285             raise ValueError("cannot reindex from a duplicate axis")
   3286 
   3287     def reindex(self, target, method=None, level=None, limit=None, tolerance=None):

ValueError: cannot reindex from a duplicate axis

CDC Tutorial - NotImplementedError: Can't copy SAS variable metadata to dataframe

When executing the second cell in the Health and Lifestyle Survey Questions Tutorial, on this line:

nhanes = CDCDataset()

I get this error:

Downloading file ACQ_H.XPT
Downloading file ALQ_H.XPT
Downloading file BPQ_H.XPT
Downloading file CDQ_H.XPT
Downloading file CFQ_H.XPT
Downloading file CBQ_H.XPT
Downloading file CKQ_H.XPT
Downloading file HSQ_H.XPT
Downloading file DEQ_H.XPT
Downloading file DIQ_H.XPT
Downloading file DBQ_H.XPT
Downloading file DLQ_H.XPT
Downloading file DUQ_H.XPT
Downloading file ECQ_H.XPT
Downloading file FSQ_H.XPT
Downloading file HIQ_H.XPT
Downloading file HEQ_H.XPT
Downloading file HUQ_H.XPT
Downloading file HOQ_H.XPT
Downloading file IMQ_H.XPT
Downloading file INQ_H.XPT
Downloading file KIQ_U_H.XPT
Downloading file MCQ_H.XPT
Downloading file DPQ_H.XPT
Downloading file OCQ_H.XPT
Downloading file OHQ_H.XPT
Downloading file OSQ_H.XPT
Downloading file PAQ_H.XPT
Downloading file PFQ_H.XPT
Downloading file RXQASA_H.XPT
Downloading file RHQ_H.XPT
Downloading file SXQ_H.XPT
Downloading file SLQ_H.XPT
Downloading file SMQFAM_H.XPT
Downloading file SMQRTU_H.XPT
Downloading file SMQSHS_H.XPT
Downloading file CSQ_H.XPT
Downloading file VTQ_H.XPT
Downloading file WHQ_H.XPT
Downloading file WHQMEC_H.XPT
converting  Acculturation :  /opt/conda/lib/python3.7/site-packages/aix360/datasets/../data/cdc_data/ACQ_H.XPT  to  /opt/conda/lib/python3.7/site-packages/aix360/datasets/../data/cdc_data/csv/ACQ_H.csv

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-2-c372c6c55e63> in <module>
----> 1 nhanes = CDCDataset()
      2 nhanes_files = nhanes.get_csv_file_names()
      3 (nhanesinfo, _, _) = nhanes._cdc_files_info()

/opt/conda/lib/python3.7/site-packages/aix360/datasets/cdc_dataset.py in __init__(self, custom_preprocessing, dirpath)
     49                 sys.exit(1)
     50 
---> 51         self._convert_xpt_to_csv()
     52         #if custom_preprocessing:
     53         #    self._data = custom_preprocessing(df)

/opt/conda/lib/python3.7/site-packages/aix360/datasets/cdc_dataset.py in _convert_xpt_to_csv(self)
    133                 with open(xptfile, 'rb') as in_xpt:
    134                     with open(csvfile, 'w',newline='') as out_csv:
--> 135                         reader = xport.Reader(in_xpt)
    136                         writer = csv.writer(out_csv, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    137                         writer.writerow(reader.fields)

/opt/conda/lib/python3.7/site-packages/xport/__init__.py in __init__(self, fp)
    768 
    769     def __init__(self, fp):
--> 770         self.dataset = to_dataframe(fp)
    771 
    772     def __iter__(self):

/opt/conda/lib/python3.7/site-packages/xport/__init__.py in to_dataframe(fp)
    749     from xport.v56 import load
    750     warnings.warn('Please use ``xport.v56.load`` in the future', DeprecationWarning)
--> 751     library = load(fp)
    752     dataset = next(iter(library.values()))
    753     return dataset

/opt/conda/lib/python3.7/site-packages/xport/v56.py in load(fp)
    898     except UnicodeDecodeError:
    899         raise TypeError(f'Expected a BufferedReader in bytes-mode, got {type(fp).__name__}')
--> 900     return loads(bytestring)
    901 
    902 

/opt/conda/lib/python3.7/site-packages/xport/v56.py in loads(bytestring)
    909         >>> library = loads(bytestring)
    910     """
--> 911     return Library.from_bytes(bytestring)
    912 
    913 

/opt/conda/lib/python3.7/site-packages/xport/v56.py in from_bytes(cls, bytestring, member_header_re)
    703             modified=strptime(mo['modified']),
    704             sas_os=mo['os'].strip(b'\x00').decode('ISO-8859-1').strip(),
--> 705             sas_version=mo['version'].strip(b'\x00').decode('ISO-8859-1').strip(),
    706         )
    707         LOG.info(f'Decoded {self}')

/opt/conda/lib/python3.7/site-packages/xport/__init__.py in __init__(self, members, created, modified, sas_os, sas_version)
    587                 self[name] = dataset  # Use __setitem__ to validate metadata.
    588         else:
--> 589             for dataset in members:
    590                 if dataset.name in self:
    591                     warnings.warn(f'More than one dataset named {dataset.name!r}')

/opt/conda/lib/python3.7/site-packages/xport/v56.py in from_bytes(cls, bytestring, pattern)
    605         head = cls.from_header(header)
    606         data = Member(pd.DataFrame.from_records(observations, columns=list(header)))
--> 607         data.copy_metadata(head)
    608         LOG.info(f'Decoded XPORT dataset {data.name!r}')
    609         LOG.debug('%s', data)

/opt/conda/lib/python3.7/site-packages/xport/__init__.py in copy_metadata(self, other)
    410                 object.__setattr__(self, name, getattr(other, name, None))
    411         if isinstance(other, (Dataset, Mapping)):
--> 412             for k, v in self.items():
    413                 try:
    414                     v.copy_metadata(other[k])

~/.local/lib/python3.7/site-packages/pandas/core/frame.py in items(self)
   1015         if self.columns.is_unique and hasattr(self, "_item_cache"):
   1016             for k in self.columns:
-> 1017                 yield k, self._get_item_cache(k)
   1018         else:
   1019             for i, k in enumerate(self.columns):

~/.local/lib/python3.7/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
   3792             loc = self.columns.get_loc(item)
   3793             values = self._mgr.iget(loc)
-> 3794             res = self._box_col_values(values, loc).__finalize__(self)
   3795 
   3796             cache[item] = res

~/.local/lib/python3.7/site-packages/pandas/core/frame.py in _box_col_values(self, values, loc)
   3312         name = self.columns[loc]
   3313         klass = self._constructor_sliced
-> 3314         return klass(values, index=self.index, name=name, fastpath=True)
   3315 
   3316     # ----------------------------------------------------------------------

/opt/conda/lib/python3.7/site-packages/xport/__init__.py in __init__(self, data, index, dtype, name, copy, fastpath, label, vtype, width, format, informat, **kwds)
    308         for name, value in metadata.items():
    309             setattr(self, name, getattr(self, name, value))
--> 310         LOG.debug(f'Initialized {self}')
    311 
    312     def __finalize__(self, other, method=None, **kwds):

/opt/conda/lib/python3.7/site-packages/xport/__init__.py in __repr__(self)
    274         metadata = {name: getattr(self, name) for name in metadata}
    275         metadata = (f'{name}: {value}' for name, value in metadata.items() if value is not None)
--> 276         return f'{type(self).__name__}\n{super().__repr__()}\n{", ".join(metadata)}'
    277 
    278     def __init__(

~/.local/lib/python3.7/site-packages/pandas/core/series.py in __repr__(self)
   1305             min_rows=min_rows,
   1306             max_rows=max_rows,
-> 1307             length=show_dimensions,
   1308         )
   1309         result = buf.getvalue()

~/.local/lib/python3.7/site-packages/pandas/core/series.py in to_string(self, buf, na_rep, float_format, header, index, length, dtype, name, max_rows, min_rows)
   1368             float_format=float_format,
   1369             min_rows=min_rows,
-> 1370             max_rows=max_rows,
   1371         )
   1372         result = formatter.to_string()

~/.local/lib/python3.7/site-packages/pandas/io/formats/format.py in __init__(self, series, buf, length, header, index, na_rep, name, float_format, dtype, max_rows, min_rows)
    270         self.adj = get_adjustment()
    271 
--> 272         self._chk_truncate()
    273 
    274     def _chk_truncate(self) -> None:

~/.local/lib/python3.7/site-packages/pandas/io/formats/format.py in _chk_truncate(self)
    292             else:
    293                 row_num = max_rows // 2
--> 294                 series = concat((series.iloc[:row_num], series.iloc[-row_num:]))
    295             self.tr_row_num = row_num
    296         else:

~/.local/lib/python3.7/site-packages/pandas/core/reshape/concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    293         verify_integrity=verify_integrity,
    294         copy=copy,
--> 295         sort=sort,
    296     )
    297 

~/.local/lib/python3.7/site-packages/pandas/core/reshape/concat.py in __init__(self, objs, axis, join, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    404         # Standardize axis parameter to int
    405         if isinstance(sample, ABCSeries):
--> 406             axis = sample._constructor_expanddim._get_axis_number(axis)
    407         else:
    408             axis = sample._get_axis_number(axis)

/opt/conda/lib/python3.7/site-packages/xport/__init__.py in _constructor_expanddim(self)
    338         For example, transforming a series into a dataframe.
    339         """
--> 340         raise NotImplementedError("Can't copy SAS variable metadata to dataframe")
    341 
    342     @property

NotImplementedError: Can't copy SAS variable metadata to dataframe

Here are the package versions I have installed:

absl-py==0.11.0
aix360 @ file:///home/jovyan/AIX360-master
alembic==1.4.2
analytics-python==1.2.9
asgiref==3.3.1
astor==0.8.1
async-generator==1.10
attrs==20.3.0
backcall @ file:///home/conda/feedstock_root/build_artifacts/backcall_1592338393461/work
bamboolib==1.22.2
beautifulsoup4 @ file:///home/conda/feedstock_root/build_artifacts/beautifulsoup4_1589761456552/work
bleach @ file:///home/conda/feedstock_root/build_artifacts/bleach_1588608214987/work
blinker==1.4
bokeh @ file:///home/conda/feedstock_root/build_artifacts/bokeh_1592227515025/work
Bottleneck==1.3.2
Brotli==1.0.9
brotlipy==0.7.0
certifi==2020.4.5.2
certipy==0.1.3
cffi==1.14.0
chardet==3.0.4
click==7.1.2
cloudpickle @ file:///home/conda/feedstock_root/build_artifacts/cloudpickle_1588164361239/work
conda==4.8.2
conda-package-handling==1.6.0
cryptography==2.9.2
cvxopt==1.2.6
cvxpy==1.1.11
cycler==0.10.0
Cython @ file:///home/conda/feedstock_root/build_artifacts/cython_1591799499719/work
cytoolz==0.10.1
dash==1.19.0
dash-core-components==1.15.0
dash-cytoscape==0.2.0
dash-html-components==1.1.2
dash-renderer==1.9.0
dash-table==4.11.2
dask==2.15.0
decorator==4.4.2
defusedxml==0.6.0
dill @ file:///home/conda/feedstock_root/build_artifacts/dill_1592315758554/work
distributed @ file:///home/conda/feedstock_root/build_artifacts/distributed_1591409248443/work
Django==3.1.7
docutils==0.16
ecos==2.0.7.post1
entrypoints==0.3
fastcache==1.1.0
Flask==1.1.2
Flask-Compress==1.9.0
fsspec @ file:///home/conda/feedstock_root/build_artifacts/fsspec_1589989738418/work
future==0.18.2
gast==0.4.0
geographiclib==1.50
geopy==2.1.0
gevent==21.1.2
gmpy2==2.1.0b1
google-pasta==0.2.0
graphviz==0.16
greenlet==1.0.0
grpcio==1.36.1
h5py==2.10.0
HeapDict==1.0.1
idna==2.9
image==1.5.33
imageio==2.8.0
importlib-metadata @ file:///home/conda/feedstock_root/build_artifacts/importlib-metadata_1591451751445/work
interpret==0.2.4
interpret-core==0.2.4
ipykernel @ file:///home/conda/feedstock_root/build_artifacts/ipykernel_1590020200501/work/dist/ipykernel-5.3.0-py3-none-any.whl
ipympl==0.5.6
ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1590796899444/work
ipython-genutils==0.2.0
ipywidgets==7.5.1
itsdangerous==1.1.0
jedi==0.17.0
Jinja2==2.11.2
joblib @ file:///home/conda/feedstock_root/build_artifacts/joblib_1589812474002/work
json5 @ file:///home/conda/feedstock_root/build_artifacts/json5_1591810480056/work
jsonschema==3.2.0
jupyter-client==6.1.3
jupyter-core==4.6.3
jupyter-telemetry==0.0.5
jupyterhub==1.1.0
jupyterlab==2.1.3
jupyterlab-server @ file:///home/conda/feedstock_root/build_artifacts/jupyterlab_server_1590229434073/work
Keras==2.3.1
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.2
kiwisolver==1.2.0
lime==0.1.1.37
llvmlite==0.31.0
locket==0.2.0
Mako==1.1.0
Markdown==3.3.4
MarkupSafe==1.1.1
matplotlib==3.2.1
mistune==0.8.4
mock @ file:///home/conda/feedstock_root/build_artifacts/mock_1588618847833/work
mpmath==1.1.0
msgpack==1.0.0
nbconvert==5.6.1
nbformat==5.0.6
networkx==2.4
notebook @ file:///home/conda/feedstock_root/build_artifacts/notebook_1588887226267/work
numba==0.48.0
numexpr==2.7.1
numpy @ file:///home/conda/feedstock_root/build_artifacts/numpy_1591485215893/work
oauthlib==3.0.1
olefile==0.46
osqp==0.6.2.post0
packaging @ file:///home/conda/feedstock_root/build_artifacts/packaging_1589925210001/work
pamela==1.0.0
pandas==1.2.3
pandocfilters==1.4.2
parso==0.7.0
partd==1.1.0
patsy==0.5.1
PDPbox==0.2.0
pexpect==4.8.0
pickleshare==0.7.5
Pillow==7.1.2
plotly==4.14.3
ppscore==1.2.0
progressbar==2.5
prometheus-client @ file:///home/conda/feedstock_root/build_artifacts/prometheus_client_1590412252446/work
prompt-toolkit==3.0.5
protobuf==3.11.4
psutil==5.7.0
ptyprocess==0.6.0
pycosat==0.6.3
pycparser==2.20
pycurl==7.43.0.5
Pygments==2.6.1
PyJWT==1.7.1
pyOpenSSL==19.1.0
pyparsing==2.4.7
pyrsistent==0.16.0
PySocks==1.7.1
python-dateutil==2.8.1
python-editor==1.0.4
python-json-logger==0.1.11
pytz==2020.1
PyWavelets==1.1.1
PyYAML==5.3.1
pyzmq==19.0.1
qdldl==0.1.5.post0
qgrid==1.3.1
qpsolvers==1.5
quadprog==0.1.8
requests @ file:///home/conda/feedstock_root/build_artifacts/requests_1592425495151/work
retrying==1.3.3
rpy2==3.1.0
ruamel-yaml==0.15.80
ruamel.yaml.clib==0.2.0
SALib==1.3.12
scikit-image==0.16.2
scikit-learn==0.22.2.post1
scipy==1.4.1
scs==2.1.2
seaborn @ file:///home/conda/feedstock_root/build_artifacts/seaborn-base_1591878760859/work
Send2Trash==1.5.0
shap==0.34.0
simplegeneric==0.8.1
six @ file:///home/conda/feedstock_root/build_artifacts/six_1590081179328/work
skope-rules==1.0.1
sortedcontainers @ file:///home/conda/feedstock_root/build_artifacts/sortedcontainers_1591999956871/work
soupsieve @ file:///home/conda/feedstock_root/build_artifacts/soupsieve_1589778966114/work
SQLAlchemy @ file:///home/conda/feedstock_root/build_artifacts/sqlalchemy_1589421717839/work
sqlparse==0.4.1
statsmodels @ file:///home/conda/feedstock_root/build_artifacts/statsmodels_1591963256838/work
sympy==1.5.1
tables==3.6.1
tblib==1.6.0
tensorboard==1.14.0
tensorflow==1.14.0
tensorflow-estimator==1.14.0
termcolor==1.1.0
terminado==0.8.3
testpath==0.4.4
toml==0.10.2
toolz==0.10.0
torch==1.8.0
torchvision==0.9.0
tornado==6.0.4
tqdm @ file:///home/conda/feedstock_root/build_artifacts/tqdm_1591181521996/work
traitlets==4.3.3
treeinterpreter==0.2.3
tslearn==0.5.0.5
typing-extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1588470653596/work
tzlocal @ file:///home/conda/feedstock_root/build_artifacts/tzlocal_1588939190034/work
urllib3==1.25.9
vincent==0.4.4
wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1591600393557/work
webencodings==0.5.1
Werkzeug==1.0.1
widgetsnbextension==3.5.1
wrapt==1.12.1
xgboost==1.0.2
xlrd==1.2.0
xport==3.2.1
zict==2.0.0
zipp==3.1.0
zope.event==4.5.0
zope.interface==5.2.0
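(A hedged note, untested: aix360's CDC loader calls xport.Reader, and the traceback above runs through xport 3.2.1's deprecated compatibility path before failing; pinning xport to the 2.x series that aix360's own requirements resolve to elsewhere on this page, e.g. pip install "xport==2.0.2", may avoid the NotImplementedError.)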

More modular dependencies

Our environments grow 5x in size because all these dependencies that we don't use get installed, and it makes our CI builds take longer.
Offering something like extras_require={'tensorflow': ['tensorflow>=1.14'], ...} would help greatly.
The docutils package isn't even necessary.
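A sketch of the proposed split (assuming setuptools; the extras names and version pins here are illustrative, not aix360's actual metadata):

from setuptools import setup, find_packages

setup(
    name='aix360',
    packages=find_packages(),
    install_requires=['numpy', 'pandas', 'scikit-learn'],  # lean core only
    extras_require={
        'tensorflow': ['tensorflow>=1.14'],  # pip install aix360[tensorflow]
        'rbm': ['cvxpy'],
        'all': ['tensorflow>=1.14', 'cvxpy'],
    },
)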

CEM for Multi-label Classification

Inspecting the KerasClassifier class for CEM, I could see that it is made specifically for single label classification:
predicted_class = np.argmax(prob)

It may be important to extend this class to handle multi-label, multi-class classification, perhaps by allowing the programmer to select which of the N target classes to take into account when explaining with CEMExplainer.
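One possible reduction, sketched below (an illustration only, not aix360's API; the class name is hypothetical): expose a single chosen label of a multi-label model as a two-class problem, so the existing single-label argmax logic still applies.

import numpy as np

class SingleLabelView:
    """View label `target` of a multi-label model as two classes
    [not-target, target], so np.argmax over the columns is meaningful."""
    def __init__(self, model, target):
        self.model = model
        self.target = target

    def predict(self, X):
        p = self.model.predict(X)[:, self.target]  # sigmoid score of one label
        return np.stack([1.0 - p, p], axis=1)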

Please help me download the FICO HELOC dataset

I could NOT find the download URL even after I filled in the form a few times on this page from this tutorial. I guess the service is closed now. I also searched on Google and Kaggle and found nothing.

Could you kindly help me download the data or directly send a copy of data to me?

Error from running BRCG in beam_search.py

I'm trying to run BRCG on some self-generated binary data (note: the generated data is such that there exists a DNF rule that perfectly matches X with y). The first dataset is very small, and looks as follows:

X=
   0  1  2  3  4
0  0  1  1  0  0
1  1  0  1  0  0
2  1  1  1  0  0
3  1  1  0  0  1
4  0  1  0  1  0
5  0  1  1  1  0
6  1  0  0  0  1
7  1  1  1  1  1
8  1  0  1  0  1
9  1  0  1  1  0

y=
0  0
1  0
2  0
3  1
4  1
5  1
6  1
7  1
8  1
9  1

where X is a pandas dataframe and y is a pandas series. I then use the following:

    br = BooleanRuleCG(lambda0=1e-3, lambda1=1e-3)
    br.fit(Xdf, ydf)

For some datasets this works fine, but for other datasets it gives the following error:

Initial LP solved
Traceback (most recent call last):
File "dash.py", line 37, in
br.fit(Xdf, ydf)
File "/home/marleen/miniconda3/envs/aix360_env/lib/python3.7/site-packages/aix360/algorithms/rbm/boolean_rule_cg.py", line 120, in fit
K=self.K, UB=UB, D=self.D, B=self.B, eps=self.eps)
File "/home/marleen/miniconda3/envs/aix360_env/lib/python3.7/site-packages/aix360/algorithms/rbm/beam_search.py", line 284, in beam_search
colKeep = pd.Series(Xp.columns.get_level_values(0) != i[0], index=Xp.columns)
IndexError: invalid index to scalar variable.

Any idea what is going wrong?

aix360 installation GCP and on Windows

I am trying to install aix360 in a VM on Google Cloud and also in an Anaconda environment (on Windows), and both times I get the same error.
I am following the steps from the documentation but still face the same error. If someone has already installed this toolkit, can you please help me set up the environment for it?
Below is the error log:
ERROR: Command errored out with exit status 1:
command: /home/mansoor_working/anaconda3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'
/tmp/pip-install-zt5ypyrs/cvxpy/setup.py'"'"'; file='"'"'/tmp/pip-install-zt5ypyrs/cvxpy/setup.py'"'"';f=getatt
r(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(comp
ile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-uy3pz4nb --python-tag cp37
cwd: /tmp/pip-install-zt5ypyrs/cvxpy/
Complete output (369 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.7
creating build/lib.linux-x86_64-3.7/cvxpy
copying cvxpy/init.py -> build/lib.linux-x86_64-3.7/cvxpy
copying cvxpy/error.py -> build/lib.linux-x86_64-3.7/cvxpy
copying cvxpy/settings.py -> build/lib.linux-x86_64-3.7/cvxpy
creating build/lib.linux-x86_64-3.7/cvxpy/problems
copying cvxpy/problems/init.py -> build/lib.linux-x86_64-3.7/cvxpy/problems
copying cvxpy/problems/objective.py -> build/lib.linux-x86_64-3.7/cvxpy/problems
copying cvxpy/problems/xpress_problem.py -> build/lib.linux-x86_64-3.7/cvxpy/problems
copying cvxpy/problems/problem.py -> build/lib.linux-x86_64-3.7/cvxpy/problems
copying cvxpy/problems/iterative.py -> build/lib.linux-x86_64-3.7/cvxpy/problems
creating build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_objectives.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_expressions.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_curvature.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_quadratic.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_dgp.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_solvers.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/init.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_monotonicity.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_quad_form.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_benchmarks.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_super_scs.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/base_test.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_mip_vars.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_qp.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_matrices.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_problem.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_constraints.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_atoms.py -> build/lib.linux-x86_64-3.7/cvxpy/tests
copying cvxpy/tests/test_scs.py -> build/lib.linux-x86_64-3.7/cvxpy/tests

ExternalRiskEstimate seems to be hard-coded into the HELOC data processing, but I cannot find it.

(Screenshot attached: Screen Shot 2021-05-19 at 11:51:10 AM)

If I change the name:


ValueError Traceback (most recent call last)
in
2 from aix360.algorithms.rbm import FeatureBinarizer
3 fb = FeatureBinarizer(negations=True, returnOrd=True)
----> 4 dfTrain, dfTrainStd = fb.fit_transform(dfTrain)
5 dfTest, dfTestStd = fb.transform(dfTest)
6 dfTrain['MostRecentBillAmountRaw'].head()

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
697 if y is None:
698 # fit method of arity 1 (unsupervised transformation)
--> 699 return self.fit(X, **fit_params).transform(X)
700 else:
701 # fit method of arity 2 (supervised transformation)

~/PycharmProjects/AIX360/aix360/algorithms/rbm/features.py in fit(self, X)
111 self.ordinal = ordinal
112 # Fit StandardScaler to ordinal features
--> 113 self.scaler = StandardScaler().fit(data[ordinal])
114 return self
115

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/preprocessing/_data.py in fit(self, X, y, sample_weight)
728 # Reset internal state before fitting
729 self._reset()
--> 730 return self.partial_fit(X, y, sample_weight)
731
732 def partial_fit(self, X, y=None, sample_weight=None):

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/preprocessing/_data.py in partial_fit(self, X, y, sample_weight)
766 X = self._validate_data(X, accept_sparse=('csr', 'csc'),
767 estimator=self, dtype=FLOAT_DTYPES,
--> 768 force_all_finite='allow-nan', reset=first_call)
769 n_features = X.shape[1]
770

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
419 out = X
420 elif isinstance(y, str) and y == 'no_validation':
--> 421 X = check_array(X, **check_params)
422 out = X
423 else:

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
538
539 if all(isinstance(dtype, np.dtype) for dtype in dtypes_orig):
--> 540 dtype_orig = np.result_type(*dtypes_orig)
541
542 if dtype_numeric:

<array_function internals> in result_type(*args, **kwargs)

ValueError: at least one array or dtype is required

There is a lot of dependency on specific versions of packages. Can we at least have >= constraints ('package_name>=version')?

ERROR: xai 0.0.5 has requirement matplotlib==3.0.2, but you'll have matplotlib 3.1.0 which is incompatible.
ERROR: xai 0.0.5 has requirement numpy==1.15.4, but you'll have numpy 1.16.4 which is incompatible.
ERROR: xai 0.0.5 has requirement pandas==0.23.4, but you'll have pandas 0.24.2 which is incompatible.
ERROR: xai 0.0.5 has requirement scikit-learn==0.20.1, but you'll have scikit-learn 0.23.1 which is incompatible.

CEMExplainer support for tabular data with categorical features

I'm looking to use the contrastive explainer on tabular data, which the docs state is supported.

What is the recommended mechanism to deal with categorical features for this explainer?

I've one-hot encoded and then normalized like so:

c_transformer = Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore')),
                                ('functr', FunctionTransformer(lambda x: x.toarray(), accept_sparse=True)),
                                ('scalar', MinMaxScaler(feature_range=(-0.5, 0.5)))])

The resulting pertinent negatives and positives adjust all values of a category. As an example, here is the delta_pn (which I understand to be the difference needed to change the classification) for the sex feature, which is binary in this dataset.

sex_Female                                   0.500000
sex_Male                                    -0.500000

The change impacts both categories. It's unclear how to do the inverse transform in these cases when using one-hot encoding.
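A possible post-processing sketch (an assumption on my part, not a documented aix360 recipe): snap the perturbed one-hot block back to a valid category before calling the encoder's inverse_transform. Here onehot and scaler are the fitted steps from the pipeline above, and adv is the scaled pertinent instance (a hypothetical variable name):

import numpy as np

x_enc = scaler.inverse_transform(adv.reshape(1, -1))  # undo MinMax scaling
# For each categorical block, keep only the argmax as the active category.
start = 0
for cats in onehot.categories_:
    block = slice(start, start + len(cats))
    hard = np.zeros(len(cats))
    hard[np.argmax(x_enc[0, block])] = 1.0
    x_enc[0, block] = hard
    start += len(cats)
x_orig = onehot.inverse_transform(x_enc)  # back to category labels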

Error with finding pertinent positives and negatives using the CEMExplainer

I am using CEMExplainer to generate explanations for my dataset of the following shape: (91,3)

However, I face the following error:


IndexError Traceback (most recent call last)
in
10
11 (adv_pn, delta_pn, info_pn) = explainer.explain_instance(train_dataset, arg_mode, ae_model, arg_kappa, arg_b,
---> 12 arg_max_iter, arg_init_const, arg_beta, arg_gamma)

~/AIX360/aix360/algorithms/contrastive/CEM.py in explain_instance(self, input_X, arg_mode, AE_model, arg_kappa, arg_b, arg_max_iter, arg_init_const, arg_beta, arg_gamma)
75 target_label = orig_class
76
---> 77 target = np.array([np.eye(self._wbmodel._nb_classes)[target_label]])
78
79 # Hard coding batch_size=1

IndexError: index 80 is out of bounds for axis 0 with size 1


The code that I use to find the pertinent negative/positive:


mymodel = KerasClassifier(pred_model)

explainer = CEMExplainer(mymodel)

arg_mode = "PN" # Find pertinent negative

arg_max_iter = 1000 # Maximum number of iterations to search for the optimal PN for given parameter settings
arg_init_const = 10.0 # Initial coefficient value for main loss term that encourages class change
arg_b = 9 # No. of updates to the coefficient of the main loss term

arg_kappa = 10 # Minimum confidence gap between the PN's (changed) class probability and the original class's probability
arg_beta = 1e-1 # Controls sparsity of the solution (L1 loss)
arg_gamma = 100 # Controls how much to adhere to an (optionally trained) autoencoder

(adv_pn, delta_pn, info_pn) = explainer.explain_instance(train_dataset, arg_mode, ae_model, arg_kappa, arg_b,
arg_max_iter, arg_init_const, arg_beta, arg_gamma)


FeatureBinarizer: Skipping column 'X': data type cannot be handled

The solution to this problem is a simple one-liner. I will submit a pull request along with this issue. I'm just documenting the error so other people can find it.

FeatureBinarizer prints "Skipping column 'X': data type cannot be handled" for various integer and float subtypes that should be handled. The following code reproduces the problem.

import numpy as np
import pandas as pd
from aix360.algorithms.rbm import FeatureBinarizer

dtypes = np.dtype([
          ('int32', np.int32),
          ('int64', np.int64),
          ('float32', np.float32),
          ('float64', np.float64),
])

X = pd.DataFrame(np.array(np.arange(100)).astype(dtypes))
print(X.dtypes)

# Both fit and transform do not handle int64 and float32 though there should be no problem...
fb = FeatureBinarizer()
fb.fit(X);
fb.transform(X);

# Skipping column 'int64': data type cannot be handled
# Skipping column 'float32': data type cannot be handled
# Skipping column 'int64': data type cannot be handled
# Skipping column 'float32': data type cannot be handled

The problem is that the FeatureBinarizer class tests the type as follows:

for c in X.columns:
    if np.issubdtype(X[c].dtype, np.dtype(int).type) | np.issubdtype(X[c].dtype, np.dtype(float).type):
        pass
    else:
        print(("Skipping column '" + str(c) + "': data type cannot be handled"))

# Skipping column 'int64': data type cannot be handled
# Skipping column 'float32': data type cannot be handled

This can be resolved by using the generic NumPy types np.integer and np.floating in the test.

for c in X.columns:
    if np.issubdtype(X[c].dtype, np.integer) | np.issubdtype(X[c].dtype, np.floating):
        pass
    else:
        print(("Skipping column '" + str(c) + "': data type cannot be handled"))

LIME explain_instance documentation discrepancy

lime-ml.readthedocs says that explain_instance for tabular LIME expects a 1-D array as input, but when running the code with a 1-D array the following error message occurs:
ValueError: Expected 2D array, got 1D array instead:
[1, 2, 3, 4, 5]

Does the function expect 1-D or 2-D arrays?

Edit:
I should probably mention that I get the same error message when passing 2-D arrays (for example [[1,2,3,4,5]]). The problem probably lies elsewhere, but the error message is not very helpful.
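A minimal usage sketch (the data and model here are placeholders of my own, not from the report): LimeTabularExplainer.explain_instance does take a 1-D numpy array as data_row, and LIME then calls the supplied prediction function with a 2-D batch of perturbed samples, which is often where the "Expected 2D array" error actually originates:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

X = np.random.rand(100, 5)
y = (X.sum(axis=1) > 2.5).astype(int)
clf = RandomForestClassifier().fit(X, y)

explainer = LimeTabularExplainer(X, mode='classification')
exp = explainer.explain_instance(X[0], clf.predict_proba, num_features=5)  # 1-D row
print(exp.as_list())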

Protodash

Is the benefit of ProtoDash that the data does not have to be normalised? And also that you get weights? I am just trying to understand why one should use ProtoDash rather than a nearest-neighbour algorithm. Thanks in advance.

FICO HELOC dataset

Hi!
I also have problems downloading the FICO HELOC data set. I fill in the requested information and click the Send button, but nothing happens....

Could anyone please help me download the data set?

Kind regards,
Kjersti

Error occurring when running BRCG

I'm trying to run BRCG on some self-generated binary data (note: the generated data is such that there exists a DNF rule that perfectly matches X with y). The first dataset is very small, and looks as follows:

X=
   0  1  2  3  4
0  0  1  1  0  0
1  1  0  1  0  0
2  1  1  1  0  0
3  1  1  0  0  1
4  0  1  0  1  0
5  0  1  1  1  0
6  1  0  0  0  1
7  1  1  1  1  1
8  1  0  1  0  1
9  1  0  1  1  0

y=
   0
0  0
1  0
2  0
3  1
4  1
5  1
6  1
7  1
8  1
9  1

I then use the following:

    br = BooleanRuleCG(lambda0=1e-3, lambda1=1e-3)
    br.fit(Xdf, ydf)

The error I get is:

Initial LP solved
Traceback (most recent call last):
File "test.py", line 40, in
br.fit(Xdf, ydf)
File "/home/marleen/miniconda3/envs/aix360_env/lib/python3.7/site-packages/aix360/algorithms/rbm/boolean_rule_cg.py", line 113, in fit
r[P] = -constraints[0].dual_value
ValueError: shape mismatch: value array of shape (7,) could not be broadcast to indexing result of shape (7,1)

Do you know how to resolve this? Thanks!
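(A hedged guess, not verified: the (7,) vs (7,1) mismatch looks like the labels being carried around as a one-column DataFrame; passing them as a flat 1-D Series before fitting may avoid the broadcast error.)

    br = BooleanRuleCG(lambda0=1e-3, lambda1=1e-3)
    br.fit(Xdf, ydf.squeeze())  # ydf.squeeze() turns an (n,1) DataFrame into a Series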

ValueError when using ProtoDash to get the prototypes of a dataset

Hi!

I'm encountering an error with a simple use case of ProtoDash to get prototypes of a given dataset.
Here's an example that triggers the error:

import pandas as pd
from sklearn import datasets
from aix360.algorithms.protodash import PDASH

# Load Iris
X, y = datasets.load_iris(return_X_y=True)
df = pd.DataFrame(X, columns=range(X.shape[1]))
df['y'] = y

tmp = df[df['y'] == 0].drop('y', axis=1).values
X_1 = PDASH.HeuristicSetSelection(X=tmp, Y=tmp, m=10, kernelType='gaussian', sigma=2)

# This generates an error:
# ---------------------------------------------------------------------------
# ValueError                                Traceback (most recent call last)
# <ipython-input-48-e631ba33f62a> in <module>
#      1 tmp = df[df['y'] == 0].drop('y', axis=1).values
# ----> 2 X_1 = PDASH.HeuristicSetSelection(X=tmp, Y=tmp, m=10, kernelType='gaussian', sigma=2)
#
# c:\users\pc\aix\aix360\aix360\algorithms\protodash\PDASH_utils.py in HeuristicSetSelection(X, Y, m, kernelType, sigma)
#    267             currK = K2
#    268             if maxGradient <= 0:
#--> 269                 newCurrOptw = np.vstack((currOptw[:], np.array([0])))
#    270                 newCurrSetValue = currSetValue
#    271             else:
#
#~\AppData\Local\Continuum\anaconda3\envs\aix360\lib\site-packages\numpy\core\shape_base.py in vstack(tup)
#    281     """
#    282     _warn_for_nonsequence(tup)
#--> 283     return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
#    284 
#    285 
#
#ValueError: all the input array dimensions except for the concatenation axis must match exactly

Interestingly, the error does not pop up for m < 10.

Is this a bug or am I using it incorrectly?

Thanks,

FeatureBinarizer: existence of missing values in binary features throws an error

FeatureBinarizer throws a ValueError: Length of passed values is 1, index implies 2 when passed a binary feature with missing values.

This happens in FeatureBinarizer L70, when trying to create a Series.

In these cases, a pandas Series treats nunique and unique differently: nunique ignores NaNs by default, while unique doesn't.

Example:

x = pd.Series([0, np.nan, 1])
print(x.nunique(), len(x.unique()))

returns

2 3
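A consistent count can be obtained by dropping NaNs first (a possible fix sketch, assuming the intent is to count non-missing values):

x = pd.Series([0, np.nan, 1])
print(x.nunique(), len(x.dropna().unique()))  # 2 2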

Versions:
AIX360: 0.2.0
Pandas: 0.25.3

Installation on MacOS 10.14.6 fails on xgboost

I followed these steps to install aix360 on macOS 10.14.6:

  1. conda create --name aix360 python=3.6
  2. conda activate aix360
  3. git clone https://github.com/IBM/AIX360
  4. cd AIX360/
  5. pip install -e .

The command failed with the following error:

(aix360) ~/AIX360 [master] $ pip install -e .
Obtaining file:///Users/fchiossi/AIX360
Collecting joblib>=0.11
  Using cached joblib-0.14.1-py2.py3-none-any.whl (294 kB)
Collecting scikit-learn>=0.21.2
  Using cached scikit_learn-0.22.1-cp36-cp36m-macosx_10_6_intel.whl (11.1 MB)
Collecting torch
  Using cached torch-1.4.0-cp36-none-macosx_10_9_x86_64.whl (81.1 MB)
Collecting torchvision
  Using cached torchvision-0.5.0-cp36-cp36m-macosx_10_9_x86_64.whl (438 kB)
Collecting cvxpy
  Using cached cvxpy-1.0.28-cp36-cp36m-macosx_10_9_x86_64.whl (745 kB)
Collecting cvxopt
  Using cached cvxopt-1.2.4-cp36-cp36m-macosx_10_9_x86_64.whl (3.1 MB)
Collecting Image
  Using cached image-1.5.28.tar.gz (15 kB)
Collecting keras
  Using cached Keras-2.3.1-py2.py3-none-any.whl (377 kB)
Collecting matplotlib
  Using cached matplotlib-3.1.3-cp36-cp36m-macosx_10_9_x86_64.whl (13.2 MB)
Collecting numpy
  Using cached numpy-1.18.1-cp36-cp36m-macosx_10_9_x86_64.whl (15.2 MB)
Collecting pandas
  Using cached pandas-1.0.1-cp36-cp36m-macosx_10_9_x86_64.whl (9.9 MB)
Collecting scipy>=0.17
  Using cached scipy-1.4.1-cp36-cp36m-macosx_10_6_intel.whl (28.5 MB)
Collecting tensorflow==1.14
  Using cached tensorflow-1.14.0-cp36-cp36m-macosx_10_11_x86_64.whl (105.8 MB)
Collecting xport
  Using cached xport-2.0.2-py2.py3-none-any.whl (14 kB)
Collecting scikit-image
  Using cached scikit_image-0.16.2-cp36-cp36m-macosx_10_6_intel.whl (30.4 MB)
Collecting requests
  Using cached requests-2.23.0-py2.py3-none-any.whl (58 kB)
Collecting lime
  Using cached lime-0.1.1.37.tar.gz (275 kB)
Collecting shap
  Using cached shap-0.34.0.tar.gz (264 kB)
Collecting xgboost
  Using cached xgboost-1.0.1.tar.gz (820 kB)
    ERROR: Command errored out with exit status 1:
     command: /Users/fchiossi/opt/anaconda3/envs/aix360/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/j1/x8bvblx563n247csfz521xkr0000gn/T/pip-install-lrjswknu/xgboost/setup.py'"'"'; __file__='"'"'/private/var/folders/j1/x8bvblx563n247csfz521xkr0000gn/T/pip-install-lrjswknu/xgboost/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /private/var/folders/j1/x8bvblx563n247csfz521xkr0000gn/T/pip-install-lrjswknu/xgboost/pip-egg-info
     cwd: /private/var/folders/j1/x8bvblx563n247csfz521xkr0000gn/T/pip-install-lrjswknu/xgboost/
    Complete output (27 lines):
    ++ pwd
    + oldpath=/private/var/folders/j1/x8bvblx563n247csfz521xkr0000gn/T/pip-install-lrjswknu/xgboost
    + cd ./xgboost/
    + mkdir -p build
    + cd build
    + cmake ..
    ./xgboost/build-python.sh: line 21: cmake: command not found
    + echo -----------------------------
    -----------------------------
    + echo 'Building multi-thread xgboost failed'
    Building multi-thread xgboost failed
    + echo 'Start to build single-thread xgboost'
    Start to build single-thread xgboost
    + cmake .. -DUSE_OPENMP=0
    ./xgboost/build-python.sh: line 27: cmake: command not found
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/j1/x8bvblx563n247csfz521xkr0000gn/T/pip-install-lrjswknu/xgboost/setup.py", line 42, in <module>
        LIB_PATH = libpath['find_lib_path']()
      File "/private/var/folders/j1/x8bvblx563n247csfz521xkr0000gn/T/pip-install-lrjswknu/xgboost/xgboost/libpath.py", line 50, in find_lib_path
        'List of candidates:\n' + ('\n'.join(dll_path)))
    XGBoostLibraryNotFound: Cannot find XGBoost Library in the candidate path, did you install compilers and run build.sh in root path?
    List of candidates:
    /private/var/folders/j1/x8bvblx563n247csfz521xkr0000gn/T/pip-install-lrjswknu/xgboost/xgboost/libxgboost.dylib
    /private/var/folders/j1/x8bvblx563n247csfz521xkr0000gn/T/pip-install-lrjswknu/xgboost/xgboost/../../lib/libxgboost.dylib
    /private/var/folders/j1/x8bvblx563n247csfz521xkr0000gn/T/pip-install-lrjswknu/xgboost/xgboost/./lib/libxgboost.dylib
    /Users/fchiossi/opt/anaconda3/envs/aix360/xgboost/libxgboost.dylib
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

How can I fix it?

Thanks in advance
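(A hedged suggestion, untested: the build log above shows cmake: command not found while pip tries to compile xgboost 1.0.1 from source, so installing cmake first, e.g. via Homebrew, and re-running pip install -e . may be enough; the failure looks like a missing build tool rather than an aix360 bug.)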

Suggestion to improve GLRM sklearn compatibility

Hello, I'm trying to use the GLRM LogisticRuleRegression, and it seems to be compatible with my own code for training/evaluation with sklearn models. However, it fails when I use functions like GridSearchCV for hyperparameter tuning.

TypeError: Cannot clone object '<aix360.algorithms.rbm.logistic_regression.LogisticRuleRegression object at 0x10f731310>' (type <class 'aix360.algorithms.rbm.logistic_regression.LogisticRuleRegression'>): it does not seem to be a scikit-learn estimator as it does not implement a 'get_params' methods.

If the class inherits from BaseEstimator and ClassifierMixin from sklearn.base instead of just object, then it will inherit get_params() and this will resolve the issue. I've tested this on my local machine. So the change should be:

class LogisticRuleRegression(object):
to
class LogisticRuleRegression(BaseEstimator, ClassifierMixin):

This can also be applied to LinearRuleRegression (replacing ClassifierMixin with RegressorMixin) and to any other similar classes, and it may resolve other sklearn compatibility issues I haven't come across yet (e.g. Pipeline may be affected as well).
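A minimal sketch of the pattern with a toy estimator (not aix360 code), showing why the mixins make clone and GridSearchCV work: BaseEstimator derives get_params from the __init__ signature, so hyperparameters must be stored under their own names:

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone

class ToyRuleClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, lambda0=1e-3, lambda1=1e-3):
        self.lambda0 = lambda0  # stored verbatim so get_params() finds them
        self.lambda1 = lambda1

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        return self

    def predict(self, X):
        return np.full(len(X), self.classes_[0])  # trivial placeholder

print(ToyRuleClassifier().get_params())  # {'lambda0': 0.001, 'lambda1': 0.001}
print(clone(ToyRuleClassifier(lambda0=0.01)))  # cloning now works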

ProtoDash for images

Is it possible to use ProtoDash for images?

If we use it in an embedding space, are there any recommendations on which embedding to use?

Thank you in advance.

unable to import HELOCDataset

Hello, I am exploring AIX360 for the first time and wanted to start off by executing the demo notebook for the credit approval use case.

I have downloaded the heloc_dataset.csv and placed it in the respective folder.

While executing the import statement:
from aix360.datasets.heloc_dataset import HELOCDataset, nan_preprocessing

The warnings below are displayed and then the cell stops executing.

Using TensorFlow backend.
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\framework\dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\framework\dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\framework\dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\framework\dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\framework\dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\framework\dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
C:\Users\dalavayi.navya\Anaconda3\envs\tf\lib\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])

I have tried restarting the terminal and uninstalling and reinstalling aix360, but the same issue persists.

Please help me resolve this.

Error occurs when using negated binary columns with FeatureBinarizer

Pull request for the solution: #111

I've just noticed that FeatureBinarizer, when including the negated columns as well, does not work on a dataset with a binary categorical feature. That's probably another pandas version issue, where pandas 1.0.0 and newer behave significantly differently than before. (I got the error using pandas 0.25.3.)

When calling fb.fit_transform(<dataset_with_binary_category>, negations=True),
the error message was: TypeError: unsupported operand type(s) for -: 'int' and 'Categorical'
at line 142, in function transform(): A[(str(c), 'not', '')] = 1 - A[(str(c), '', '')]
where A[(str(c), '', '')] = data[c].map(maps[c]) and c is a specific column.

At that line the subtraction does not work, because the Series A[(str(c), '', '')] is categorical.

Solution:
For a solution, just convert the type of A[(str(c), '', '')] to integer, i.e. A[(str(c), '', '')] = data[c].map(maps[c]).astype(int). Although it could be solved in many ways, I've seen the astype(int) pattern elsewhere in the codebase, so I hope the solution is satisfactory.
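A compact before/after view of the proposed one-liner (line numbering as in the issue; this mirrors the accompanying pull request):

# before (transform(), line 142's input)
A[(str(c), '', '')] = data[c].map(maps[c])
# after
A[(str(c), '', '')] = data[c].map(maps[c]).astype(int)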

beam_search_K1 with pandas > 1.1.0

With pandas versions > 1.1.0, line 148 (and lines 145 and 150) raises an error: ValueError: cannot reindex from a duplicate axis.
Locally, I just added '.values'. Example for line 148:
colKeep[i[0]] = ((Xp[i[0]].columns.get_level_values(0) == '<=') & (thresh > i[2])).values

CEM_MAFImageExplainer - broken Example Notebook

First, I want to thank you very much for providing this toolkit! I am eager to use your implementation for my own research!

Unfortunately, as I was working through the example "CEM-MAF-CelebA.ipynb" notebook for contrastive explanations, I was stopped dead while obtaining the pertinent negative explanation. (Code chunk 12)

Error message:

InvalidArgumentError: Conv2DCustomBackpropInputOp only supports NHWC.
	 [[{{node gradients/G_paper_1_1/cond/ToRGB_lod8/Conv2D_grad/Conv2DBackpropInput}}]]

During handling of the above exception, another exception occurred:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-13-b1a3ab914e94> in <module>
      3                     arg_max_iterations, arg_initial_const, arg_gamma, None,
      4                     arg_attr_reg, arg_attr_penalty_reg,
----> 5                     arg_latent_square_loss_reg)
      6 
      7 print(info_pn)

c:\workspaces\aix360\aix360\algorithms\contrastive\CEM_MAF.py in explain_instance(self, sess, input_img, input_latent, arg_mode, arg_kappa, arg_binary_search_steps, arg_max_iterations, arg_initial_const, arg_gamma, arg_beta, arg_attr_reg, arg_attr_penalty_reg, arg_latent_square_loss_reg)
     95                             attr_penalty_reg=arg_attr_penalty_reg, latent_square_loss_reg=arg_latent_square_loss_reg)
     96 
---> 97             adv_img = attack_pn.attack(input_img, target_label, input_latent)
     98             adv_prob, adv_class, adv_prob_str = self._wbmodel.predict_long(adv_img)
     99             attr_mod = self.check_attributes_celebA(self._attributes, input_img, adv_img)

c:\workspaces\aix360\aix360\algorithms\contrastive\CEM_MAF_aen_PN.py in attack(self, imgs, labs, latent)
    268                 # perform the attack
    269                 
--> 270                 self.sess.run([self.train])
    271                 temp_adv_latent = self.sess.run(self.adv_latent)
    272                 self.sess.run(self.adv_updater, feed_dict={self.assign_adv_latent: temp_adv_latent})

...

InvalidArgumentError: Conv2DCustomBackpropInputOp only supports NHWC.
	 [[node gradients/G_paper_1_1/cond/ToRGB_lod8/Conv2D_grad/Conv2DBackpropInput (defined at c:\workspaces\aix360\aix360\algorithms\contrastive\CEM_MAF_aen_PN.py:197) ]]

Errors may have originated from an input operation.
Input Source operations connected to node gradients/G_paper_1_1/cond/ToRGB_lod8/Conv2D_grad/Conv2DBackpropInput:
 G_paper_1_1/cond/ToRGB_lod8/mul (defined at <string>:27)  

My setup:

I tried this example twice: once on a Windows machine (CPU only) and once on a Linux machine (CPU only). Both systems error out at the same step. The installation of aix360 worked both times, following the setup instructions in the git documentation.

My hypothesis:

I am thinking that the pickled CelebA model (karras2018iclr-celebahq-1024x1024.pkl) is the cause of this error.
Maybe the problem lies with the requirements: AIX360 needs tensorflow=1.14.0, whereas progressive_growing_of_gans requires tensorflow-gpu>=1.6.0. Notably, both of my machines are CPU-only, and the failing op's message says it only supports NHWC; if the GAN pickle builds its graph in the GPU-oriented NCHW layout, that alone would explain the failure on CPU.

I would really appreciate it if you could help me out on this, as I want to know whether it's a model problem, which I could fix with my own models in the future, or something more complicated than that.

Thank you very much in advance!

Model Types

I have gone through the HELOC.ipynb file. I can't find documentation anywhere about what types of models can be used other than neural networks.

What model types are supported with aix360?

Thanks!

LightGBM

Hi,

I identified ways to use LightGBM with many of your tools. Let me know if it is something you would like to incorporate into your package. Maybe we can host a tutorial in the AIX360 repo that shows people how to use your tools with LightGBM, as I had to develop a few workaround strategies.

https://github.com/firmai/ml-fairness-framework

Best,
Derek

Error while running copied code from "Credit Approval Tutorial" BRCG part

Hi there,

I've copied the code (with no modifications at all) from the BRCG part of the "Credit Approval Tutorial" and ran into errors. I'm quite sure that the dataset was loaded appropriately, as I have also trained a scikit-learn Decision Tree classifier on it in the same notebook with no problems.

Can someone help me with this issue? Am I missing something or is it an internal problem?

Thanks in advance!

Here is the code and the output.
It was run on Google Colab, with pandas 1.1.2 and the latest aix360 release, which is 0.2.0.

Copied code

import warnings
warnings.filterwarnings('ignore')

# Load FICO HELOC data with special values converted to np.nan
from aix360.datasets.heloc_dataset import HELOCDataset, nan_preprocessing
data = HELOCDataset(custom_preprocessing=nan_preprocessing).data()
# Separate target variable
y = data.pop('RiskPerformance')

# Split data into training and test sets using fixed random seed
from sklearn.model_selection import train_test_split
dfTrain, dfTest, yTrain, yTest = train_test_split(data, y, random_state=0, stratify=y)
dfTrain.head().transpose()

# Binarize data and also return standardized ordinal features
from aix360.algorithms.rbm import FeatureBinarizer
fb = FeatureBinarizer(negations=True, returnOrd=True)
dfTrain, dfTrainStd = fb.fit_transform(dfTrain)
dfTest, dfTestStd = fb.transform(dfTest)
dfTrain['ExternalRiskEstimate'].head()

# Instantiate BRCG with small complexity penalty and large beam search width
from aix360.algorithms.rbm import BooleanRuleCG
br = BooleanRuleCG(lambda0=1e-3, lambda1=1e-3, CNF=True)

# Train, print, and evaluate model
br.fit(dfTrain, yTrain)
from sklearn.metrics import accuracy_score
print('Training accuracy:', accuracy_score(yTrain, br.predict(dfTrain)))
print('Test accuracy:', accuracy_score(yTest, br.predict(dfTest)))
print('Predict Y=0 if ANY of the following rules are satisfied, otherwise Y=1:')
print(br.explain()['rules'])

Output

Learning CNF rule with complexity parameters lambda0=0.001, lambda1=0.001
Initial LP solved
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in __setitem__(self, key, value)
   1001         try:
-> 1002             self._set_with_engine(key, value)
   1003         except (KeyError, ValueError):

/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in _set_with_engine(self, key, value)
   1032         # fails with AttributeError for IntervalIndex
-> 1033         loc = self.index._engine.get_loc(key)
   1034         validate_numeric_casting(self.dtype, value)

pandas/_libs/index.pyx in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc()

KeyError: 'ExternalRiskEstimate'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-98-8d81fbd6c0e1> in <module>()
     26 
     27 # Train, print, and evaluate model
---> 28 br.fit(dfTrain, yTrain)
     29 from sklearn.metrics import accuracy_score
     30 print('Training accuracy:', accuracy_score(yTrain, br.predict(dfTrain)))

/usr/local/lib/python3.6/dist-packages/aix360/algorithms/rbm/boolean_rule_cg.py in fit(self, X, y)
    118         UB = min(UB.min(), 0)
    119         v, zNew, Anew = beam_search(r, X, self.lambda0, self.lambda1,
--> 120                                     K=self.K, UB=UB, D=self.D, B=self.B, eps=self.eps)
    121 
    122         while (v < -self.eps).any() and (self.it < self.iterMax):

/usr/local/lib/python3.6/dist-packages/aix360/algorithms/rbm/beam_search.py in beam_search(r, X, lambda0, lambda1, K, UB, D, B, wLB, eps, stopEarly)
    285             if i[1] == '<=':
    286                 thresh = Xp[i[0]].columns.get_level_values(1).to_series().replace('NaN', np.nan)
--> 287                 colKeep[i[0]] = (Xp[i[0]].columns.get_level_values(0) == '>') & (thresh < i[2])
    288             elif i[1] == '>':
    289                 thresh = Xp[i[0]].columns.get_level_values(1).to_series().replace('NaN', np.nan)

/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in __setitem__(self, key, value)
   1008             else:
   1009                 # GH#12862 adding an new key to the Series
-> 1010                 self.loc[key] = value
   1011 
   1012         except TypeError as e:

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    668 
    669         iloc = self if self.name == "iloc" else self.obj.iloc
--> 670         iloc._setitem_with_indexer(indexer, value)
    671 
    672     def _validate_key(self, key, axis: int):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
   1790                 # setting for extensionarrays that store dicts. Need to decide
   1791                 # if it's worth supporting that.
-> 1792                 value = self._align_series(indexer, Series(value))
   1793 
   1794             elif isinstance(value, ABCDataFrame):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _align_series(self, indexer, ser, multiindex_indexer)
   1909             # series, so need to broadcast (see GH5206)
   1910             if sum_aligners == self.ndim and all(is_sequence(_) for _ in indexer):
-> 1911                 ser = ser.reindex(obj.axes[0][indexer[0]], copy=True)._values
   1912 
   1913                 # single indexer

/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in reindex(self, index, **kwargs)
   4397     )
   4398     def reindex(self, index=None, **kwargs):
-> 4399         return super().reindex(index=index, **kwargs)
   4400 
   4401     def drop(

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in reindex(self, *args, **kwargs)
   4457         # perform the reindex on the axes
   4458         return self._reindex_axes(
-> 4459             axes, level, limit, tolerance, method, fill_value, copy
   4460         ).__finalize__(self, method="reindex")
   4461 

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
   4480                 fill_value=fill_value,
   4481                 copy=copy,
-> 4482                 allow_dups=False,
   4483             )
   4484 

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
   4525                 fill_value=fill_value,
   4526                 allow_dups=allow_dups,
-> 4527                 copy=copy,
   4528             )
   4529             # If we've made a copy once, no need to make another one

/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy, consolidate)
   1274         # some axes don't allow reindexing with dups
   1275         if not allow_dups:
-> 1276             self.axes[axis]._can_reindex(indexer)
   1277 
   1278         if axis >= self.ndim:

/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in _can_reindex(self, indexer)
   3283         # trying to reindex on an axis with duplicates
   3284         if not self.is_unique and len(indexer):
-> 3285             raise ValueError("cannot reindex from a duplicate axis")
   3286 
   3287     def reindex(self, target, method=None, level=None, limit=None, tolerance=None):

ValueError: cannot reindex from a duplicate axis
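
Note: this traceback matches the "beam_search_K1 with pandas > 1.1.0" issue above; the failing line is the colKeep assignment in beam_search.py. Until a fix is released, either pinning pandas below 1.1, e.g.

pip install "pandas<1.1"

or appending .values to the boolean expression, as described in that issue, presumably works around it.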

ProtoDash: local variable 'newinnerProduct' referenced before assignment

I am using the HELOC Dataset and trying to explain a single test instance using prototypes from my training subset using below code:

explainer = ProtodashExplainer()
(W, S, _) = explainer.explain(dfTrain.to_numpy(), dfTest.iloc[0:1,:].to_numpy(), m=2)

However, I am getting below error:
[screenshot of the error traceback omitted]

Is this intentional? Please help.

Thank you

Regarding memory exhaustion with BRCG

I am trying BRCG on my dataset. I have nearly 40 features, most of them categorical, and about 18k labeled rows. After feature binarization the dataset expands to roughly 35,000 features, so the overall data matrix is about (18k, 35k). When I run BRCG on this, it throws a memory error, as shown below.
[screenshot of the memory error omitted]
The issue might be due to memory allocation in the cvxpy module. Any help in resolving this will be highly appreciated. I am using 32 GB of RAM.
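
Not a fix, but a hedged mitigation sketch: the LP that BRCG builds scales with the number of binarized columns, so shrinking the binarization may keep it within memory. The parameter names here (numThresh, negations) are assumed from FeatureBinarizer's signature:

from aix360.algorithms.rbm import FeatureBinarizer

# Fewer candidate thresholds per numerical column and no negated copies
# should yield far fewer than 35k binary features.
fb = FeatureBinarizer(numThresh=4, negations=False)
dfTrainB = fb.fit_transform(dfTrain)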

ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.

I'm working through the Credit Approval Tutorial, 3. Loan Officer: Prototypical explanations for HELOC use case.

When I run the line of code:

(Data, x_train, x_test, y_train_b, y_test_b) = heloc.split()

I get the error:

ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.

Any idea on how I can get around this error?
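
For context, this message comes from scikit-learn's stratified splitting, which heloc.split() presumably uses internally: any label that occurs only once triggers it. A minimal illustration (synthetic data, not the tutorial's):

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(6).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 0, 1])    # class 1 has a single member
train_test_split(X, y, stratify=y)  # raises the ValueError quoted above

Given that, it may be worth checking that the HELOC CSV downloaded completely and that the target column contains only the two expected labels.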

torch version issue

Running the HELOC notebook produces this error:
module 'torch.jit' has no attribute '_script_if_tracing'
which is resolved by using a different version of pytorch
!pip install torch==1.6.0+cpu torchvision==0.7.0+cpu -f https://download.pytorch.org/whl/torch_stable.html

setup.py may need an update to pin compatible versions of torch and torchvision.
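
A hedged sketch of what that pin might look like (the other fields are placeholders, not the actual contents of aix360's setup.py):

from setuptools import setup, find_packages

# Hypothetical sketch only; the real setup.py has many more fields.
setup(
    name='aix360',
    packages=find_packages(),
    install_requires=[
        'torch>=1.6.0',
        'torchvision>=0.7.0',
        # ...other dependencies unchanged...
    ],
)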

setup.py gives the error 'no commands supplied' on Windows

(aix360) C:\Users\ESISARP\AIX360>python setup.py

C:\Users\ESISARP\AppData\Local\Continuum\anaconda3\envs\aix360\lib\distutils\dist.py:261: UserWarning: Unknown distribution option: 'authos'
warnings.warn(msg)
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help

error: no commands supplied
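
For what it's worth, distutils exits this way whenever setup.py is invoked without a subcommand; the intended command is presumably either

python setup.py install

or, as in the repo's setup instructions,

pip install -e .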

Protodash Array Size

The following does not work:

explainer.explain(X=<array of shape (500000, 64)>, Y=<array of shape (50000, 64)>, m=60000)
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 3545 and the array at index 1 has size 3544

If we purely want to select from X, I don't see why there should be a problem when m > Y.shape[0].

[screenshot of the explain() description omitted]

Also, in the HELOC example you take S from Y; how does that make sense with the description above?

[screenshots of the HELOC example omitted]

In case I had the ordering wrong, I also switched the arguments around, to no effect:

explainer.explain(X=<array of shape (50000, 64)>, Y=<array of shape (500000, 64)>, m=60000)
    273                 [newCurrOptw, value] = runOptimiser(currK, curru, currOptw, maxGradient)
--> 274                 newCurrSetValue = -value
    275 
    276         currOptw = newCurrOptw

TypeError: bad operand type for unary -: 'NoneType'

Here is a reproducible example to play with; it takes about 5 minutes to run. (Reset the kernel after the installs.)

https://colab.research.google.com/drive/1FdafzzZku0RgJEk7Zf_rJ1lLuciRF0YD

Thanks.

Integration of LIME and SHAP in AIX360

The goal of integrating LIME and SHAP is to give users the option of using any explainer of their choice for a given use case, and to let them compare different explainability algorithms simply by installing aix360, instead of having to install multiple libraries.

The integration task would involve the following steps:
(1) Update setup to include lime and shap installs.
(2) Create tests that invoke lime and shap explainers and link these to travis.yml
(3) Write appropriate wrappers around these explainers so users have an option to invoke them in the same manner as other explainers available in aix360.
(4) Create notebooks to illustrate their usage.
(5) Update docs.

References:
LIME: https://github.com/marcotcr/lime
SHAP: https://github.com/slundberg/shap
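
Regarding step (3), a minimal sketch of such a wrapper; the class name and pass-through design are hypothetical, chosen only to mirror the explain_instance convention used elsewhere in aix360:

from lime.lime_tabular import LimeTabularExplainer

class LimeTabularWrapper:
    """Hypothetical aix360-style adapter around lime's tabular explainer."""

    def __init__(self, *args, **kwargs):
        # lime is configured at construction time with the training data
        self._explainer = LimeTabularExplainer(*args, **kwargs)

    def explain_instance(self, x, predict_fn, **kwargs):
        # delegate to lime while keeping an aix360-style call signature
        return self._explainer.explain_instance(x, predict_fn, **kwargs)

With a wrapper of this shape, a user could swap lime in for any local explainer in an aix360 pipeline without changing the surrounding code.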

CEMExplainer for pertinent positive returning no changes

I'm trying to explain a CIFAR-10 classification model with CEMExplainer. By messing around with the parameters I got a result for the pertinent negative, but I've been having trouble getting any results for pertinent positives. What am I doing wrong?

import numpy as np
import tensorflow as tf
from aix360.algorithms.contrastive import CEMExplainer, KerasClassifier

# model: trained Keras CIFAR-10 classifier; ae: trained autoencoder (both defined elsewhere)
mymodel = KerasClassifier(model)
explainer = CEMExplainer(mymodel)

img = x_test[0]
arg_max_iter = 1000    # maximum number of optimization iterations
arg_init_const = 10.0  # initial coefficient on the attack loss term
arg_b = 9              # binary-search steps for that coefficient
arg_kappa = 0.05       # required confidence margin for the explanation class
arg_beta = 1e-1        # weight of the L1 sparsity term
arg_gamma = 100        # weight of the autoencoder reconstruction term

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  arg_mode = "PP"  # Find pertinent positive
  (adv_pp, delta_pp, info_pp) = explainer.explain_instance(np.expand_dims(img, axis=0), arg_mode, ae, arg_kappa, arg_b, arg_max_iter, arg_init_const, arg_beta, arg_gamma)
