Comments (1)
I tested further, and it also happens with method="genetic". It is a bit harder to catch because random_seed = ... only works for method="random" (which, by the way, is also undocumented, so I consider that a bug too). The genetic method still has some randomness, though, so to find occurrences of this bug I run generate_counterfactuals repeatedly until the bug occurs once:
# Sklearn imports
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.ensemble import RandomForestClassifier
# DiCE imports
import dice_ml
from dice_ml.utils import helpers # helper functions
dataset = helpers.load_adult_income_dataset()
dataset = dataset.sample(1000, random_state=1)
y_train = dataset["income"]
x_train = dataset.drop('income', axis=1)
# Step 1: dice_ml.Data
d = dice_ml.Data(dataframe=dataset, continuous_features=['age', 'hours_per_week'], outcome_name='income')
numerical = ["age", "hours_per_week"]
categorical = x_train.columns.difference(numerical)
# We create the preprocessing pipelines for both numeric and categorical data.
numeric_transformer = Pipeline(steps=[("scaler", StandardScaler())])
categorical_transformer = Pipeline(steps=[("onehot", OneHotEncoder(handle_unknown="ignore"))])
transformations = ColumnTransformer(
    transformers=[
        ("num", numeric_transformer, numerical),
        ("cat", categorical_transformer, categorical),
    ]
)
# Append classifier to preprocessing pipeline.
# Now we have a full prediction pipeline.
clf = Pipeline(
    steps=[("preprocessor", transformations), ("classifier", RandomForestClassifier(random_state=1))]
)
model = clf.fit(x_train, y_train)
# Using sklearn backend
m = dice_ml.Model(model=model, backend="sklearn")
# Using method=genetic for generating CFs
exp = dice_ml.Dice(d, m, method="genetic")
for i in range(1000):
    e1 = exp.generate_counterfactuals(x_train[4:5], total_CFs=10, desired_class="opposite")
    print(i)
    if e1.cf_examples_list[0].final_cfs_df["income"].nunique() > 1:
        e1.visualize_as_dataframe()
        break
If you run this script, it will eventually produce a set of counterfactuals in which at least one has the wrong class.
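Note that the nunique() > 1 check in the loop only fires when the returned counterfactuals have mixed classes; it would miss a run in which every counterfactual has the wrong class. A stricter check compares each label against the original prediction. The following is only a minimal sketch of that idea: the helper name wrong_class_positions and the example labels are made up for illustration and are not part of DiCE.

```python
from typing import Hashable, Iterable, List


def wrong_class_positions(cf_labels: Iterable[Hashable], original_label: Hashable) -> List[int]:
    """Return positions of counterfactuals whose label equals the original one.

    Assumes a binary task with desired_class="opposite", where a valid
    counterfactual must carry a label different from the query's prediction.
    """
    return [i for i, label in enumerate(cf_labels) if label == original_label]


# Illustrative labels only (not output from DiCE): the query was predicted 0,
# so every counterfactual should be labelled 1.
example_labels = [1, 1, 0, 1]
bad = wrong_class_positions(example_labels, original_label=0)
print(bad)
```

In the script above, this could presumably be called with e1.cf_examples_list[0].final_cfs_df["income"].tolist() as cf_labels and model.predict(x_train[4:5])[0] as original_label; a non-empty result would then indicate the bug.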