Comments (6)
I'm not sure if autogluon creates those names, but if you're looking for a quick fix, you can one-hot encode the categorical variables yourself:
import pandas as pd

def dummification(df, col):
    dfz = pd.get_dummies(df[col], prefix=col)
    df = df.drop(columns=[col])
    return pd.concat([df, dfz], axis=1)
...
train_data = train_data.drop(columns='education-num')  # education-num is just education, ordinally encoded
categorical = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race', 'sex', 'native-country']
for c in categorical:
    train_data = dummification(train_data, c)
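To sanity-check the helper before running it on the full dataset, here is a minimal sketch on a toy frame (the toy columns are made up, not from the adult dataset):

```python
import pandas as pd

def dummification(df, col):
    dfz = pd.get_dummies(df[col], prefix=col)
    df = df.drop(columns=[col])
    return pd.concat([df, dfz], axis=1)

# Toy frame standing in for the training data
toy = pd.DataFrame({"age": [25, 40], "sex": ["Male", "Female"]})
encoded = dummification(toy, "sex")
print(list(encoded.columns))  # ['age', 'sex_Female', 'sex_Male']
```

Each category value becomes its own 0/1 column named `prefix_value`, which is exactly the naming that shows up in the rules below.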
With this preprocessing, I get:
predictor.print_interpretable_rules(model_name='RuleFit_3')
rule coef
capital-gain 0.00
capital-gain <= 6571.5 and education_ Prof-school <= 0.5 -0.47
capital-gain <= 7073.5 and occupation_ Exec-managerial <= 0.5 -0.44
fnlwgt <= 260314.5 and capital-gain <= 7073.5 and education_ Prof-school <= 0.5 -0.19
capital-gain <= 7268.5 and education_ Bachelors <= 0.5 and occupation_ Prof-specialty <= 0.5 -0.36
capital-gain <= 6571.5 and education_ Bachelors <= 0.5 and occupation_ Prof-specialty <= 0.5 and workclass_ Self-emp-inc <= 0.5 -0.85
age <= 42.5 and capital-gain <= 7073.5 -0.14
age <= 38.5 and education_ Masters <= 0.5 -0.38
capital-gain <= 7073.5 and marital-status_ Married-civ-spouse <= 0.5 and workclass_ Self-emp-inc <= 0.5 -0.37
age > 27.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 38.5 0.86
age <= 62.5 and age > 27.5 and marital-status_ Married-civ-spouse > 0.5 and race_ White > 0.5 0.47
age > 29.5 and education_ HS-grad <= 0.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 33.5 0.03
age > 33.5 and education_ 11th <= 0.5 and capital-gain <= 4782.0 and marital-status_ Married-civ-spouse > 0.5 and occupation_ Farming-fishing <= 0.5 and hours-per-week > 37.5 0.07
age > 42.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 28.5 0.25
age <= 52.0 and age > 27.5 and fnlwgt > 134350.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 32.5 and occupation_ Machine-op-inspct <= 0.5 0.64
fnlwgt > 104201.0 and capital-gain <= 7268.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 35.5 and workclass_ ? <= 0.5 and workclass_ Private <= 0.5 0.17
where, for categorical features, > 0.5 means True and <= 0.5 means False.
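As a small illustration of why 0.5 works as the cutoff (the row below is hypothetical, not taken from the dataset): one-hot columns only ever hold 0 or 1, so a split at 0.5 cleanly encodes category membership.

```python
# Hypothetical row: a Bachelors-educated person with capital-gain 5000
row = {"capital-gain": 5000.0, "education_Bachelors": 1}

# The rule "capital-gain <= 7268.5 and education_Bachelors <= 0.5"
# only fires when education is NOT Bachelors.
fires = row["capital-gain"] <= 7268.5 and row["education_Bachelors"] <= 0.5
print(fires)  # False: the Bachelors dummy is 1, so "<= 0.5" fails
```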
from imodels.
Hi @csinva, I see you're an autogluon contributor, so two additional things regarding interpretability:
- `GreedyTree` and `HierarchicalShrinkageTree` display feature_1, feature_2, ... (there is a warning: X has feature names, but ... was fitted without feature names). I'm not 100% sure, but it seems to me that feature_1 is the first column in the training dataframe, feature_2 the second, and so on.
- `BoostedRules` doesn't display rules.
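If feature_i really does correspond to the i-th training column (1-indexed, which is an assumption), the placeholder names can be substituted back by hand. A minimal sketch with made-up column names:

```python
# Assumed training column order (hypothetical)
columns = ["age", "workclass", "education"]

def restore_names(rule, columns):
    # Replace higher indices first so "feature_1" never
    # clobbers the prefix of "feature_10".
    for i in range(len(columns), 0, -1):
        rule = rule.replace(f"feature_{i}", columns[i - 1])
    return rule

restored = restore_names("feature_1 <= 42.5 and feature_3 > 0.5", columns)
print(restored)  # age <= 42.5 and education > 0.5
```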
Thanks @mglowacki100! I agree, I think one-hot encoding is the best way to go for now. That feature engineering is performed by autogluon, not imodels. There isn't currently support for inverse-transforming back to the original features, but we will try to add it soon!
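Until that lands, the inverse transform can be sketched by hand. This is a minimal sketch assuming each row has exactly one dummy set per prefix; `undummify` is a made-up helper, not an imodels or autogluon API:

```python
import pandas as pd

def undummify(df, prefix):
    # Collect the one-hot columns produced by get_dummies(prefix=...)
    dummy_cols = [c for c in df.columns if c.startswith(prefix + "_")]
    # idxmax picks, per row, the column holding the 1; then strip "prefix_"
    restored = df[dummy_cols].idxmax(axis=1).str[len(prefix) + 1:]
    out = df.drop(columns=dummy_cols)
    out[prefix] = restored
    return out

df = pd.DataFrame({"age": [25, 40],
                   "education_Bachelors": [1, 0],
                   "education_Masters": [0, 1]})
print(undummify(df, "education"))
```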
I thought of that, but figured it would increase training time hugely. Anyway, I'll run it on a limited set of features.