Comments (6)
I'm not sure if autogluon creates those names, but if you're looking for a quick fix, you can one-hot encode the categorical variables yourself:
import pandas as pd

def dummification(df, col):
    dfz = pd.get_dummies(df[col], prefix=col)
    df = df.drop(columns=[col])
    return pd.concat([df, dfz], axis=1)
...
train_data = train_data.drop(columns='education-num')  # education-num is just education, ordinally encoded
categorical = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race', 'sex', 'native-country']
for c in categorical:
    train_data = dummification(train_data, c)
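To sanity-check the helper before running it on the full dataset, here is a minimal sketch on a toy frame (the toy columns are made up, not from the adult dataset):

```python
import pandas as pd

def dummification(df, col):
    dfz = pd.get_dummies(df[col], prefix=col)
    df = df.drop(columns=[col])
    return pd.concat([df, dfz], axis=1)

# Toy frame standing in for the training data
toy = pd.DataFrame({"age": [25, 40], "sex": ["Male", "Female"]})
encoded = dummification(toy, "sex")
print(list(encoded.columns))  # ['age', 'sex_Female', 'sex_Male']
```

Each category value becomes its own 0/1 column named `prefix_value`, which is exactly the naming that shows up in the rules below.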
With this preprocessing, I get:
predictor.print_interpretable_rules(model_name='RuleFit_3')
rule coef
capital-gain 0.00
capital-gain <= 6571.5 and education_ Prof-school <= 0.5 -0.47
capital-gain <= 7073.5 and occupation_ Exec-managerial <= 0.5 -0.44
fnlwgt <= 260314.5 and capital-gain <= 7073.5 and education_ Prof-school <= 0.5 -0.19
capital-gain <= 7268.5 and education_ Bachelors <= 0.5 and occupation_ Prof-specialty <= 0.5 -0.36
capital-gain <= 6571.5 and education_ Bachelors <= 0.5 and occupation_ Prof-specialty <= 0.5 and workclass_ Self-emp-inc <= 0.5 -0.85
age <= 42.5 and capital-gain <= 7073.5 -0.14
age <= 38.5 and education_ Masters <= 0.5 -0.38
capital-gain <= 7073.5 and marital-status_ Married-civ-spouse <= 0.5 and workclass_ Self-emp-inc <= 0.5 -0.37
age > 27.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 38.5 0.86
age <= 62.5 and age > 27.5 and marital-status_ Married-civ-spouse > 0.5 and race_ White > 0.5 0.47
age > 29.5 and education_ HS-grad <= 0.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 33.5 0.03
age > 33.5 and education_ 11th <= 0.5 and capital-gain <= 4782.0 and marital-status_ Married-civ-spouse > 0.5 and occupation_ Farming-fishing <= 0.5 and hours-per-week > 37.5 0.07
age > 42.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 28.5 0.25
age <= 52.0 and age > 27.5 and fnlwgt > 134350.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 32.5 and occupation_ Machine-op-inspct <= 0.5 0.64
fnlwgt > 104201.0 and capital-gain <= 7268.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 35.5 and workclass_ ? <= 0.5 and workclass_ Private <= 0.5 0.17
where, for categorical features, > 0.5 means True and <= 0.5 means False.
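As a small illustration of why 0.5 works as the cutoff (the row below is hypothetical, not taken from the dataset): one-hot columns only ever hold 0 or 1, so a split at 0.5 cleanly encodes category membership.

```python
# Hypothetical row: a Bachelors-educated person with capital-gain 5000
row = {"capital-gain": 5000.0, "education_Bachelors": 1}

# The rule "capital-gain <= 7268.5 and education_Bachelors <= 0.5"
# only fires when education is NOT Bachelors.
fires = row["capital-gain"] <= 7268.5 and row["education_Bachelors"] <= 0.5
print(fires)  # False: the Bachelors dummy is 1, so "<= 0.5" fails
```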
from imodels.
Hi @csinva, I see you're an autogluon contributor, so two additional things regarding interpretability:
- `GreedyTree` and `HierarchicalShrinkageTree` display feature_1, feature_2, ... (there is a warning: X has feature names, but ... was fitted without feature names). I'm not 100% sure, but it seems to me that feature_1 is the first column in the training dataframe, feature_2 the second, and so on.
- `BoostedRules` doesn't display rules.
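If feature_i really does correspond to the i-th training column (1-indexed, which is an assumption), the placeholder names can be substituted back by hand. A minimal sketch with made-up column names:

```python
# Assumed training column order (hypothetical)
columns = ["age", "workclass", "education"]

def restore_names(rule, columns):
    # Replace higher indices first so "feature_1" never
    # clobbers the prefix of "feature_10".
    for i in range(len(columns), 0, -1):
        rule = rule.replace(f"feature_{i}", columns[i - 1])
    return rule

restored = restore_names("feature_1 <= 42.5 and feature_3 > 0.5", columns)
print(restored)  # age <= 42.5 and education > 0.5
```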
Thanks @mglowacki100! I agree, I think one-hot encoding is the best way to go for now. That feature engineering is performed by autogluon, not imodels. There isn't currently support for inverse-transforming back to the original features, but we will try to add it soon!
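Until that lands, the inverse transform can be sketched by hand. This is a minimal sketch assuming each row has exactly one dummy set per prefix; `undummify` is a made-up helper, not an imodels or autogluon API:

```python
import pandas as pd

def undummify(df, prefix):
    # Collect the one-hot columns produced by get_dummies(prefix=...)
    dummy_cols = [c for c in df.columns if c.startswith(prefix + "_")]
    # idxmax picks, per row, the column holding the 1; then strip "prefix_"
    restored = df[dummy_cols].idxmax(axis=1).str[len(prefix) + 1:]
    out = df.drop(columns=dummy_cols)
    out[prefix] = restored
    return out

df = pd.DataFrame({"age": [25, 40],
                   "education_Bachelors": [1, 0],
                   "education_Masters": [0, 1]})
print(undummify(df, "education"))
```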
I thought of that, but figured it would increase training time hugely. Anyway, I'll run it on a limited set of features.