Comments (6)

mglowacki100 commented on May 27, 2024

I'm not sure whether AutoGluon creates those names, but if you're looking for a quick fix, you need to one-hot encode the categorical variables yourself:

import pandas as pd

def dummification(df, col):
  # Replace one categorical column with its one-hot (dummy) columns.
  dfz = pd.get_dummies(df[col], prefix=col)
  df = df.drop(columns=[col])
  return pd.concat([df, dfz], axis=1)

...
train_data = train_data.drop(columns='education-num') # 'education-num' is just 'education' with ordinal encoding
categorical = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race', 'sex', 'native-country']

for c in categorical:
  train_data = dummification(train_data, c)

With this, the rules are reported in terms of the one-hot columns:

predictor.print_interpretable_rules(model_name='RuleFit_3')
                                                                                                                                                                          rule  coef
                                                                                                                                                                  capital-gain  0.00
                                                                                                                      capital-gain <= 6571.5 and education_ Prof-school <= 0.5 -0.47
                                                                                                                 capital-gain <= 7073.5 and occupation_ Exec-managerial <= 0.5 -0.44
                                                                                               fnlwgt <= 260314.5 and capital-gain <= 7073.5 and education_ Prof-school <= 0.5 -0.19
                                                                                  capital-gain <= 7268.5 and education_ Bachelors <= 0.5 and occupation_ Prof-specialty <= 0.5 -0.36
                                               capital-gain <= 6571.5 and education_ Bachelors <= 0.5 and occupation_ Prof-specialty <= 0.5 and workclass_ Self-emp-inc <= 0.5 -0.85
                                                                                                                                        age <= 42.5 and capital-gain <= 7073.5 -0.14
                                                                                                                                     age <= 38.5 and education_ Masters <= 0.5 -0.38
                                                                       capital-gain <= 7073.5 and marital-status_ Married-civ-spouse <= 0.5 and workclass_ Self-emp-inc <= 0.5 -0.37
                                                                                             age > 27.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 38.5  0.86
                                                                                 age <= 62.5 and age > 27.5 and marital-status_ Married-civ-spouse > 0.5 and race_ White > 0.5  0.47
                                                               age > 29.5 and education_ HS-grad <= 0.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 33.5  0.03
age > 33.5 and education_ 11th <= 0.5 and capital-gain <= 4782.0 and marital-status_ Married-civ-spouse > 0.5 and occupation_ Farming-fishing <= 0.5 and hours-per-week > 37.5  0.07
                                                                                             age > 42.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 28.5  0.25
              age <= 52.0 and age > 27.5 and fnlwgt > 134350.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 32.5 and occupation_ Machine-op-inspct <= 0.5  0.64
     fnlwgt > 104201.0 and capital-gain <= 7268.5 and marital-status_ Married-civ-spouse > 0.5 and hours-per-week > 35.5 and workclass_ ? <= 0.5 and workclass_ Private <= 0.5  0.17

Here, for a one-hot encoded categorical column, > 0.5 means True (the row has that category) and <= 0.5 means False (it does not).
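To make the > 0.5 / <= 0.5 reading concrete, here is a minimal sketch on a made-up toy frame (the data is an assumption for illustration, not taken from the thread). It uses `pd.get_dummies` with `columns=` and `dtype=int`, which produces 0/1 columns, so a split at 0.5 is the same as a category membership test.

```python
import pandas as pd

# Toy frame; the column names and values are made up for illustration.
df = pd.DataFrame({
    "age": [25, 40, 52],
    "education": [" Bachelors", " Masters", " Bachelors"],
})

# dtype=int gives 0/1 columns, so a tree split at 0.5 separates the two cases.
encoded = pd.get_dummies(df, columns=["education"], dtype=int)

# "education_ Bachelors > 0.5" selects the rows where education is " Bachelors".
is_bachelors = encoded["education_ Bachelors"] > 0.5

print(encoded.columns.tolist())
print(is_bachelors.tolist())
```

Passing `columns=` encodes the listed columns in one call and should give the same column names as the per-column loop above, since both use the default `_` separator.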

from imodels.

mglowacki100 commented on May 27, 2024

Hi @csinva, I see you're an AutoGluon contributor, so here are two additional things regarding the interpretable models:

  • GreedyTree and `HierarchicalShrinkageTree` display feature_1, feature_2, ... instead of column names (there is a warning: X has feature names but ... was fitted without feature names). I'm not 100% sure, but it seems to me that feature_1 is the first column in the training dataframe, feature_2 the second, and so on.
  • BoostedRules doesn't display rules

csinva commented on May 27, 2024

Thanks @mglowacki100! I agree, I think one-hot encoding is the best way to go for now.

That feature engineering is performed by autogluon not imodels. There isn't currently support for inverse transforming back to the original features, but we will try and add it soon!
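Until such support exists, one manual workaround is to post-process the rule strings yourself. The sketch below is only an illustration under stated assumptions: `readable_rule` is a hypothetical helper (not part of autogluon or imodels), it assumes rules are joined with " and " as in the output above, and it assumes one-hot columns are named `<column>_<value>` as produced by `pd.get_dummies`.

```python
import re

def readable_rule(rule, categorical):
    """Rewrite one-hot thresholds (<= 0.5 / > 0.5) as != / == tests.

    Hypothetical helper: `categorical` lists the original column names
    that were one-hot encoded with the default "_" separator.
    """
    terms = []
    for term in rule.split(" and "):
        m = re.fullmatch(r"(.+?) (<=|>) 0\.5", term)
        prefix = None
        if m:
            # Find the original column this dummy column was derived from.
            prefix = next(
                (c for c in categorical if m.group(1).startswith(c + "_")), None
            )
        if prefix is None:
            terms.append(term)  # numeric condition: leave unchanged
            continue
        # Note: strip() drops the leading space present in this dataset's values.
        value = m.group(1)[len(prefix) + 1:].strip()
        op = "==" if m.group(2) == ">" else "!="
        terms.append(f"{prefix} {op} '{value}'")
    return " and ".join(terms)

print(readable_rule(
    "capital-gain <= 6571.5 and education_ Prof-school <= 0.5",
    ["education"],
))
```

This only changes how the rule is displayed; the fitted model still operates on the one-hot columns.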

vinay-k12 commented on May 27, 2024

I thought of that, but I was concerned that it would increase training time considerably. Anyway, I'll run it on a limited set of features.
