jjbrophy47 / tree_influence
Influence Estimation for Gradient-Boosted Decision Trees
License: Apache License 2.0
When trying to run an explainer (I've tried LeafInfluence and BoostIn) on an XGBoost model, I get an error unless the model is configured with reg_alpha=0, tree_method='hist', and scale_pos_weight=1. The errors all arise from assert statements in parser_xgb. Are these necessary? It would be good to be able to test models with different hyperparameters.
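For anyone else hitting this: the failures come from guards of the following form. This is a hypothetical sketch of the kind of checks described above, not the actual parser_xgb source; only the three parameter names come from the report.

```python
# Hypothetical sketch of the hyperparameter guards described above.
# tree_influence's parser_xgb asserts on these XGBoost settings; the real
# source may differ -- this only illustrates why non-default values fail.
def check_xgb_params(params: dict) -> None:
    assert params.get('reg_alpha', 0) == 0, 'L1 regularization not supported'
    assert params.get('tree_method') == 'hist', "tree_method must be 'hist'"
    assert params.get('scale_pos_weight', 1) == 1, 'class weighting not supported'

# A model trained with reg_alpha=0.5 would trip the first assert:
try:
    check_xgb_params({'reg_alpha': 0.5, 'tree_method': 'hist'})
except AssertionError as e:
    print(e)  # L1 regularization not supported
```

So the quickest workaround today is to retrain the model with exactly those three settings before fitting an explainer.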
I tried running this: from tree_influence.explainers import BoostIn
I am getting an error and I don't know what could be causing it. Please help.
The only difference I see is that _tree32 is a .c file while the other modules are .py files, but I have already installed Cython.
Hello, and thank you for this package.
I came across a problem while trying to use an XGBoost model that was trained on a DataFrame.
This is my code:
from xgboost import XGBClassifier
from tree_influence.explainers import BoostIn

X_train, X_test, y_train, y_test = load_csv('X_train'), load_csv('X_test'), load_csv('y_train'), load_csv('y_test')
model = XGBClassifier(tree_method='hist')
X_train_val, y_train_val = X_train.values, y_train.values.squeeze()
X_test_val, y_test_val = X_test.values, y_test.values.squeeze()
model.fit(X_train, y_train)  # fit on the DataFrame directly
# fit influence estimator
explainer = BoostIn().fit(model, X_train, y_train)
which produces this exception:
Traceback (most recent call last):
File "/home/jupyter/owlytics-data-science/influence/influence.py", line 35, in <module>
explainer = BoostIn().fit(model, X_train, y_train)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/boostin.py", line 44, in fit
super().fit(model, X, y)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/base.py", line 31, in fit
self.model_ = parse_model(model, X, y)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/__init__.py", line 33, in parse_model
trees, params = parse_xgb_ensemble(model)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/parser_xgb.py", line 17, in parse_xgb_ensemble
trees = np.array([_parse_xgb_tree(tree_str) for tree_str in string_data], dtype=np.dtype(object))
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/parser_xgb.py", line 17, in <listcomp>
trees = np.array([_parse_xgb_tree(tree_str) for tree_str in string_data], dtype=np.dtype(object))
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/parser_xgb.py", line 88, in _parse_xgb_tree
node_dict = _parse_line(line)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/parser_xgb.py", line 190, in _parse_line
res['feature'], res['threshold'] = _parse_decision_node_line(line)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/parser_xgb.py", line 201, in _parse_decision_node_line
feature_ndx = int(feature_str[1:])
ValueError: invalid literal for int() with base 10: 'ecent_beta_blockers_change'
However, training with X_train_val, y_train_val (which are numpy arrays) works perfectly well.
It would be great if you could support training with a DataFrame as well.
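For what it's worth, the traceback suggests why the DataFrame case fails: when a model is trained on a DataFrame, XGBoost's text dump references features by column name (e.g. 'recent_beta_blockers_change'), while the parser assumes positional names like 'f12' and strips the leading character before converting to int. A minimal sketch of that conversion (a hypothetical reproduction of the int() call in the traceback, not the actual parser code):

```python
# Sketch of the failing step: the parser assumes feature strings look like
# 'f<idx>' (as produced by numpy-trained models) and does int(feature_str[1:]).
# DataFrame-trained models dump real column names instead, so int() fails.
def parse_feature_index(feature_str: str) -> int:
    return int(feature_str[1:])  # drop the leading 'f', keep the index

print(parse_feature_index('f12'))  # 12 -- works for numpy-trained models

try:
    parse_feature_index('recent_beta_blockers_change')
except ValueError as e:
    print(e)  # same ValueError as in the traceback above
```

Converting the inputs with X_train.values / y_train.values.squeeze() before model.fit, as done above, sidesteps this because the dump then uses f0, f1, ... names.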
Thanks again!
Hello,
I was using your implementation of BoostIn to fit my own data, but I came across an error, so I thought it might be due to some inherent inconsistency in my features. However, when fitting it to the iris data provided by the sklearn package (as cited in the example document in your repository), I came across this very same error:
    180 # compute leaf derivative w.r.t. each train example in leaf_docs
    181 numerator = g[leaf_docs, class_idx] + leaf_vals[leaf_idx] * h[leaf_docs, class_idx]  # (no. docs,)
--> 182 denominator = np.sum(h[leaf_docs, class_idx]) + l2_leaf_reg
    183 leaf_dvs[leaf_docs, boost_idx, class_idx] = numerator / denominator * lr  # (no. docs,)
    185 # update approximation

TypeError: unsupported operand type(s) for +: 'float' and 'NoneType'
Could you please give me some guidance as to what might be going wrong? For context, I am using an XGBoost model here, and I must set scale_pos_weight=1 in order to avoid an assertion error. It would be nice if this could be relaxed as well. Thank you!
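The TypeError above says the l2_leaf_reg term in the denominator is None rather than a float, which would happen if the L2 regularization parameter (reg_lambda) was not recovered from the model's parameters. This is only a guess at the cause; the sketch below shows the kind of defaulting that would avoid the crash, using XGBoost's documented default of reg_lambda=1.0 (the function name is hypothetical, not tree_influence API):

```python
# Sketch of coalescing a possibly-missing regularization parameter.
# The traceback shows np.sum(...) + l2_leaf_reg failing because l2_leaf_reg
# is None; falling back to XGBoost's default reg_lambda (1.0) avoids that.
import numpy as np

def safe_l2_leaf_reg(params: dict) -> float:
    value = params.get('reg_lambda')
    return 1.0 if value is None else float(value)

h = np.array([0.2, 0.3, 0.5])  # toy per-example hessians
denominator = np.sum(h) + safe_l2_leaf_reg({'reg_lambda': None})
print(denominator)  # a float, instead of raising TypeError
```

If the maintainers confirm this is the cause, the fix presumably belongs in the parser rather than in user code.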