tabular_trees.decompose_prediction
- tabular_trees.decompose_prediction(tabular_trees, row)
Decompose a prediction from a tree-based model using the Saabas method [1].
This method attributes the change in prediction at each step from a node to its child to the feature that was split on. These changes are then summed over all splits in a tree and over all trees in the model.
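As a rough illustration of the attribution rule described above, the sketch below applies it to a single hand-written tree. The nested-dict layout, the `TOY_TREE` values and the `saabas_contributions` helper are hypothetical and are not part of tabular_trees. The sketch also assumes that each node stores its average prediction and that rows with a value below the threshold go left.

```python
from collections import defaultdict

# Hypothetical toy tree (not the tabular_trees data structure): every node
# stores its average prediction ("value"); internal nodes also store the
# split feature, the threshold and their two children.
TOY_TREE = {
    "value": 10.0,
    "feature": "age",
    "threshold": 30.0,
    "left": {
        "value": 6.0,
        "feature": "bmi",
        "threshold": 25.0,
        "left": {"value": 4.0},
        "right": {"value": 9.0},
    },
    "right": {"value": 15.0},
}


def saabas_contributions(tree, row):
    """Return the root value and the per-feature contributions for one row."""
    contributions = defaultdict(float)
    node = tree
    # Walk from the root to a leaf, following the splits for this row.
    while "feature" in node:
        feature = node["feature"]
        # Assumed convention: values below the threshold go to the left child.
        child = node["left"] if row[feature] < node["threshold"] else node["right"]
        # Attribute the change in node value to the feature that was split on.
        contributions[feature] += child["value"] - node["value"]
        node = child
    return tree["value"], dict(contributions)


row = {"age": 25.0, "bmi": 27.0}
bias, contributions = saabas_contributions(TOY_TREE, row)

# The root value plus all contributions recovers the leaf prediction.
prediction = bias + sum(contributions.values())
print(contributions)  # {'age': -4.0, 'bmi': 3.0}
print(prediction)     # 9.0
```

For a model with several trees, the same walk would be repeated for each tree and the per-feature contributions added together.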
- Parameters:
tabular_trees (TabularTrees) – Tree-based model to explain the prediction for.
row (pd.DataFrame) – Single row of data for which to explain the prediction from the tabular_trees object.
- Returns:
results – Prediction decomposed into change attributed to each feature.
- Return type:
PredictionDecomposition
Notes
[1] Saabas, Ando (2014) ‘Interpreting random forests’, Diving into data blog, 19 October. Available at http://blog.datadive.net/interpreting-random-forests/ (Accessed 26 February 2023).
Examples
>>> import xgboost as xgb
>>> import pandas as pd
>>> from sklearn.datasets import load_diabetes
>>> from tabular_trees import export_tree_data
>>> from tabular_trees import decompose_prediction
>>> # get data in DMatrix
>>> diabetes = load_diabetes()
>>> data = xgb.DMatrix(
...     diabetes["data"],
...     label=diabetes["target"],
...     feature_names=diabetes["feature_names"]
... )
>>> # build model
>>> params = {"max_depth": 3, "verbosity": 0}
>>> model = xgb.train(params, dtrain=data, num_boost_round=10)
>>> # export to TabularTrees
>>> xgboost_tabular_trees = export_tree_data(model)
>>> tabular_trees = xgboost_tabular_trees.to_tabular_trees()
>>> # get data to score
>>> scoring_data = pd.DataFrame(diabetes["data"], columns=diabetes["feature_names"])
>>> row_to_score = scoring_data.iloc[[0]]
>>> # decompose prediction
>>> results = decompose_prediction(tabular_trees, row=row_to_score)
>>> type(results)
<class 'tabular_trees.explain.prediction_decomposition.PredictionDecomposition'>