tabular_trees.ParsedXGBoostTabularTrees

class tabular_trees.ParsedXGBoostTabularTrees(data)

Bases: BaseModelTabularTrees

Dataclass for XGBoost models that have been parsed from a model dump.

The preferred way to create ParsedXGBoostTabularTrees objects is with the from_booster method.

Methods

__init__(data)

from_booster(booster)

Create ParsedXGBoostTabularTrees from an xgb.Booster object.

to_dataframe()

Return the tree data as a pandas DataFrame.

to_xgboost_tabular_trees()

Return the tree structures as an XGBoostTabularTrees object.

Attributes

data

Tree data.

tree

Tree index.

depth

Node depth in tree.

nodeid

Node index within tree.

split

Split feature.

split_condition

Split threshold.

yes

Node index for left child.

no

Node index for right child.

missing

Node index of the child that receives rows with a null value for the split feature.

leaf

Leaf node predictions.

gain

Gain for a split.

cover

Sum of the second-order derivatives (Hessians) of the loss function over the training rows that reach the node.

cover

Sum of the second-order derivatives (Hessians) of the loss function over the training rows that reach the node.
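
As a rough illustration of what cover measures, the sketch below sums per-row second derivatives of the loss for the rows that reach one node. It assumes XGBoost's usual definition of cover as a sum of Hessians taken with respect to the current prediction. The function names and the margin values are hypothetical and are not part of tabular_trees.

```python
import math


def sigmoid(margin: float) -> float:
    """Map a raw margin to a probability."""
    return 1.0 / (1.0 + math.exp(-margin))


def cover_squared_error(n_rows: int) -> float:
    """Cover under squared error: each row contributes a Hessian of 1."""
    return float(n_rows)


def cover_logistic(margins: list) -> float:
    """Cover under logistic loss: each row contributes p * (1 - p)."""
    return sum(sigmoid(m) * (1.0 - sigmoid(m)) for m in margins)


# Hypothetical current margins for the four rows that reach a node.
node_margins = [0.0, 0.0, 0.0, 0.0]

print(cover_squared_error(len(node_margins)))  # 4.0
print(cover_logistic(node_margins))  # 1.0
```

Under squared error the value is simply the number of rows in the node, which is why cover is often read as a row count for regression models.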

data

Tree data.

depth

Node depth in tree.

Root nodes have depth 0.
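
To make the depth convention concrete, here is a minimal sketch that derives depths from child links with a breadth-first walk. The node table is hand-written for illustration and is not output from the library; it also assumes the root has nodeid 0, as in XGBoost's model dumps.

```python
from collections import deque

# Hypothetical single tree: nodeid -> (yes, no); None marks a leaf.
children = {
    0: (1, 2),
    1: (3, 4),
    2: (None, None),
    3: (None, None),
    4: (None, None),
}


def node_depths(children, root=0):
    """Return {nodeid: depth}, with the root at depth 0."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        nodeid = queue.popleft()
        for child in children[nodeid]:
            if child is not None:
                # A child sits one level below its parent.
                depths[child] = depths[nodeid] + 1
                queue.append(child)
    return depths


depths = node_depths(children)
print(depths)  # {0: 0, 1: 1, 2: 1, 3: 2, 4: 2}
```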

classmethod from_booster(booster)

Create ParsedXGBoostTabularTrees from an xgb.Booster object.

Parameters:

booster (xgb.Booster) – XGBoost model to pull tree data from.

Returns:

trees – Model trees in tabular format.

Return type:

ParsedXGBoostTabularTrees

Examples

>>> import xgboost as xgb
>>> from sklearn.datasets import load_diabetes
>>> from tabular_trees.xgboost.dump_parser import ParsedXGBoostTabularTrees
>>> # get data in DMatrix
>>> diabetes = load_diabetes()
>>> data = xgb.DMatrix(diabetes["data"], label=diabetes["target"])
>>> # build model
>>> params = {"max_depth": 3, "verbosity": 0}
>>> model = xgb.train(params, dtrain=data, num_boost_round=10)
>>> # export to ParsedXGBoostTabularTrees
>>> parsed_xgb_tabular_trees = ParsedXGBoostTabularTrees.from_booster(model)
>>> type(parsed_xgb_tabular_trees)
<class 'tabular_trees.xgboost.dump_parser.ParsedXGBoostTabularTrees'>

gain

Gain for a split.

leaf

Leaf node predictions.

Null for internal nodes.

missing

Node index of the child that receives rows with a null value for the split feature.

no

Node index for right child.

Null for leaf nodes.

nodeid

Node index within tree.

split

Split feature.

Null for leaf nodes.

split_condition

Split threshold.

Null for leaf nodes.

to_dataframe()

Return the tree data as a pandas DataFrame.

Returns:

trees – Model trees in DataFrame form.

Return type:

pd.DataFrame

to_xgboost_tabular_trees()

Return the tree structures as an XGBoostTabularTrees object.

Returns:

trees – Model trees in XGBoostTabularTrees format.

Return type:

XGBoostTabularTrees

tree

Tree index.

yes

Node index for left child.

Null for leaf nodes.
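
The split, split_condition, yes, no, missing and leaf attributes can be read together as a node table. The sketch below routes one row through a single hand-written tree. It assumes the convention used in XGBoost's model dumps, where a row goes to the yes child when its feature value is strictly less than the threshold. The node values, feature names and the predict_one_tree helper are hypothetical and are not part of tabular_trees.

```python
import math

# Hypothetical node table for one tree, keyed by nodeid.
# Internal nodes have leaf=None; leaf nodes have split=None.
nodes = {
    0: {"split": "f0", "split_condition": 0.5, "yes": 1, "no": 2, "missing": 1, "leaf": None},
    1: {"split": None, "split_condition": None, "yes": None, "no": None, "missing": None, "leaf": -0.2},
    2: {"split": "f1", "split_condition": 10.0, "yes": 3, "no": 4, "missing": 4, "leaf": None},
    3: {"split": None, "split_condition": None, "yes": None, "no": None, "missing": None, "leaf": 0.1},
    4: {"split": None, "split_condition": None, "yes": None, "no": None, "missing": None, "leaf": 0.7},
}


def predict_one_tree(nodes, row):
    """Follow yes/no/missing links from the root until a leaf is reached."""
    nodeid = 0  # assumes the root has nodeid 0
    while nodes[nodeid]["leaf"] is None:
        node = nodes[nodeid]
        value = row.get(node["split"])
        if value is None or math.isnan(value):
            nodeid = node["missing"]  # null feature value
        elif value < node["split_condition"]:
            nodeid = node["yes"]  # left child
        else:
            nodeid = node["no"]  # right child
    return nodes[nodeid]["leaf"]


print(predict_one_tree(nodes, {"f0": 0.2}))  # -0.2
print(predict_one_tree(nodes, {"f0": 0.9, "f1": 3.0}))  # 0.1
print(predict_one_tree(nodes, {"f0": 0.9}))  # 0.7
```

In the last call f1 is absent, so the row follows the missing link at node 2 rather than either comparison branch.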