tabular_trees.XGBoostTabularTrees

class tabular_trees.XGBoostTabularTrees(data)[source]

Bases: BaseModelTabularTrees

Class to hold the XGBoost trees in tabular format.

The preferred way to create XGBoostTabularTrees objects is with the from_booster method.

__init__(data)

Methods

__init__(data)

derive_predictions(df, lambda_)

Derive predictions for internal nodes in trees.

from_booster(booster)

Create XGBoostTabularTrees from an xgb.Booster object.

to_dataframe()

Return data for trees object.

to_tabular_trees()

Convert the tree data to a TabularTrees object.

Attributes

data

Tree data.

Tree

Tree number.

Node

Node number.

ID

ID for each node, combining tree and node numbers.

Feature

The name of the feature split on.

Split

The split point for a node.

Yes

Left child node.

No

Right child node.

Missing

Child node for rows with null values in the split feature.

Gain

Gain for a given split.

Cover

Related to the 2nd order derivative of the loss function with respect to the split feature.

Category

G

Used in the calculation of internal node predictions.

H

Cover.

weight

Node prediction.

Cover

Related to the 2nd order derivative of the loss function with respect to the split feature.

Feature

The name of the feature split on.

Null for leaf nodes.

G

Used in the calculation of internal node predictions.

Gain

Gain for a given split.

H

Cover.

ID

ID for each node, combining tree and node numbers.

Missing

Child node for rows with null values in the split feature.

No

Right child node.

Null for leaf nodes.

Node

Node number.

Split

The split point for a node.

Null for leaf nodes.

Tree

Tree number.

Yes

Left child node.

Null for leaf nodes.

data

Tree data.
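
To illustrate how the Feature, Split, Yes, No and Missing columns fit together, the following minimal sketch routes a single row through one toy tree. It does not use tabular_trees itself: the toy node records, the feature names and the route_row helper are all hypothetical. It assumes XGBoost's convention that a row goes to the Yes child when its value is strictly below Split, that null values follow the Missing child, and that IDs take the form "tree-node".

```python
import math

# Hypothetical toy tree: one dict per node, keyed by the documented column names.
# Leaf nodes have Feature, Split, Yes, No and Missing set to None.
toy_tree = [
    {"ID": "0-0", "Feature": "bmi", "Split": 0.5, "Yes": "0-1", "No": "0-2", "Missing": "0-1"},
    {"ID": "0-1", "Feature": "age", "Split": 40.0, "Yes": "0-3", "No": "0-4", "Missing": "0-4"},
    {"ID": "0-2", "Feature": None, "Split": None, "Yes": None, "No": None, "Missing": None},
    {"ID": "0-3", "Feature": None, "Split": None, "Yes": None, "No": None, "Missing": None},
    {"ID": "0-4", "Feature": None, "Split": None, "Yes": None, "No": None, "Missing": None},
]


def route_row(tree_rows, row, root_id="0-0"):
    """Return the ID of the leaf that `row` (a dict of feature values) lands in."""
    nodes = {node["ID"]: node for node in tree_rows}
    node = nodes[root_id]
    # Internal nodes have a feature name; leaves have None.
    while node["Feature"] is not None:
        value = row.get(node["Feature"])
        if value is None or math.isnan(value):
            next_id = node["Missing"]  # null value: follow the Missing child
        elif value < node["Split"]:
            next_id = node["Yes"]  # assumed convention: value < Split goes to Yes
        else:
            next_id = node["No"]
        node = nodes[next_id]
    return node["ID"]


leaf_id = route_row(toy_tree, {"bmi": 0.2, "age": 30.0})
```

The same walk could be done on the DataFrame returned by to_dataframe() after filtering to a single Tree value, provided the columns hold values in the form assumed here.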

static derive_predictions(df, lambda_)[source]

Derive predictions for internal nodes in trees.

Predictions will be available in the ‘weight’ column in the output.

Returns:

trees – Tree data with ‘weight’, ‘H’ and ‘G’ columns added.

Return type:

pd.DataFrame
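
One plausible way to obtain such predictions is sketched below; it is an illustration under stated assumptions, not necessarily what this method does. It assumes the standard XGBoost relation weight = -G / (H + lambda_), takes H to be the node's cover, recovers G for each leaf from its known weight, and sums G over the children to get G for each internal node. The toy nodes and the derive_internal_weights helper are hypothetical, and any learning-rate scaling of leaf values is ignored.

```python
def derive_internal_weights(nodes, lambda_, root_id):
    """Fill in 'G' for every node and 'weight' for internal nodes, in place.

    `nodes` maps node ID -> dict with 'H' (cover), 'Yes', 'No' and,
    for leaf nodes only, a known 'weight'.
    """

    def visit(node_id):
        node = nodes[node_id]
        if node["Yes"] is None:
            # Leaf: invert weight = -G / (H + lambda_) to recover G.
            node["G"] = -node["weight"] * (node["H"] + lambda_)
        else:
            # Internal node: G is the sum of the children's G values.
            node["G"] = visit(node["Yes"]) + visit(node["No"])
            node["weight"] = -node["G"] / (node["H"] + lambda_)
        return node["G"]

    visit(root_id)
    return nodes


# Hypothetical toy tree: a root with two leaves; root cover equals the sum of leaf covers.
toy_nodes = {
    "0-0": {"H": 5.0, "Yes": "0-1", "No": "0-2"},
    "0-1": {"H": 2.0, "Yes": None, "No": None, "weight": -0.5},
    "0-2": {"H": 3.0, "Yes": None, "No": None, "weight": 0.25},
}

result = derive_internal_weights(toy_nodes, lambda_=1.0, root_id="0-0")
```

In this toy case the leaves give G values of 1.5 and -1.0, so the root has G = 0.5 and a derived weight of -0.5 / (5 + 1).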

classmethod from_booster(booster)[source]

Create XGBoostTabularTrees from an xgb.Booster object.

Parameters:

booster (xgb.Booster) – XGBoost model to pull tree data from.

Returns:

trees – Model trees in tabular format.

Return type:

XGBoostTabularTrees

Examples

>>> import xgboost as xgb
>>> from sklearn.datasets import load_diabetes
>>> from tabular_trees import XGBoostTabularTrees
>>> # get data in DMatrix
>>> diabetes = load_diabetes()
>>> data = xgb.DMatrix(diabetes["data"], label=diabetes["target"])
>>> # build model
>>> params = {"max_depth": 3, "verbosity": 0}
>>> model = xgb.train(params, dtrain=data, num_boost_round=10)
>>> # export to XGBoostTabularTrees
>>> xgboost_tabular_trees = XGBoostTabularTrees.from_booster(model)
>>> type(xgboost_tabular_trees)
<class 'tabular_trees.xgboost.xgboost_tabular_trees.XGBoostTabularTrees'>

to_dataframe()

Return data for trees object.

Returns:

trees – Model trees in DataFrame form.

Return type:

pd.DataFrame

to_tabular_trees()[source]

Convert the tree data to a TabularTrees object.

Returns:

trees – Model trees in TabularTrees form.

Return type:

TabularTrees

weight

Node prediction.