
Add interaction terms to univariate generalized additive model (GAM)

`UpdatedMdl = addInteractions(Mdl,Interactions)` returns an updated model `UpdatedMdl` by adding the interaction terms in `Interactions` to the univariate generalized additive model `Mdl`. The model `Mdl` must contain only linear terms for predictors.

If you want to resume training for the existing terms in `Mdl`, use the `resume` function.

`UpdatedMdl = addInteractions(Mdl,Interactions,Name,Value)` specifies additional options using one or more name-value arguments. For example, `'MaxPValue',0.05` specifies to include only the interaction terms whose *p*-values are not greater than 0.05.

Train a univariate GAM, which contains linear terms for predictors, and then add interaction terms to the trained model by using the `addInteractions` function.

Load the `carbig` data set, which contains measurements of cars made in the 1970s and early 1980s.

`load carbig`

Create a table that contains the predictor variables (`Acceleration`, `Displacement`, `Horsepower`, and `Weight`) and the response variable (`MPG`).

tbl = table(Acceleration,Displacement,Horsepower,Weight,MPG);

Train a univariate GAM that contains linear terms for predictors in `tbl`.

`Mdl = fitrgam(tbl,'MPG');`

Add the five most important interaction terms to the trained model.

UpdatedMdl = addInteractions(Mdl,5);

`Mdl` is a univariate GAM, and `UpdatedMdl` is an updated GAM that contains all the terms in `Mdl` and five additional interaction terms. Display the interaction terms in `UpdatedMdl`.

UpdatedMdl.Interactions

`ans = `*5×2*
2 3
1 2
3 4
1 4
1 3

Each row of the `Interactions` property represents one interaction term and contains the column indexes of the predictor variables for the interaction term. You can use the `Interactions` property to check the interaction terms in the model and the order in which `addInteractions` adds them to the model.
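To read the index pairs as predictor names, one option is to index the `PredictorNames` property of the model with the `Interactions` matrix. The following is a minimal sketch that repeats the steps of this example; the variable names `idx`, `names`, and `pairNames` are introduced here for illustration only.

```matlab
% Repeat the example: train a univariate GAM and add five interaction terms.
load carbig
tbl = table(Acceleration,Displacement,Horsepower,Weight,MPG);
Mdl = fitrgam(tbl,'MPG');
UpdatedMdl = addInteractions(Mdl,5);

% Each row of Interactions holds two predictor indexes.
idx   = UpdatedMdl.Interactions;      % 5-by-2 matrix of predictor indexes
names = UpdatedMdl.PredictorNames;    % cell array of predictor names

% Indexing the cell array with the index matrix keeps the 5-by-2 shape,
% so each row lists the two predictors of one interaction term.
pairNames = names(idx)
```

The rows of `pairNames` appear in the same order as the rows of `Interactions`.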

Train a univariate GAM, which contains linear terms for predictors, and then add interaction terms to the trained model by using the `addInteractions` function. Specify the `'MaxPValue'` name-value argument to add interaction terms whose *p*-values are not greater than the `'MaxPValue'` value.

Load Fisher's iris data set. Create a table that contains observations for versicolor and virginica.

`load fisheriris`

`inds = strcmp(species,'versicolor') | strcmp(species,'virginica');`

`Tbl = array2table(meas(inds,:),'VariableNames',["x1","x2","x3","x4"]);`

`Tbl.Y = species(inds,:);`

Train a univariate GAM that contains linear terms for predictors in `Tbl`.

`Mdl = fitcgam(Tbl,'Y');`

Add important interaction terms to the trained model `Mdl`. Specify `'all'` for the `Interactions` argument, and set the `'MaxPValue'` name-value argument to 0.05. Among all available interaction terms, `addInteractions` identifies those whose *p*-values are not greater than the `'MaxPValue'` value and adds them to the model. The default `'MaxPValue'` is 1, so that the function adds all specified interaction terms to the model.

UpdatedMdl = addInteractions(Mdl,'all','MaxPValue',0.05); UpdatedMdl.Interactions

`ans = `*5×2*
3 4
2 4
1 4
2 3
1 3

`Mdl` is a univariate GAM, and `UpdatedMdl` is an updated GAM that contains all the terms in `Mdl` and five additional interaction terms. `UpdatedMdl` includes five of the six available pairs of interaction terms.

`Mdl` - Generalized additive model

`ClassificationGAM` model object | `RegressionGAM` model object

Generalized additive model, specified as a `ClassificationGAM` or `RegressionGAM` model object.

`Interactions` - Number of interaction terms or list of interaction terms

`0` | nonnegative integer | logical matrix | `'all'`

Number or list of interaction terms to include in the candidate set *S*, specified as a nonnegative integer scalar, a logical matrix, or `'all'`.

- Number of interaction terms, specified as a nonnegative integer: *S* includes the specified number of important interaction terms, selected based on the *p*-values of the terms.
- List of interaction terms, specified as a logical matrix: *S* includes the terms specified by a `t`-by-`p` logical matrix, where `t` is the number of interaction terms, and `p` is the number of predictors used to train the model. For example, `logical([1 1 0; 0 1 1])` represents two pairs of interaction terms: a pair of the first and second predictors, and a pair of the second and third predictors.

  If `addInteractions` uses a subset of input variables as predictors, then the function indexes the predictors using only the subset. That is, the column indexes of the logical matrix do not count the response and observation weight variables. The indexes also do not count any variables not used by the function.
- `'all'`: *S* includes all possible pairs of interaction terms, for a total of `p*(p - 1)/2` terms.

Among the interaction terms in *S*, the `addInteractions` function identifies those whose *p*-values are not greater than the `'MaxPValue'` value and uses them to build a set of interaction trees. Use the default value (`'MaxPValue',1`) to build interaction trees using all terms in *S*.

**Data Types: **`single` | `double` | `logical` | `char` | `string`
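As an illustration of the logical-matrix form, the following sketch builds a `t`-by-`p` matrix for every predictor pair and a smaller hand-picked matrix. It assumes a model trained on four predictors; the variable names `pairs`, `allPairs`, and `twoTerms` are hypothetical and used only here.

```matlab
p = 4;                          % number of predictors (assumed for this sketch)
pairs = nchoosek(1:p,2);        % every pair of predictor indexes, one pair per row
t = size(pairs,1);              % number of candidate terms, p*(p - 1)/2

% Build the t-by-p logical matrix: row k marks the two predictors of term k.
allPairs = false(t,p);
for k = 1:t
    allPairs(k,pairs(k,:)) = true;
end

% Two hand-picked terms: predictors (1,2) and predictors (2,3).
twoTerms = logical([1 1 0 0; 0 1 1 0]);

% With a trained univariate GAM Mdl on four predictors, either matrix could
% be passed as the Interactions argument, for example:
% UpdatedMdl = addInteractions(Mdl,twoTerms);
```

Here `allPairs` lists the same six pairs that `'all'` is described as covering for `p = 4`, while `twoTerms` restricts the candidate set to two terms.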

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

**Example: **`addInteractions(Mdl,'all','MaxPValue',0.05,'Verbose',1,'NumPrint',10)` specifies to include all available interaction terms whose *p*-values are not greater than 0.05 and to display diagnostic messages every 10 iterations.

`InitialLearnRateForInteractions` - Learning rate of gradient boosting for interaction terms

`1` (default) | numeric scalar in (0,1]

Initial learning rate of gradient boosting for interaction terms, specified as a numeric scalar in the interval (0,1].

For each boosting iteration for interaction trees, `addInteractions` starts fitting with the initial learning rate. For classification, the function halves the learning rate until it finds a rate that improves the model fit. For regression, the function uses the initial rate throughout the training.

Training a model using a small learning rate requires more learning iterations, but often achieves better accuracy.

For more details about gradient boosting, see Gradient Boosting Algorithm.

**Example: **`'InitialLearnRateForInteractions',0.1`

**Data Types: **`single` | `double`

`MaxNumSplitsPerInteraction` - Maximum number of decision splits per interaction tree

4 (default) | positive integer scalar

Maximum number of decision splits (or branch nodes) for each interaction tree (boosted tree for an interaction term), specified as a positive integer scalar.

**Example: **`'MaxNumSplitsPerInteraction',5`

**Data Types: **`single` | `double`

`MaxPValue` - Maximum *p*-value for detecting interaction terms

1 (default) | numeric scalar in [0,1]

Maximum *p*-value for detecting interaction terms, specified as a numeric scalar in the interval [0,1].

`addInteractions` first finds the candidate set *S* of interaction terms from the `Interactions` value. Then the function identifies the interaction terms whose *p*-values are not greater than the `'MaxPValue'` value and uses them to build a set of interaction trees.

The default value (`'MaxPValue',1`) builds interaction trees for all interaction terms in the candidate set *S*.

For more details about detecting interaction terms, see Interaction Term Detection.

**Example: **`'MaxPValue',0.05`

**Data Types: **`single` | `double`

`NumPrint` - Number of iterations between diagnostic message printouts

`Mdl.ModelParameters.NumPrint` (default) | nonnegative integer scalar

Number of iterations between diagnostic message printouts, specified as a nonnegative integer scalar. This argument is valid only when you specify `'Verbose'` as 1.

If you specify `'Verbose',1` and `'NumPrint',numPrint`, then the software displays diagnostic messages every `numPrint` iterations in the Command Window.

The default value is `Mdl.ModelParameters.NumPrint`, which is the `NumPrint` value that you specify when creating the GAM object `Mdl`.

**Example: **`'NumPrint',500`

**Data Types: **`single` | `double`

`NumTreesPerInteraction` - Number of trees per interaction term

100 (default) | positive integer scalar

Number of trees per interaction term, specified as a positive integer scalar.

The `'NumTreesPerInteraction'` value is equivalent to the number of gradient boosting iterations for the interaction terms for predictors. For each iteration, `addInteractions` adds a set of interaction trees to the model, one tree for each interaction term. To learn about the gradient boosting algorithm, see Gradient Boosting Algorithm.

You can determine whether the fitted model has the specified number of trees by viewing the diagnostic message displayed when `'Verbose'` is 1 or 2, or by checking the `ReasonForTermination` property value of the updated model `UpdatedMdl`.

**Example: **`'NumTreesPerInteraction',500`

**Data Types: **`single` | `double`
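The learning rate and the number of trees are usually tuned together. The following sketch, which reuses the `carbig` data and the example values `0.1` and `500` from the argument descriptions, pairs a smaller learning rate with more boosting iterations and then inspects why training stopped. It is one possible configuration, not a recommended setting.

```matlab
% Train a univariate GAM on the carbig data, as in the first example.
load carbig
tbl = table(Acceleration,Displacement,Horsepower,Weight,MPG);
Mdl = fitrgam(tbl,'MPG');

% Add all candidate interaction terms with a smaller learning rate and
% a larger number of trees per interaction term.
UpdatedMdl = addInteractions(Mdl,'all', ...
    'InitialLearnRateForInteractions',0.1, ...
    'NumTreesPerInteraction',500);

% Check why the training stopped.
UpdatedMdl.ReasonForTermination
```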

`Verbose` - Verbosity level

`Mdl.ModelParameters.VerbosityLevel` (default) | `0` | `1` | `2`

Verbosity level, specified as `0`, `1`, or `2`. The `Verbose` value controls the amount of information that the software displays in the Command Window.

This table summarizes the available verbosity level options.

Value | Description |
---|---|
`0` | The software displays no information. |
`1` | The software displays diagnostic messages every `numPrint` iterations, where `numPrint` is the `'NumPrint'` value. |
`2` | The software displays diagnostic messages at every iteration. |

Each line of the diagnostic messages shows the information about each boosting iteration and includes the following columns:

- `Type`: Type of trained trees, `1D` (predictor trees, or boosted trees for linear terms for predictors) or `2D` (interaction trees, or boosted trees for interaction terms for predictors)
- `NumTrees`: Number of trees per linear term or interaction term that `addInteractions` added to the model so far
- `Deviance`: Deviance of the model
- `RelTol`: Relative change of model predictions: $$\left(\widehat{y}_{k}-\widehat{y}_{k-1}\right)^{\prime}\left(\widehat{y}_{k}-\widehat{y}_{k-1}\right)/\widehat{y}_{k}^{\prime}\widehat{y}_{k}$$, where $$\widehat{y}_{k}$$ is a column vector of model predictions at iteration *k*
- `LearnRate`: Learning rate used for the current iteration

The default value is `Mdl.ModelParameters.VerbosityLevel`, which is the `Verbose` value that you specify when creating the GAM object `Mdl`.

**Example: **`'Verbose',1`

**Data Types: **`single` | `double`
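To make the `RelTol` column concrete, the following sketch evaluates the relative-change formula for two made-up prediction vectors. The vectors `yPrev` and `yCurr` are hypothetical values chosen for illustration, not output from `addInteractions`.

```matlab
yPrev = [1.0; 2.0; 3.0];     % predictions at iteration k-1 (hypothetical)
yCurr = [1.1; 1.9; 3.2];     % predictions at iteration k (hypothetical)

d = yCurr - yPrev;                    % change in predictions
relTol = (d'*d)/(yCurr'*yCurr)        % squared change relative to squared predictions
```

A value close to zero indicates that the predictions barely changed between consecutive iterations.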

`UpdatedMdl` - Updated generalized additive model

`ClassificationGAM` model object | `RegressionGAM` model object

Updated generalized additive model, returned as a `ClassificationGAM` or `RegressionGAM` model object. `UpdatedMdl` has the same object type as the input model `Mdl`.

To overwrite the input argument `Mdl`, assign the output of `addInteractions` to `Mdl`:

Mdl = addInteractions(Mdl,Interactions);

Deviance is a generalization of the residual sum of squares. It measures the goodness of fit compared to the saturated model.

The deviance of a fitted model is twice the difference between the loglikelihoods of the model and the saturated model:

$$-2\left(\log L - \log L_{s}\right),$$

where $$L$$ and $$L_{s}$$ are the likelihoods of the fitted model and the saturated model, respectively. The saturated model is the model with the maximum number of parameters that you can estimate.

`addInteractions` uses the deviance to measure the goodness of model fit and finds a learning rate that reduces the deviance at each iteration. Specify `'Verbose'` as 1 or 2 to display the deviance and learning rate in the Command Window.
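As a worked instance of the deviance definition, the following sketch computes the deviance of a binary classifier under a Bernoulli likelihood. It assumes classes coded as 0 and 1 and made-up predicted probabilities; it illustrates the formula only and is not necessarily how `addInteractions` computes the value internally.

```matlab
y = [1; 0; 1; 1];            % observed classes coded as 0/1 (hypothetical data)
p = [0.9; 0.2; 0.8; 0.6];    % predicted probability of class 1 (hypothetical)

% Loglikelihood of the fitted model under a Bernoulli likelihood.
logL = sum(y.*log(p) + (1-y).*log(1-p));

% The saturated model predicts each observation exactly, so its likelihood
% is 1 and its loglikelihood is 0.
logLs = 0;

dev = -2*(logL - logLs)      % deviance of the fitted model
```

Predicted probabilities closer to the observed classes give a larger loglikelihood and therefore a smaller deviance.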

`addInteractions` adds sets of interaction trees (boosted trees for interaction terms for predictors) to a univariate generalized additive model by using a gradient boosting algorithm (Least-Squares Boosting for regression and Adaptive Logistic Regression for classification). The algorithm iterates for at most `'NumTreesPerInteraction'` times for interaction trees.

For each boosting iteration, `addInteractions` builds a set of interaction trees with the initial learning rate `'InitialLearnRateForInteractions'`.

When building a set of trees, the function trains one tree at a time. It fits a tree to the residual that is the difference between the response (observed response values for regression or scores of observed classes for classification) and the aggregated prediction from all trees grown previously. To control the boosting learning speed, the function shrinks the tree by the learning rate and then adds the tree to the model and updates the residual.

Updated model = current model + (learning rate)·(new tree)

Updated residual = current residual – (learning rate)·(response explained by new tree)

- If adding the set of trees improves the model fit (that is, reduces the deviance of the fit by a value larger than the tolerance), then `addInteractions` moves to the next iteration.
- Otherwise, for classification, `addInteractions` halves the learning rate and uses it to update the model and residual. The function continues to halve the learning rate until it finds a rate that improves the model fit. If the function cannot find such a learning rate for interaction trees, then it terminates the model fitting. For regression, if adding the set of trees does not improve the model fit with the initial learning rate, then the function terminates the model fitting.

You can determine why training stopped by checking the `ReasonForTermination` property of the trained model.
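The update rules above can be sketched on toy data. The following is a simplified illustration, not the implementation used by `addInteractions`: it assumes a squared-error deviance, replaces the interaction tree with cell means on a 2-by-2 grid of two predictors, and uses a hypothetical tolerance `tol`. It applies the regression rule, stopping when a new tree no longer reduces the deviance by more than the tolerance.

```matlab
% Toy data: the response depends on the product of two predictors.
rng(0);                                   % reproducible random numbers
n  = 200;
x1 = rand(n,1);
x2 = rand(n,1);
y  = 4*(x1 - 0.5).*(x2 - 0.5) + 0.05*randn(n,1);

% Weak learner (assumption): cell means on a 2-by-2 grid of (x1,x2),
% standing in for a shallow interaction tree.
cellIdx = 1 + (x1 > 0.5) + 2*(x2 > 0.5);  % cell index in 1..4

learnRate = 0.5;    % plays the role of 'InitialLearnRateForInteractions'
numIter   = 20;     % plays the role of 'NumTreesPerInteraction'
tol       = 1e-8;   % hypothetical tolerance on the deviance reduction

prediction = zeros(n,1);
residual   = y - prediction;
deviance   = sum(residual.^2);            % squared-error deviance
history    = deviance;                    % deviance after each accepted tree

for k = 1:numIter
    % Fit the weak learner to the current residual.
    cellMeans = accumarray(cellIdx,residual,[4 1],@mean);
    treeFit   = cellMeans(cellIdx);       % response explained by the new tree

    % Updated model = current model + (learning rate)*(new tree)
    newPrediction = prediction + learnRate*treeFit;
    % Updated residual = current residual - (learning rate)*(explained response)
    newResidual   = residual - learnRate*treeFit;
    newDeviance   = sum(newResidual.^2);

    if deviance - newDeviance <= tol
        break                             % regression rule: no improvement, stop
    end
    prediction = newPrediction;
    residual   = newResidual;
    deviance   = newDeviance;
    history(end+1) = deviance;            %#ok<AGROW>
end
```

For classification, the text above describes halving `learnRate` and retrying instead of stopping immediately; that variant would replace the `break` with a loop that halves the rate until the deviance improves or no such rate is found.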

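For each pairwise interaction term $$x_{i}x_{j}$$ (specified by `Interactions`), the software performs an *F*-test to examine whether the term is statistically significant.

To speed up the process, `addInteractions` bins numeric predictors into at most 8 equiprobable bins. The number of bins can be less than 8 if a predictor has fewer than 8 unique values. The *F*-test examines the null hypothesis that the bins created by $$x_{i}$$ and $$x_{j}$$ have equal responses versus the alternative that at least one bin has a different response value from the others. A small *p*-value indicates that the differences are significant, which suggests that including the corresponding interaction term can improve the model fit.

`addInteractions` builds a set of interaction trees using the terms whose *p*-values are not greater than the `'MaxPValue'` value. You can use the default `'MaxPValue'` value `1` to build interaction trees using all terms specified by `Interactions`.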

`addInteractions` adds interaction terms to the model in the order of importance based on the *p*-values. Use the `Interactions` property of the returned model to check the order of the interaction terms added to the model.
