| CPC G06Q 40/08 (2013.01) [G06F 16/20 (2019.01); G06F 18/214 (2023.01); G06F 18/24323 (2023.01); G06N 5/01 (2023.01); G06N 20/00 (2019.01); G06N 5/00 (2013.01)] | 20 Claims |

|
1. An apparatus comprising at least one processor and at least one non-transitory computer readable storage medium storing instructions that, with the at least one processor, configure the apparatus to:
generate a gradient boosted tree model based at least in part on a plurality of data records, wherein each data record of the plurality of data records comprises a plurality of predictor variables, a plurality of corresponding predictor variable values, a dependent variable, and a corresponding dependent variable value;
wherein generating the gradient boosted tree model comprises:
forming a first decision tree structure having a maximum tree depth of one (1);
determining whether first variation associated with main effects of the plurality of predictor variables on the dependent variable remains in a model residual;
responsive to determining that the first variation associated with main effects of the plurality of predictor variables on the dependent variable remains in a model residual,
iteratively forming a first plurality of decision tree structures each having a maximum tree depth of one (1) until the first variation associated with main effects in the model residual is exhausted, wherein a main effect represents an effect of a given predictor variable of the plurality of predictor variables on the dependent variable while ignoring all other predictor variables of the plurality of predictor variables, and wherein after forming each decision tree structure of the first plurality of decision tree structures, a determination is made to form a next decision tree structure of the first plurality of decision tree structures when the first variation associated with main effects remains in the model residual;
forming a second plurality of decision tree structures each having a maximum tree depth of two (2);
determining whether second variation associated with interaction effects between the plurality of predictor variables remains in the model residual; and
responsive to determining that the second variation associated with interaction effects between the plurality of predictor variables remains in the model residual,
iteratively forming successive pluralities of decision tree structures each having a maximum tree depth increased by one (1) as compared to an immediately preceding plurality of decision tree structures, each successive plurality of decision tree structures comprising a number of decision tree structures necessary to exhaust all interaction effects between the plurality of predictor variables, wherein an interaction effect represents an effect of a given predictor variable of the plurality of predictor variables on the dependent variable taking into consideration one or more other predictor variables of the plurality of predictor variables, and wherein, after forming each decision tree structure of the successive pluralities of decision tree structures, a determination is made to form a next decision tree structure of the successive pluralities of decision tree structures when the second variation associated with interaction effects remains in the model residual;
generate, based on a combination of a first subset of a plurality of indicator variables associated with one or more of the second plurality of decision tree structures or the successive pluralities of decision tree structures, a generalized linear model structure definition upon which the dependent variable depends; and
generate one or more of a prediction of loss frequency or a prediction of loss severity based at least in part on the generalized linear model structure definition.
|
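
The depth-staged boosting recited in claim 1 can be illustrated with a minimal sketch, offered under stated assumptions rather than as the claimed implementation. The claim does not say how "variation remains in the model residual" is determined, so the sketch assumes a squared-error loss and treats a stage as exhausted once a newly fitted tree reduces the residual sum of squares by less than a tolerance. The names `staged_boost`, `fit_tree`, `predict_tree`, `lr`, `tol`, `max_depth`, and `max_trees_per_stage` are hypothetical and do not come from the claim. The generalized linear model step is not shown.

```python
def rss(res):
    """Residual sum of squares."""
    return sum(v * v for v in res)

def sse(vals):
    """Sum of squared deviations from the mean (split quality measure)."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)

def fit_tree(X, r, depth):
    """Least-squares regression tree of at most `depth` levels, fitted to residuals r."""
    if depth == 0 or len(set(map(tuple, X))) < 2:
        return sum(r) / len(r)                       # leaf: mean residual
    best = None
    for j in range(len(X[0])):                       # candidate predictor variable
        for t in sorted({row[j] for row in X})[:-1]: # candidate threshold
            li = [i for i, row in enumerate(X) if row[j] <= t]
            ri = [i for i, row in enumerate(X) if row[j] > t]
            score = sse([r[i] for i in li]) + sse([r[i] for i in ri])
            if best is None or score < best[0]:
                best = (score, j, t, li, ri)
    _, j, t, li, ri = best
    return (j, t,
            fit_tree([X[i] for i in li], [r[i] for i in li], depth - 1),
            fit_tree([X[i] for i in ri], [r[i] for i in ri], depth - 1))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def staged_boost(X, y, lr=1.0, tol=1e-9, max_depth=3, max_trees_per_stage=200):
    """Boost with depth-1 trees first, then depth 2, then depth 3, and so on.

    ASSUMPTION: a stage is "exhausted" when the next tree lowers the
    residual sum of squares by less than `tol`.
    """
    base = sum(y) / len(y)
    resid = [v - base for v in y]
    stages = {}                                      # depth -> trees kept at that depth
    for depth in range(1, max_depth + 1):
        stages[depth] = []
        for _ in range(max_trees_per_stage):
            tree = fit_tree(X, resid, depth)
            new = [ri - lr * predict_tree(tree, row) for row, ri in zip(X, resid)]
            if rss(resid) - rss(new) < tol:          # nothing left for this depth
                break
            stages[depth].append(tree)
            resid = new
    return base, stages, resid

# Toy data with a pure two-way interaction: y = 4 * x0 * x1.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0.0, 0.0, 0.0, 4.0]
base, stages, resid = staged_boost(X, y)
print({d: len(t) for d, t in stages.items()})        # {1: 2, 2: 1, 3: 0}
```

On this toy data the two depth-1 trees absorb the main effects of each predictor, the single depth-2 tree absorbs the remaining interaction, and no depth-3 tree is kept. A shrinkage value `lr` below 1 is common in practice; it is set to 1.0 here only so the arithmetic stays exact.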