| CPC G06N 20/00 (2019.01) [G06F 18/24323 (2023.01); G06F 18/2433 (2023.01)] | 20 Claims |

|
1. A computer-implemented method comprising:
receiving, using a processor, input data comprising time-series data; and
simultaneously training, using a binary mixed-integer linear program of the processor, a network of optimal decision trees (“ODTs”) for regression based on the input data, the network of ODTs configured such that each ODT of the network of ODTs comprises at least one of an upstream ODT and a downstream ODT, wherein an output of an upstream ODT is coupled to an input of a downstream ODT;
wherein during the training of each respective downstream ODT:
a sample, output from a respective upstream ODT, is classified as either an outlier or a point in a distribution according to a minimizing of a nonlinear loss function in which training loss and outlier loss are minimized together, the nonlinear loss function determined according to the following formula:
![]() where $z_i \in \{0, 1\}$ is a selection variable for deciding whether a sample $(x_i, y_i)$ will be removed or not, $\alpha > 0$ is a weighting parameter to balance between the training error $z_i (c^T x_i - y_i)^2$ and the outlier loss
![](), $n$ is a total number of samples, $c^T$ is a learned model parameter for a linear regression at a leaf node of the respective ODT, and $T$ represents the transpose of $c$; and
each sample classified as an outlier is removed from the respective input of the respective downstream ODT, thereby training the respective downstream ODT only on samples that do not contain any outliers; and
controlling a set point for a manufacturing process undergoing an upset condition using the trained network of ODTs, wherein characterization factors of an underlying decision tree of the network of ODTs are given by branching hyperplanes at each branch node and linear regressions at each leaf node.
|
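The structure recited in claim 1 (branching hyperplanes at branch nodes, linear regressions at leaf nodes, and a training error in which each squared residual is multiplied by its selection variable) can be illustrated with a minimal Python sketch. This is not the claimed binary mixed-integer linear program and it does no training. The class names `Leaf` and `Branch`, the helper `selected_training_error`, the parameter names `a` and `b`, and the rule that a sample goes left when the hyperplane value is below the offset are all assumptions made for illustration. The outlier-loss term is not reproduced in the text above, so it is deliberately left out.

```python
from dataclasses import dataclass
from typing import Sequence, Union


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


@dataclass
class Leaf:
    """Leaf node holding linear-regression coefficients c (hypothetical name)."""
    c: Sequence[float]

    def predict(self, x: Sequence[float]) -> float:
        # Leaf prediction is the linear model c^T x.
        return dot(self.c, x)


@dataclass
class Branch:
    """Branch node with a branching hyperplane a^T x = b (hypothetical names)."""
    a: Sequence[float]
    b: float
    left: "Node"   # taken when a^T x < b (convention assumed for this sketch)
    right: "Node"  # taken otherwise

    def predict(self, x: Sequence[float]) -> float:
        child = self.left if dot(self.a, x) < self.b else self.right
        return child.predict(x)


Node = Union[Leaf, Branch]


def selected_training_error(c, xs, ys, z) -> float:
    """Sum of z_i * (c^T x_i - y_i)^2 over all samples.

    A sample with z_i = 0 contributes nothing to this term.
    The outlier-loss term of the claimed loss is NOT modeled here.
    """
    return sum(zi * (dot(c, x) - y) ** 2 for x, y, zi in zip(xs, ys, z))


# A depth-1 tree with made-up parameters, purely for illustration.
tree = Branch(a=[1.0, 0.0], b=0.5,
              left=Leaf(c=[2.0, 1.0]),
              right=Leaf(c=[-1.0, 3.0]))

xs = [[0.0, 1.0], [1.0, 1.0]]
ys = [1.0, 10.0]
err_keep_all = selected_training_error([2.0, 1.0], xs, ys, z=[1, 1])
err_drop_second = selected_training_error([2.0, 1.0], xs, ys, z=[1, 0])
```

In this toy data the second sample has a large residual under the leaf model `[2.0, 1.0]`, so setting its selection variable to 0 removes its contribution to the training error. In the claimed method that choice is balanced against the outlier loss through the weighting parameter α, which this sketch does not attempt to reproduce.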