US 12,443,855 B2
Optimizing cascade of classifiers schema using genetic search
Andrey Finkelshtein, Beer Sheva (IL); Eitan Menahem, Beer Sheva (IL); Yuval Margalit, Ramat-Gan (IL); and Sarit Hollander, Ramat Gan (IL)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Mar. 21, 2022, as Appl. No. 17/699,547.
Prior Publication US 2023/0297848 A1, Sep. 21, 2023
Int. Cl. G06N 3/126 (2023.01); G06N 20/20 (2019.01)
CPC G06N 3/126 (2013.01) [G06N 20/20 (2019.01)]
19 Claims
1. A method for optimizing an ensemble of cascaded classifiers for a task of classification of a plurality of observations, each to a class from a plurality of classes, the method comprising:
in each of a plurality of iterations:
computing a set of scores, each associated with one of a set of ensembles of classification parameters, each ensemble of classification parameters characterizing an ensemble of cascaded classifiers for execution by at least one hardware processor, each ensemble of classification parameters comprising:
a first set of classifier parameters, characterizing a first cascaded classifier from the ensemble of cascaded classifiers, wherein the first cascaded classifier is a linear regression classifier;
a second set of classifier parameters, characterizing a second cascaded classifier from the ensemble of cascaded classifiers, wherein the second cascaded classifier is a first CatBoost classifier having a first number of trees of a first depth;
a third set of classifier parameters, characterizing a third cascaded classifier from the ensemble of cascaded classifiers, wherein the third cascaded classifier is a second CatBoost classifier having a second number of trees of a second depth, wherein the second number of trees is greater than the first number of trees and the second depth is greater than the first depth; and
a sequence of confidence thresholds used to determine when to execute the second cascaded classifier and the third cascaded classifier by at least one hardware processor, using a confidence measure computed by the first cascaded classifier and the second cascaded classifier, wherein the first cascaded classifier has a first upper classification boundary and a first lower classification boundary, wherein the second cascaded classifier has a second upper classification boundary and a second lower classification boundary, wherein a range between the first upper classification boundary and the first lower classification boundary is greater than a range between the second upper classification boundary and the second lower classification boundary;
aggregating a plurality of new ensembles of classification parameters and associated scores from the set of scores by applying a genetic algorithm to the set of ensembles of classification parameters and the set of scores, into a pool of ensembles and associated scores; and
using the pool of ensembles of classification parameters in a consecutive iteration of the plurality of iterations; and
identifying a preferred ensemble of classification parameters, in the pool of ensembles and associated scores, using a score associated with each ensemble of classification parameters in the pool of ensembles and associated scores.
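
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `EnsembleParams` encoding, the uncertainty bands centered on 0.5, the binary class labels, the crossover and mutation operators, and the pool size are all assumptions made for illustration. The `score` function is supplied by the caller (in practice it would train and evaluate the cascade), and each stage is modeled as a plain callable that returns a confidence in [0, 1] rather than an actual linear or CatBoost model.

```python
import random
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class EnsembleParams:
    """Hypothetical encoding of one ensemble of classification parameters."""
    trees: Tuple[int, int]      # number of trees in stage 2 and stage 3
    depths: Tuple[int, int]     # tree depth in stage 2 and stage 3
    band1: Tuple[float, float]  # (lower, upper) boundary of stage 1
    band2: Tuple[float, float]  # (lower, upper) boundary of stage 2


def is_valid(p: EnsembleParams) -> bool:
    """Check the ordering constraints recited in the claim."""
    w1 = p.band1[1] - p.band1[0]
    w2 = p.band2[1] - p.band2[0]
    return p.trees[1] > p.trees[0] and p.depths[1] > p.depths[0] and w1 > w2 > 0


def random_params(rng: random.Random) -> EnsembleParams:
    """Draw a valid candidate; the value ranges are arbitrary assumptions."""
    t1, d1 = rng.randint(10, 100), rng.randint(2, 5)
    h1 = rng.uniform(0.2, 0.45)       # half-width of the stage-1 band
    h2 = rng.uniform(0.02, 0.9 * h1)  # stage-2 band is strictly narrower
    return EnsembleParams((t1, t1 + rng.randint(1, 400)),
                          (d1, d1 + rng.randint(1, 5)),
                          (0.5 - h1, 0.5 + h1), (0.5 - h2, 0.5 + h2))


def cascade_predict(x, stages: Sequence[Callable], p: EnsembleParams) -> int:
    """Escalate to the next stage only while confidence stays inside the band."""
    for stage, (lo, hi) in zip(stages, (p.band1, p.band2)):
        conf = stage(x)
        if conf <= lo:
            return 0
        if conf >= hi:
            return 1
    return int(stages[2](x) >= 0.5)   # last stage always decides


def crossover(a: EnsembleParams, b: EnsembleParams, rng: random.Random) -> EnsembleParams:
    """Child inherits each field from one of the two parents at random."""
    def pick(x, y):
        return x if rng.random() < 0.5 else y
    return EnsembleParams(pick(a.trees, b.trees), pick(a.depths, b.depths),
                          pick(a.band1, b.band1), pick(a.band2, b.band2))


def mutate(p: EnsembleParams, rng: random.Random) -> EnsembleParams:
    """Rescale the stage-2 band width by a random factor."""
    h2 = (p.band2[1] - p.band2[0]) / 2 * rng.uniform(0.8, 1.2)
    return replace(p, band2=(0.5 - h2, 0.5 + h2))


def genetic_search(score: Callable[[EnsembleParams], float],
                   population: List[EnsembleParams],
                   iterations: int,
                   rng: random.Random,
                   pool_size: int = 20) -> Tuple[float, EnsembleParams]:
    """Return the (score, params) pair with the highest score in the pool."""
    pool: List[Tuple[float, EnsembleParams]] = []
    for _ in range(iterations):
        # Compute a score for every ensemble in the current set.
        scored = [(score(p), p) for p in population]
        # Aggregate ensembles and scores into the pool, keeping the best.
        pool = sorted(pool + scored, key=lambda sp: sp[0], reverse=True)[:pool_size]
        # Breed the next set from the pool; invalid children are discarded.
        parents = [p for _, p in pool]
        children: List[EnsembleParams] = []
        while len(children) < len(population):
            a, b = rng.sample(parents, 2)
            child = mutate(crossover(a, b, rng), rng)
            if is_valid(child):
                children.append(child)
        population = children
    # Identify the preferred ensemble by its score.
    return max(pool, key=lambda sp: sp[0])
```

A caller would build an initial population of at least two candidates with `random_params`, pass a scoring function that reflects the desired trade-off (for example, accuracy against inference cost), and then run `cascade_predict` with the returned parameters. Because the pool keeps the best-scoring candidates across iterations, the preferred ensemble never scores lower than the best member of the initial population.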