US 12,229,691 B2
Uncertainty quantification for machine learning classification modelling
Kamalika Das, Saratoga, CA (US)
Assigned to Intuit Inc., Mountain View, CA (US)
Filed by Intuit, Inc., Mountain View, CA (US)
Filed on Mar. 16, 2023, as Appl. No. 18/122,641.
Prior Publication US 2024/0311665 A1, Sep. 19, 2024
Int. Cl. G06N 7/01 (2023.01); G06N 5/022 (2023.01)
CPC G06N 7/01 (2023.01) [G06N 5/022 (2013.01)]
18 Claims
 
1. A method, comprising:
processing nonlinear input data associated with an electronic data transaction with an ensemble of tree-based nonlinear machine learning models to generate an output at each leaf node of each tree-based nonlinear machine learning model, wherein the output is based on a traversal path of each tree-based nonlinear machine learning model in the ensemble of tree-based nonlinear machine learning models;
generating a high-dimensional embedding based on the output of each leaf node of each tree-based nonlinear machine learning model in the ensemble of tree-based nonlinear machine learning models, wherein the high-dimensional embedding encodes one or more nonlinear features associated with the nonlinear input data traversed in the traversal path of each tree-based nonlinear machine learning model in the ensemble of tree-based nonlinear machine learning models;
projecting the high-dimensional embedding into a lower-dimensional embedding by applying a dimensionality reduction function, wherein:
the dimensionality reduction function is based on a principal component analysis, and
the lower-dimensional embedding comprises a lower-dimensional representation of the one or more nonlinear features;
processing the lower-dimensional embedding with a Bayesian logistic regression machine learning model to generate a binary class prediction associated with the nonlinear input data;
determining a confidence for the binary class prediction with the Bayesian logistic regression machine learning model, wherein the confidence for the binary class prediction is based on a credible interval of the binary class prediction and the nonlinear input data;
outputting:
the binary class prediction if the confidence is greater than or equal to a threshold; or
a flipped binary class prediction if the confidence is lower than the threshold; and
authorizing the electronic data transaction based on the binary class prediction or the flipped binary class prediction.
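
The final steps of claim 1 (confidence from a credible interval, threshold comparison, optional flip, authorization) can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the Bayesian logistic regression model has already produced posterior samples of the positive-class probability for one transaction. The claim does not state how confidence is derived from the credible interval, so the mapping used here (one minus the interval width) is an illustrative assumption, as are the function names `credible_interval` and `decide`, the 95% interval mass, and the convention that class 1 means "authorize".

```python
def credible_interval(samples, mass=0.95):
    """Equal-tailed interval covering roughly `mass` of the posterior samples."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    lower = ordered[int((1.0 - mass) / 2.0 * last)]
    upper = ordered[int((1.0 + mass) / 2.0 * last)]
    return lower, upper


def decide(samples, threshold, mass=0.95):
    """Gate a binary prediction on a credible-interval-based confidence.

    `samples` are posterior draws of P(class = 1) for a single input.
    """
    lower, upper = credible_interval(samples, mass)

    # Point prediction from the posterior mean probability.
    mean_probability = sum(samples) / len(samples)
    prediction = 1 if mean_probability >= 0.5 else 0

    # ASSUMPTION: a narrower credible interval means higher confidence.
    # The claim only says confidence is "based on" the credible interval.
    confidence = 1.0 - (upper - lower)

    # Output the prediction if confident enough, otherwise the flipped class.
    output = prediction if confidence >= threshold else 1 - prediction

    # ASSUMPTION: class 1 is the class that authorizes the transaction.
    return {
        "prediction": prediction,
        "confidence": confidence,
        "output": output,
        "authorized": output == 1,
    }
```

Calling `decide(samples, threshold=0.8)` with tightly clustered samples leaves the prediction unchanged, while widely spread samples push the confidence below the threshold and the flipped class is returned. In the claimed method the samples would come from a model fitted on the PCA-reduced leaf embedding; that upstream pipeline is omitted here.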