US 12,093,826 B2
Cutoff value optimization for bias mitigating machine learning training system with multi-class target
Xinmin Wu, Cary, NC (US); Ricky Dee Tharrington, Jr., Fuquay-Varina, NC (US); Ralph Walter Abbey, Cary, NC (US); and Xin Jiang Hunt, Cary, NC (US)
Assigned to SAS Institute Inc., Cary, NC (US)
Filed by SAS Institute Inc., Cary, NC (US)
Filed on Feb. 19, 2024, as Appl. No. 18/444,906.
Application 18/444,906 is a continuation in part of application No. 18/208,455, filed on Jun. 12, 2023, granted, now 11,922,311.
Application 18/208,455 is a continuation in part of application No. 18/051,906, filed on Nov. 2, 2022, granted, now 11,790,036.
Application 18/051,906 is a continuation in part of application No. 17/837,444, filed on Jun. 10, 2022, granted, now 11,531,845.
Application 17/837,444 is a continuation in part of application No. 17/557,298, filed on Dec. 21, 2021, granted, now 11,436,444.
Claims priority of provisional application 63/453,689, filed on Mar. 21, 2023.
Claims priority of provisional application 63/272,980, filed on Oct. 28, 2021.
Claims priority of provisional application 63/252,918, filed on Oct. 6, 2021.
Prior Publication US 2024/0193416 A1, Jun. 13, 2024
Int. Cl. G06N 3/08 (2023.01); G06N 5/022 (2023.01)
CPC G06N 3/08 (2013.01) [G06N 5/022 (2013.01)]
30 Claims
 
1. A non-transitory computer-readable medium having stored thereon computer-readable instructions that when executed by a computing device cause the computing device to:
(A) train a prediction model with a plurality of observation vectors, wherein each observation vector of the plurality of observation vectors includes a target variable value of a target variable, a sensitive attribute variable value of a sensitive attribute variable, and an attribute variable value for each attribute variable of a plurality of attribute variables, wherein the target variable has at least three possible unique values, wherein a predefined target event value indicates one of the at least three possible unique values of the target variable;
(B) execute the trained prediction model to define a predicted target variable value and a probability associated with an accuracy of the defined predicted target variable value for each observation vector of the plurality of observation vectors;
(C) compute a conditional moments matrix based on fairness constraints defined based on a fairness measure type, the predicted target variable value, and the sensitive attribute variable value of each respective observation vector of the plurality of observation vectors, wherein the predicted target variable value is identified as having the predefined target event value only when the probability associated with the accuracy is greater than a predefined event cutoff value;
(D) repeat (A) through (C) to train a fair prediction model to reduce a bias value in predicting the target variable value based on the computed conditional moments matrix until a first stop criterion indicates retraining of the fair prediction model is complete;
(E) compute an updated value for the predefined event cutoff value;
(F) repeat (A) through (E) with the updated value for the predefined event cutoff value as the predefined event cutoff value until a second stop criterion indicates updating the predefined event cutoff value is complete;
define an optimal event cutoff value from the predefined event cutoff values used when repeating (A) through (E); and
output the defined optimal event cutoff value and the trained fair prediction model trained using the defined optimal event cutoff value.
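As an illustration only, the cutoff rule recited in (C) and the outer cutoff search recited in (E) and (F) can be sketched in Python. This is a minimal sketch under stated assumptions and is not the claimed implementation: the names label_with_cutoff, parity_gap, search_cutoff, and train_fair_model are hypothetical; the fairness measure is assumed to be demographic parity on the target event; the cutoff update is assumed to be a walk over a fixed candidate list; the score (accuracy minus a weighted parity gap) is assumed; and the fallback to the most probable non-event value when the cutoff is not exceeded is assumed, since the claim does not state it. The fair retraining of (A) through (D), including the conditional moments matrix, is not implemented here and is represented by the train_fair_model callback.

```python
from typing import Callable, Hashable, List, Sequence


def label_with_cutoff(probs: Sequence[Sequence[float]],
                      event_index: int,
                      cutoff: float) -> List[int]:
    """Assign the event value only when its probability exceeds the cutoff.

    Assumption: otherwise the most probable non-event value is assigned.
    """
    labels = []
    for row in probs:
        if row[event_index] > cutoff:
            labels.append(event_index)
        else:
            others = [k for k in range(len(row)) if k != event_index]
            labels.append(max(others, key=lambda k: row[k]))
    return labels


def parity_gap(labels: Sequence[int],
               groups: Sequence[Hashable],
               event_index: int) -> float:
    """Largest difference in event rate between sensitive-attribute groups.

    Assumption: demographic parity stands in for the fairness measure type.
    """
    rates = []
    for g in set(groups):
        members = [lab for lab, s in zip(labels, groups) if s == g]
        rates.append(sum(lab == event_index for lab in members) / len(members))
    return max(rates) - min(rates)


def search_cutoff(train_fair_model: Callable[[float], Sequence[Sequence[float]]],
                  y: Sequence[int],
                  groups: Sequence[Hashable],
                  event_index: int,
                  cutoffs: Sequence[float],
                  weight: float = 1.0) -> float:
    """Try each candidate cutoff and keep the best-scoring one.

    train_fair_model(cutoff) is a hypothetical callback standing in for
    steps (A) through (D); it returns per-class probabilities per observation.
    """
    best_score, best_cutoff = None, None
    for cutoff in cutoffs:  # assumed update rule: next value in a fixed list
        probs = train_fair_model(cutoff)
        labels = label_with_cutoff(probs, event_index, cutoff)
        accuracy = sum(p == t for p, t in zip(labels, y)) / len(y)
        score = accuracy - weight * parity_gap(labels, groups, event_index)
        if best_score is None or score > best_score:
            best_score, best_cutoff = score, cutoff
    return best_cutoff
```

In this sketch the second stop criterion is simply exhaustion of the candidate list; a search that proposes each new cutoff from earlier scores would fit the same loop shape.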