US 11,790,036 B2
Bias mitigating machine learning training system
Xinmin Wu, Cary, NC (US); Xin Jiang Hunt, Cary, NC (US); and Ralph Walter Abbey, Cary, NC (US)
Assigned to SAS Institute Inc., Cary, NC (US)
Filed by SAS Institute Inc., Cary, NC (US)
Filed on Nov. 2, 2022, as Appl. No. 18/051,906.
Application 18/051,906 is a continuation in part of application No. 17/837,444, filed on Jun. 10, 2022, granted, now 11,531,845.
Application 17/837,444 is a continuation in part of application No. 17/557,298, filed on Dec. 21, 2021, granted, now 11,436,444.
Claims priority of provisional application 63/272,980, filed on Oct. 28, 2021.
Claims priority of provisional application 63/252,918, filed on Oct. 6, 2021.
Prior Publication US 2023/0205839 A1, Jun. 29, 2023
Int. Cl. G06F 17/16 (2006.01); G06F 17/18 (2006.01); G06N 3/08 (2023.01)
CPC G06F 17/16 (2013.01) [G06F 17/18 (2013.01); G06N 3/08 (2013.01)]
30 Claims
 
1. A non-transitory computer-readable medium having stored thereon computer-readable instructions that when executed by a computing device cause the computing device to:
train a prediction model with a plurality of observation vectors, wherein each observation vector of the plurality of observation vectors includes a target variable value of a target variable, a sensitive attribute variable value of a sensitive attribute variable, and a plurality of attribute variable values of a plurality of attribute variables;
define a predicted target variable by predicting a second target variable value for each observation vector of the plurality of observation vectors using the trained prediction model;
initialize a bound value using a predefined bound value;
(A) initialize a number of iterations;
(B) assign a weight value to each observation vector of the plurality of observation vectors based on the predicted second target variable value and the sensitive attribute variable value of each respective observation vector of the plurality of observation vectors and on fairness constraints defined for an equalized odds fairness measure type;
(C) train the prediction model with each observation vector of the plurality of observation vectors weighted by a respective assigned weight value;
(D) update the predicted target variable by predicting the second target variable value for each observation vector of the plurality of observation vectors using the prediction model trained in (C);
(E) compute a true conditional moments matrix and a false conditional moments matrix based on the fairness constraints and the second target variable value predicted in (D) and the sensitive attribute variable value of each respective observation vector of the plurality of observation vectors, wherein the true conditional moments matrix is associated with a true positive rate (TPR), and the false conditional moments matrix is associated with a false positive rate (FPR);
(F) increment the initialized number of iterations;
(G) repeat (B) through (F) until a predefined number of bound test update iterations is performed based on the incremented number of iterations;
(H) when the computed conditional moments matrix indicates to adjust the bound value, update the bound value based on an upper bound value or a lower bound value, and repeat (A) through (G) with the bound value replaced with the updated bound value until the computed conditional moments matrix indicates no further adjustment of the bound value is needed;
(I) train a fair prediction model with the updated bound value computed in (H); and
output the trained fair prediction model.
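
The control flow of steps (A) through (I) can be illustrated with a minimal Python sketch. It is not the claimed implementation, and every name and formula in it is an assumption made for illustration: the function names (`fit_weighted_logistic`, `assign_weights`, `conditional_rates`, `train_fair`), the parameters `eta`, `tol`, and `upper`, the use of logistic regression as the prediction model, the exponential weight rule, and the midpoint bound update. The per-group TPR and FPR gap vectors stand in for the true and false conditional moments matrices, whose exact form the claim does not give.

```python
import numpy as np

def fit_weighted_logistic(X, y, w, steps=300, lr=0.5):
    """Weighted logistic regression by gradient descent (stand-in prediction model)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append an intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ beta)))
        beta -= lr * (Xb.T @ (w * (p - y))) / w.sum()
    return beta

def predict(beta, X):
    """Predict a 0/1 target value for each observation vector."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return (Xb @ beta > 0).astype(int)

def conditional_rates(y, y_hat, s):
    """Per-group TPR and FPR minus the overall TPR and FPR (equalized odds gaps).

    Assumption: these gap vectors stand in for the claim's conditional moments matrices.
    """
    groups = np.unique(s)
    tpr_gap = np.zeros(len(groups))
    fpr_gap = np.zeros(len(groups))
    for k, g in enumerate(groups):
        tpr_gap[k] = y_hat[(s == g) & (y == 1)].mean() - y_hat[y == 1].mean()
        fpr_gap[k] = y_hat[(s == g) & (y == 0)].mean() - y_hat[y == 0].mean()
    return groups, tpr_gap, fpr_gap

def assign_weights(y, s, groups, tpr_gap, fpr_gap, bound, eta):
    """Hypothetical weight rule: multipliers are clipped to [-bound, bound]."""
    w = np.ones(len(y), dtype=float)
    for k, g in enumerate(groups):
        lam_tpr = np.clip(eta * tpr_gap[k], -bound, bound)
        lam_fpr = np.clip(eta * fpr_gap[k], -bound, bound)
        w[(s == g) & (y == 1)] *= np.exp(-lam_tpr)  # low group TPR: heavier positives
        w[(s == g) & (y == 0)] *= np.exp(lam_fpr)   # high group FPR: heavier negatives
    return w

def train_fair(X, y, s, bound=1.0, upper=8.0, n_iter=5, eta=2.0, tol=0.05,
               max_bound_updates=4):
    beta = fit_weighted_logistic(X, y, np.ones(len(y)))  # initial unweighted training
    y_hat = predict(beta, X)                             # initial predicted target
    for _ in range(max_bound_updates):
        t = 0                                                                # (A)
        groups, tpr_gap, fpr_gap = conditional_rates(y, y_hat, s)
        while t < n_iter:                                                    # (G)
            w = assign_weights(y, s, groups, tpr_gap, fpr_gap, bound, eta)   # (B)
            beta = fit_weighted_logistic(X, y, w)                            # (C)
            y_hat = predict(beta, X)                                         # (D)
            groups, tpr_gap, fpr_gap = conditional_rates(y, y_hat, s)        # (E)
            t += 1                                                           # (F)
        worst_gap = max(np.abs(tpr_gap).max(), np.abs(fpr_gap).max())
        if worst_gap <= tol:                                                 # (H)
            break
        bound = 0.5 * (bound + upper)   # assumed update: move toward the upper bound
    return beta, bound                  # (I): last model trained under the final bound
```

Calling `train_fair(X, y, s)` with a feature matrix `X`, a 0/1 target `y`, and a group label array `s` returns the fitted coefficients and the final bound value. Each sensitive-attribute group must contain both positive and negative targets, otherwise the group rates are undefined.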