US 11,966,826 B2
Bias reduction in machine learning model training and inference
Christopher Lam, Apex, NC (US)
Assigned to Epistamai LLC, Apex, NC (US)
Filed by Epistamai LLC, Apex, NC (US)
Filed on Nov. 14, 2022, as Appl. No. 18/055,031.
Application 18/055,031 is a continuation of application No. 18/051,134, filed on Oct. 31, 2022.
Claims priority of provisional application 63/365,905, filed on Jun. 6, 2022.
Prior Publication US 2023/0394358 A1, Dec. 7, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 20/00 (2019.01)

CPC G06N 20/00 (2019.01)

19 Claims

1. A method of training and applying a prediction model comprising:

determining a trained prediction model by training a prediction model based on training data including a plurality of training observations, each of the plurality of training observations including a respective plurality of training data values corresponding with a plurality of features, each of the plurality of training observations also including a respective target value, each of the plurality of training observations also including a respective protected attribute value corresponding with a protected attribute feature selected from the group consisting of: race, ethnicity, sex, gender, national origin, religion, disability status, age, genetic information, marital status, and receipt of public assistance, wherein determining the trained prediction model includes:

determining one or more default protected attribute values for the prediction model,

determining an overlap profile between the protected attribute feature and a designated feature of the plurality of features, the overlap profile indicating a respective degree of overlap among the plurality of training observations between first selected values corresponding to the protected attribute feature and second selected values corresponding to the designated feature,

determining based on the overlap profile that a designated one of the respective degrees of overlap indicates a positivity violation, and

identifying one or more value replacement rules for correcting the positivity violation by replacing a feature value;

receiving via a communication interface a request to determine a designated predicted target value for a designated inference observation after determining the one or more default protected attribute values, the designated inference observation including a designated plurality of inference data values corresponding with the plurality of features;

updating the designated inference observation in memory to include a replacement data value determined based on the one or more value replacement rules and a designated default protected attribute value of the one or more default protected attribute values;

determining the designated predicted target value via a processor by applying the prediction model to the updated designated inference observation including the replacement data value and the designated default protected attribute value; and

storing the predicted target value on a storage device.
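
The overlap-profile, positivity-violation, and value-replacement steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it only shows one way such steps might fit together. All function names, the `min_count` threshold, the toy `sex` and `job` features, and the choice of the most frequently co-occurring feature value as the replacement are assumptions made for illustration. The sketch also assumes a single default protected attribute value, and it omits applying the prediction model and storing the result.

```python
from collections import Counter


def overlap_profile(observations, protected, feature):
    """Count training observations for every (protected value, feature value) pair."""
    protected_values = sorted({obs[protected] for obs in observations})
    feature_values = sorted({obs[feature] for obs in observations})
    counts = Counter((obs[protected], obs[feature]) for obs in observations)
    # Counter returns 0 for pairs that never occur together.
    return {(p, f): counts[(p, f)] for p in protected_values for f in feature_values}


def positivity_violations(profile, min_count=1):
    """Pairs whose overlap falls below the (assumed) minimum count."""
    return sorted(pair for pair, n in profile.items() if n < min_count)


def replacement_rules(profile, default_value, min_count=1):
    """Map each feature value lacking overlap with the default protected value
    to the feature value seen most often with that default (an assumed policy)."""
    supported = {f: n for (p, f), n in profile.items()
                 if p == default_value and n >= min_count}
    if not supported:
        return {}
    fallback = max(supported, key=lambda f: (supported[f], f))
    return {f: fallback for (p, f), n in profile.items()
            if p == default_value and n < min_count}


def update_inference_observation(observation, protected, feature, default_value, rules):
    """Return a copy with the default protected value and any replaced feature value."""
    updated = dict(observation)
    updated[protected] = default_value
    updated[feature] = rules.get(updated[feature], updated[feature])
    return updated


# Hypothetical toy training data: no "F" observation has job "miner".
training = [
    {"sex": "F", "job": "nurse"},
    {"sex": "F", "job": "nurse"},
    {"sex": "F", "job": "engineer"},
    {"sex": "M", "job": "engineer"},
    {"sex": "M", "job": "miner"},
]
profile = overlap_profile(training, "sex", "job")
violations = positivity_violations(profile)
rules = replacement_rules(profile, default_value="F")
updated = update_inference_observation(
    {"sex": "M", "job": "miner"}, "sex", "job", "F", rules
)
```

In this toy data the inference observation's protected attribute is set to the default value, and its `job` value is replaced because the original value never co-occurs with that default in training. A prediction model would then be applied to `updated` rather than to the original observation.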