US 12,481,912 B2
Machine learning model bias detection
Elhanan Mishraky, Nofim (IL); Aviv Ben Arie, Ramat Gan (IL); Natalie Grace De Shetler, Los Angeles, CA (US); Shir Meir Lador, Givat Shmuel (IL); and Yair Horesh, Kfar-Saba (IL)
Filed by INTUIT INC., Mountain View, CA (US)
Filed on Apr. 30, 2021, as Appl. No. 17/245,122.
Prior Publication US 2022/0351068 A1, Nov. 3, 2022
Int. Cl. G06N 20/00 (2019.01); G06N 5/04 (2023.01)
CPC G06N 20/00 (2019.01) [G06N 5/04 (2013.01)] 17 Claims
OG exemplary drawing
 
1. A method for detecting latent bias in machine learning models, comprising:
receiving a data set comprising features of a plurality of individuals;
receiving identifying information comprising an email or a username for each individual of the plurality of individuals;
automatically extracting a name from the email or the username for each individual of the plurality of individuals based on applying a regular expression to the email or the username;
predicting, for each respective individual of the plurality of individuals, a probability that the respective individual has a given protected attribute based on a stored association between the automatically extracted name for the respective individual and the given protected attribute;
providing, as inputs to a machine learning model, the features of the plurality of individuals from the data set, wherein the machine learning model has been trained using a supervised learning process comprising determining an accuracy of the machine learning model, by comparing output predictions generated in response to training inputs provided to the machine learning model to labels associated with the training inputs, and iteratively adjusting parameters of the machine learning model based on the comparing until one or more conditions are met;
receiving outputs from the machine learning model in response to the inputs;
determining whether the machine learning model is biased against the given protected attribute based on the outputs and the probability that each respective individual of the plurality of individuals has the given protected attribute, wherein the determining comprises comparing the outputs to a threshold value and evaluating the outputs based on labels associated with false positive and false negative results; and
generating a report based on the determining, wherein the report comprises an indication of whether the machine learning model is biased against the given protected attribute.
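The name-extraction and attribute-prediction steps of the claim can be sketched as follows. This is a minimal illustration, not the patented implementation: the regular expression, the example names, and the `NAME_TO_PROB` lookup table are all hypothetical stand-ins for the claimed "stored association" between names and a protected attribute.

```python
import re
from typing import Optional

# Illustrative regex: capture a leading alphabetic token from the local part
# of an email or from a username (e.g. "jane.doe42@example.com" -> "jane").
NAME_PATTERN = re.compile(r"^([a-zA-Z]+)")

# Hypothetical stored association mapping an extracted name to the probability
# of a given protected attribute; a real system would back this with
# population statistics rather than hard-coded values.
NAME_TO_PROB = {
    "jane": 0.92,
    "john": 0.03,
}

def extract_name(identifier: str) -> Optional[str]:
    """Automatically extract a candidate name from an email or username."""
    local_part = identifier.split("@", 1)[0]
    match = NAME_PATTERN.match(local_part)
    return match.group(1).lower() if match else None

def predict_protected_attribute(identifier: str, default: float = 0.5) -> float:
    """Probability the individual has the protected attribute, via name lookup."""
    name = extract_name(identifier)
    return NAME_TO_PROB.get(name, default) if name else default
```

When no name can be extracted (e.g. a purely numeric username), the sketch falls back to an uninformative default probability rather than failing.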
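The supervised learning process recited in the claim (compare outputs to labels, measure accuracy, adjust parameters iteratively until a condition is met) can be illustrated with a toy single-feature logistic classifier. The model form, learning rate, and stopping condition here are assumptions for illustration only; the claim does not specify any of them.

```python
import math

def train(features, labels, lr=0.5, target_acc=0.95, max_iters=1000):
    """Toy supervised loop: predict, score accuracy against labels, and take
    gradient steps on the parameters until target accuracy or the iteration
    cap is reached (the claim's 'one or more conditions')."""
    w, b = 0.0, 0.0
    acc = 0.0
    for _ in range(max_iters):
        # Output predictions in response to the training inputs.
        preds = [1 / (1 + math.exp(-(w * x + b))) for x in features]
        # Determine accuracy by comparing predictions to labels.
        acc = sum((p >= 0.5) == bool(y) for p, y in zip(preds, labels)) / len(labels)
        if acc >= target_acc:  # stopping condition met
            break
        # Iteratively adjust parameters based on the comparison (log-loss gradient).
        for x, y, p in zip(features, labels, preds):
            w += lr * (y - p) * x
            b += lr * (y - p)
    return w, b, acc
```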
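The bias-determination step can likewise be sketched: binarize the model outputs at a threshold, compute false-positive and false-negative rates weighted by each individual's probability of having the protected attribute, compare against the complement group, and summarize the result as a report. The tolerance value and the rate-gap criterion are illustrative assumptions; the claim only requires comparing outputs to a threshold and evaluating false positives and false negatives.

```python
def bias_report(outputs, labels, group_probs, threshold=0.5, tolerance=0.1):
    """Return a simple report indicating whether the model appears biased
    against the protected attribute, using probability-weighted FPR/FNR gaps."""
    def weighted_rates(weights):
        fp = fn = pos = neg = 0.0
        for score, label, w in zip(outputs, labels, weights):
            pred = score >= threshold  # compare output to the threshold value
            if label:
                pos += w
                if not pred:
                    fn += w  # weighted false negative
            else:
                neg += w
                if pred:
                    fp += w  # weighted false positive
        return (fp / neg if neg else 0.0), (fn / pos if pos else 0.0)

    # Weight each individual by P(protected attribute) and its complement,
    # since group membership is only known probabilistically.
    fpr_g, fnr_g = weighted_rates(group_probs)
    fpr_c, fnr_c = weighted_rates([1 - p for p in group_probs])
    biased = abs(fpr_g - fpr_c) > tolerance or abs(fnr_g - fnr_c) > tolerance
    return {"biased": biased, "fpr_gap": fpr_g - fpr_c, "fnr_gap": fnr_g - fnr_c}
```

In this sketch, a large gap in error rates between the probable-group and probable-complement populations yields `biased: True` in the returned report.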