US 12,456,061 B2
Self-monitoring cognitive bias mitigator in predictive systems
Debasis Ganguly, Dublin (IE)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Dec. 23, 2020, as Appl. No. 17/132,786.
Prior Publication US 2022/0198297 A1, Jun. 23, 2022
Int. Cl. G06N 5/04 (2023.01); G06N 20/00 (2019.01)
CPC G06N 5/04 (2013.01) [G06N 20/00 (2019.01)]
20 Claims
 
1. A system, comprising:
a memory that stores computer executable components; and
a processor that executes at least one of the computer executable components that:
receives a set of data comprising data items, a primary classification task associated with the data items, primary task labels associated with the primary classification task, secondary-identity attributes associated with the data items, and respective categories for the secondary-identity attributes, wherein a machine learning model is trained to perform the primary classification task with primary task variables, and wherein the machine learning model has a structure comprising one or more layers associated with the primary classification task;
trains, using the set of data, the machine learning model to generate predictions with respect to the primary classification task associated with labeling the data items respectively with the primary task labels, and mitigate bias from the predictions, wherein the training comprises:
generating, using the machine learning model and the set of data, one or more of the predictions for the primary classification task associated with labeling one or more of the data items respectively with the primary task labels,
clustering, using a defined clustering process, the data items into clusters, wherein each cluster is associated with a distinct combination of the secondary-identity attributes and the respective categories of the secondary-identity attributes,
identifying one or more secondary-identity attributes that have bias based on a function of non-uniformity in distribution of respective posteriors of the clusters and respective mappings of the one or more of the predictions to the clusters,
generating one or more pseudo-task variables respectively associated with the one or more secondary-identity attributes that have the bias,
generating a pseudo-bias classification task associated with the one or more pseudo-task variables for mitigating the bias from the predictions,
modifying the machine learning model to concurrently perform the primary classification task and the pseudo-bias classification task, wherein the modifying comprises changing the structure of the machine learning model to comprise a first portion of the machine learning model associated with the primary classification task and primary task variables of the primary classification task, and a second portion of the machine learning model associated with the pseudo-bias classification task and the one or more pseudo-task variables, and
training, using a multi-objective loss function, the machine learning model to mitigate the bias from the predictions by concurrently generating, using the machine learning model, for each data item of the set of data:
a prediction with respect to the primary classification task associated with labeling the data item with one of the primary task labels at above a first defined performance threshold of the multi-objective loss function that increases classification effectiveness of the machine learning model for the primary classification task, and
one or more additional predictions for the data item with respect to the pseudo-bias classification task associated with the one or more pseudo-task variables at below a second defined performance threshold of the multi-objective loss function that reduces classification effectiveness of the machine learning model for the pseudo-bias classification task, wherein generating the one or more additional predictions at below the second defined performance threshold trains the machine learning model to mitigate the bias in the predictions associated with the one or more secondary-identity attributes determined to have the bias.
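
The clustering, bias-identification, and multi-objective training steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and everything in it is an assumption made for illustration: the attribute names (gender, age), the use of the positive-prediction rate per category as a stand-in for the cluster posteriors, the max-minus-min spread as the "function of non-uniformity", the threshold value, and the form of the combined loss (primary loss minus a weighted pseudo-bias loss, so that minimizing it favors good primary-task predictions and poor pseudo-bias predictions).

```python
from collections import defaultdict

def cluster_by_combination(items, attrs):
    """Group item indices by their distinct combination of attribute categories."""
    clusters = defaultdict(list)
    for index, item in enumerate(items):
        clusters[tuple(item[a] for a in attrs)].append(index)
    return dict(clusters)

def positive_rate_by_category(clusters, predictions, attrs, attr):
    """Pool clusters that share a category of `attr` and return, per category,
    the fraction of items predicted positive (assumed proxy for a posterior)."""
    pos = attrs.index(attr)
    hits, totals = defaultdict(int), defaultdict(int)
    for combo, indices in clusters.items():
        category = combo[pos]
        hits[category] += sum(predictions[i] for i in indices)
        totals[category] += len(indices)
    return {c: hits[c] / totals[c] for c in totals}

def non_uniformity(rates):
    """Assumed non-uniformity measure: spread between the highest and lowest rate."""
    values = list(rates.values())
    return max(values) - min(values)

def biased_attributes(items, predictions, attrs, threshold):
    """Return the attributes whose per-category rates are spread wider than `threshold`."""
    clusters = cluster_by_combination(items, attrs)
    return [
        a for a in attrs
        if non_uniformity(positive_rate_by_category(clusters, predictions, attrs, a)) > threshold
    ]

def multi_objective_loss(primary_loss, pseudo_losses, weight):
    """Assumed combined objective: lower is better, so a higher pseudo-bias loss
    (worse prediction of the biased attribute) is rewarded."""
    return primary_loss - weight * sum(pseudo_losses)

# Hypothetical toy data: binary primary-task predictions for six items.
attrs = ["gender", "age"]
items = [
    {"gender": "f", "age": "young"},
    {"gender": "f", "age": "old"},
    {"gender": "m", "age": "young"},
    {"gender": "m", "age": "old"},
    {"gender": "f", "age": "young"},
    {"gender": "m", "age": "old"},
]
predictions = [1, 1, 0, 0, 1, 0]

clusters = cluster_by_combination(items, attrs)
flagged = biased_attributes(items, predictions, attrs, threshold=0.5)
```

In this toy data the predictions split entirely along the hypothetical gender attribute, so only that attribute exceeds the assumed threshold; in the claimed system, each flagged attribute would then receive a pseudo-task variable and a corresponding second portion of the model, trained with the multi-objective loss.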