US 12,175,344 B2
Enforcing fairness on unlabeled data to improve modeling performance
Michael Louis Wick, Burlington, MA (US); Swetasudha Panda, Burlington, MA (US); and Jean-Baptiste Frederic George Tristan, Burlington, MA (US)
Assigned to Oracle International Corporation, Redwood City, CA (US)
Filed by Oracle International Corporation, Redwood City, CA (US)
Filed on Aug. 22, 2023, as Appl. No. 18/453,929.
Application 18/453,929 is a continuation of application No. 16/781,945, filed on Feb. 4, 2020, granted, now 11,775,863.
Claims priority of provisional application 62/851,481, filed on May 22, 2019.
Prior Publication US 2023/0394371 A1, Dec. 7, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 20/00 (2019.01); G06N 3/088 (2023.01)
CPC G06N 20/00 (2019.01) [G06N 3/088 (2013.01)]
20 Claims
1. A method, comprising:
training, by a machine learning system comprising at least one processor and a memory, a classifier that, when applied to one or more data sets, determines classifications of one or more labels for the one or more data sets, the training comprising:
labeling unlabeled data according to a specified amount of bias to generate labeled data comprising the specified amount of bias, wherein the specified amount of bias comprises one or more of a specified amount of label bias and a specified amount of selection bias in at least one dimension of a plurality of label dimensions;
generating a training data set comprising samples of the labeled data and additional unlabeled data; and
training the classifier according to the generated training data set and training parameters comprising an indication of the specified amount of bias.
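
The three steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and every name in it is hypothetical: `inject_bias`, `train`, `bias_amount`, and `penalty_scale` do not appear in the patent. The sketch assumes a binary label, a single binary group attribute as the biased dimension, label bias modeled as flipping positive labels of one group to negative, selection bias modeled as dropping positive examples of that group, and a logistic-regression classifier whose fairness term (a squared gap in mean predicted positive rate between groups, measured on the unlabeled data) is weighted in proportion to the specified amount of bias.

```python
import numpy as np


def inject_bias(X, y_true, group, label_bias, selection_bias, rng):
    """Label data with a specified amount of bias (hypothetical helper).

    label_bias: probability of flipping a positive label of group 1 to 0.
    selection_bias: probability of dropping a positive example of group 1.
    """
    y = y_true.copy()
    target = (group == 1) & (y_true == 1)
    # Label bias: some positives of group 1 are recorded as negatives.
    flip = target & (rng.random(len(y)) < label_bias)
    y[flip] = 0
    # Selection bias: some positives of group 1 never enter the data set.
    drop = target & (rng.random(len(y)) < selection_bias)
    keep = ~drop
    return X[keep], y[keep], group[keep]


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train(X_lab, y_lab, X_unl, g_unl, bias_amount,
          penalty_scale=10.0, lr=0.1, steps=500):
    """Logistic regression with a fairness penalty on unlabeled data.

    Assumption: the penalty weight is penalty_scale * bias_amount, so the
    training parameters carry an indication of the specified bias.
    """
    weight = penalty_scale * bias_amount
    w = np.zeros(X_lab.shape[1])
    m1 = g_unl == 1
    m0 = ~m1
    for _ in range(steps):
        # Gradient of the mean log loss on the (biased) labeled samples.
        p_lab = sigmoid(X_lab @ w)
        grad = X_lab.T @ (p_lab - y_lab) / len(y_lab)
        # Gradient of weight * gap**2, where gap is the difference in
        # mean predicted positive rate between the two groups.
        p_unl = sigmoid(X_unl @ w)
        s = p_unl * (1.0 - p_unl)
        gap = p_unl[m1].mean() - p_unl[m0].mean()
        d_gap = ((X_unl[m1] * s[m1, None]).mean(axis=0)
                 - (X_unl[m0] * s[m0, None]).mean(axis=0))
        grad += weight * 2.0 * gap * d_gap
        w -= lr * grad
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000
    group = rng.integers(0, 2, n)
    feat = rng.normal(size=n)
    X = np.column_stack([np.ones(n), feat, group])
    y_true = (feat + 0.3 * rng.normal(size=n) > 0).astype(float)

    # Step 1: label the first half of the pool with the specified bias.
    X_lab, y_lab, _ = inject_bias(X[:1000], y_true[:1000], group[:1000],
                                  label_bias=0.4, selection_bias=0.2, rng=rng)
    # Step 2: the training set is the labeled samples plus unlabeled data.
    X_unl, g_unl = X[1000:], group[1000:]
    # Step 3: train with a parameter indicating the specified bias.
    w = train(X_lab, y_lab, X_unl, g_unl, bias_amount=0.4)
    print("learned weights:", w)
```

The unlabeled rows contribute only through the fairness term, so no labels are needed for them; whether this particular penalty matches the training parameters contemplated by the claim is an assumption of the sketch.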