US 11,734,585 B2
Post-hoc improvement of instance-level and group-level prediction metrics
Manish Bhide, Hyderabad (IN); Pranay Lohia, Hyderabad (IN); Karthikeyan Natesan Ramamurthy, Yorktown Heights, NY (US); Ruchir Puri, Yorktown Heights, NY (US); Diptikalyan Saha, Bangalore (IN); and Kush Raj Varshney, Yorktown Heights, NY (US)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Dec. 10, 2018, as Appl. No. 16/214,703.
Prior Publication US 2020/0184350 A1, Jun. 11, 2020
Int. Cl. G06N 5/04 (2023.01); G06N 20/00 (2019.01)
CPC G06N 5/04 (2013.01) [G06N 20/00 (2019.01)]
17 Claims
 
1. A post-processing computer-implemented method for post-hoc improvement of instance-level and group-level prediction metrics, the post-processing method comprising:
training a bias detector on a payload data that learns to detect a sample in a customer model that has an individual bias greater than a predetermined individual bias threshold value with constraints on a group bias, the sample being a member of an unprivileged group, wherein, during the training:
the bias detector perturbs a protected attribute in the payload data for the unprivileged group and computes the individual bias as an individual bias score by finding a difference between a probability of a favorable outcome for the perturbed protected attribute to original data of the payload data;
flagging the unprivileged group samples that have the individual bias greater than the predetermined individual bias threshold value; and
training the bias detector to discriminate between the flagged samples and un-flagged samples;
applying, in a run-time, the bias detector on a run-time sample to select a biased sample in the run-time sample having an individual bias greater than the predetermined individual bias threshold value; and
suggesting, in the run-time, a de-biased prediction for the biased sample by perturbing the protected attribute and checking for bias after perturbation.
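
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy `customer_model`, the single non-protected feature `score`, the threshold value 0.1, the 0.5 decision cut-off, and the 1-nearest-neighbour detector are all assumptions chosen for illustration. The sketch also omits the group-bias constraint, and it reads "checking for bias after perturbation" as checking whether the predicted outcome changes once the protected attribute is perturbed, which is only one possible interpretation.

```python
import math

PRIVILEGED, UNPRIVILEGED = 1, 0
BIAS_THRESHOLD = 0.1  # hypothetical individual bias threshold


def customer_model(protected, score):
    """Toy stand-in for the customer model: returns P(favorable outcome)."""
    z = 10.0 * (score - 0.5) + 1.0 * protected
    return 1.0 / (1.0 + math.exp(-z))


def individual_bias(protected, score):
    """Perturb the protected attribute; return the change in P(favorable)."""
    perturbed = PRIVILEGED if protected == UNPRIVILEGED else UNPRIVILEGED
    return customer_model(perturbed, score) - customer_model(protected, score)


def flag_samples(payload):
    """Label each unprivileged sample: True if its bias exceeds the threshold."""
    return [
        (score, individual_bias(protected, score) > BIAS_THRESHOLD)
        for protected, score in payload
        if protected == UNPRIVILEGED
    ]


def train_bias_detector(labelled):
    """Build a 1-nearest-neighbour detector from (score, is_biased) pairs."""
    def detector(protected, score):
        if protected != UNPRIVILEGED or not labelled:
            return False
        _, is_biased = min(labelled, key=lambda pair: abs(pair[0] - score))
        return is_biased
    return detector


def suggest_prediction(detector, protected, score):
    """Return (favorable, was_debiased) for one run-time sample."""
    original = customer_model(protected, score) >= 0.5
    if not detector(protected, score):
        return original, False
    # Perturb the protected attribute and check whether the outcome changes.
    perturbed = customer_model(PRIVILEGED, score) >= 0.5
    if perturbed != original:
        return perturbed, True
    return original, False


# Training phase on hypothetical payload data: (protected, score) pairs.
payload = [(g, i / 10) for g in (UNPRIVILEGED, PRIVILEGED) for i in range(11)]
labelled = flag_samples(payload)
detector = train_bias_detector(labelled)

# Run-time phase on three hypothetical samples.
for sample in [(UNPRIVILEGED, 0.45), (UNPRIVILEGED, 0.10), (PRIVILEGED, 0.45)]:
    print(sample, suggest_prediction(detector, *sample))
```

In this toy setting, only unprivileged samples near the decision boundary are flagged, because flipping the protected attribute shifts their favorable-outcome probability the most. The detector is trained on the flagged and un-flagged labels rather than on the bias score itself, so at run time it can select likely biased samples without first perturbing each one and re-querying the customer model. A flagged sample receives the perturbed prediction only when that prediction differs from the original.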