US 11,657,222 B1
Confidence calibration using pseudo-accuracy
Peter Caton Anthony, Menlo Park, CA (US)
Assigned to Intuit Inc., Mountain View, CA (US)
Filed by Intuit Inc., Mountain View, CA (US)
Filed on Nov. 23, 2022, as Appl. No. 17/993,089.
Int. Cl. G06F 40/216 (2020.01); G06F 40/279 (2020.01); G06N 3/08 (2023.01)
CPC G06F 40/216 (2020.01) [G06F 40/279 (2020.01); G06N 3/08 (2013.01)]
20 Claims
1. A method of training a machine learning model to predict pseudo-accuracies for strings extracted by a document extraction model, the method performed by an electronic device coupled to the machine learning model and comprising:
  receiving a plurality of first outputs from the document extraction model and a corresponding ground truth value associated with each first output, each first output of the plurality of first outputs comprising an extracted string and a raw confidence score;
  determining, for each first output of the plurality of first outputs, an accuracy metric based at least in part on the extracted string and the ground truth value associated with the respective first output;
  for each extracted string of the plurality of first outputs:
    determining a similarity metric between the respective extracted string and each other extracted string of the plurality of first outputs; and
    determining a pseudo-accuracy for the respective extracted string based at least in part on the determined similarity metrics and the determined accuracy metrics;
  generating training data based at least in part on the determined pseudo-accuracies and the plurality of first outputs; and
  training the machine learning model, based on the training data, to predict pseudo-accuracies associated with subsequent outputs from the document extraction model.
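
The steps recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not specify how any of the metrics are computed, so every concrete choice here is an assumption made for illustration only: the accuracy metric is exact match, the similarity metric is the `difflib` sequence ratio, the pseudo-accuracy is a similarity-weighted mean of the accuracy metrics of the other extracted strings, and the "machine learning model" is a one-feature least-squares line over the raw confidence score. The function names (`accuracy_metric`, `similarity`, `pseudo_accuracies`, `fit_linear`, `train`, `predict`) are hypothetical and do not come from the patent.

```python
from difflib import SequenceMatcher


def accuracy_metric(extracted: str, truth: str) -> float:
    # Assumption: exact-match accuracy; the claim leaves the metric open.
    return 1.0 if extracted == truth else 0.0


def similarity(a: str, b: str) -> float:
    # Assumption: difflib's ratio in [0, 1] stands in for the similarity metric.
    return SequenceMatcher(None, a, b).ratio()


def pseudo_accuracies(strings, accuracies):
    # Assumption: each pseudo-accuracy is the similarity-weighted mean of the
    # accuracy metrics of every *other* extracted string.
    result = []
    for i, s in enumerate(strings):
        weights = [similarity(s, t) for j, t in enumerate(strings) if j != i]
        others = [a for j, a in enumerate(accuracies) if j != i]
        total = sum(weights)
        if total == 0:
            # No similar strings at all: fall back to the string's own accuracy.
            result.append(accuracies[i])
        else:
            result.append(sum(w * a for w, a in zip(weights, others)) / total)
    return result


def fit_linear(xs, ys):
    # Ordinary least squares for one feature; returns (slope, intercept).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return slope, mean_y - slope * mean_x


def train(outputs, truths):
    # outputs: list of (extracted_string, raw_confidence) pairs.
    strings = [s for s, _ in outputs]
    confidences = [c for _, c in outputs]
    accuracies = [accuracy_metric(s, t) for s, t in zip(strings, truths)]
    targets = pseudo_accuracies(strings, accuracies)
    # Training data: raw confidence as the feature, pseudo-accuracy as the target.
    training_data = list(zip(confidences, targets))
    return fit_linear([x for x, _ in training_data],
                      [y for _, y in training_data])


def predict(slope: float, intercept: float, raw_confidence: float) -> float:
    # Clip the prediction to [0, 1] so it reads as a calibrated confidence.
    return min(1.0, max(0.0, slope * raw_confidence + intercept))


# Made-up example: four extractions of the same field, two of them misread.
outputs = [("1234.56", 0.95), ("1234.56", 0.90),
           ("1284.56", 0.60), ("l2E4.S6", 0.30)]
truths = ["1234.56"] * 4
slope, intercept = train(outputs, truths)
calibrated = predict(slope, intercept, 0.8)
```

In this sketch a near-miss such as "1284.56" receives a pseudo-accuracy between 0 and 1 rather than a hard 0, because it is similar to strings that were extracted correctly. A practical model would likely use richer features than the raw confidence alone; the single-feature line is only a stand-in.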