| CPC G06V 30/416 (2022.01) [G06F 18/2113 (2023.01); G06N 3/088 (2013.01); G06V 30/19013 (2022.01)] | 15 Claims |

|
1. A method comprising:
for each extraction model of a plurality of extraction models,
generating, by the extraction model, a plurality of key identifier, key value pairs for the extraction model for a possible training document image, and
comparing the plurality of key identifier, key value pairs for the extraction model with a plurality of key identifier, key value pairs in ground truth information defined for the possible training document image to obtain a combined accuracy level for the extraction model;
aggregating the combined accuracy level across the plurality of extraction models to obtain an oracle accuracy for the possible training document image;
selecting the possible training document image as a training document image when the oracle accuracy is greater than an accuracy threshold;
executing an anomaly detection model on a first plurality of features obtained for the training document image to obtain reconstructed input;
determining a reconstruction error based at least in part on the reconstructed input;
updating a plurality of weights of the anomaly detection model using the reconstruction error;
extracting a plurality of image features of a document image of a document, wherein extracting the plurality of image features comprises:
executing a convolutional neural network-based architecture on the document image to obtain an image embedding model feature, wherein the plurality of image features comprises the image embedding model feature;
executing an optical character recognition (OCR) engine on the document image to obtain OCR output;
extracting a plurality of OCR features from the OCR output;
executing, after updating the plurality of weights, the anomaly detection model using a second plurality of features comprising the plurality of OCR features and the plurality of image features to generate an anomaly score; and
presenting the anomaly score.
|
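As an illustration only, the training-image selection and anomaly-scoring steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. The claim does not fix the accuracy measure, the aggregation, or the model architecture, so exact-match accuracy, the mean as the aggregation, and a tied-weight linear autoencoder with one hidden unit are all assumptions. Every function name is hypothetical. The feature vectors are toy two-element lists standing in for the OCR and image features.

```python
from statistics import mean


def combined_accuracy(predicted, ground_truth):
    """Fraction of ground-truth keys whose value the model reproduced exactly
    (exact match is an assumed accuracy measure)."""
    if not ground_truth:
        return 0.0
    hits = sum(1 for key, value in ground_truth.items() if predicted.get(key) == value)
    return hits / len(ground_truth)


def oracle_accuracy(per_model_pairs, ground_truth):
    """Aggregate the per-model accuracies; the mean is an assumed aggregation."""
    return mean(combined_accuracy(pairs, ground_truth) for pairs in per_model_pairs)


def is_training_image(per_model_pairs, ground_truth, accuracy_threshold):
    """Select the candidate only when its oracle accuracy exceeds the threshold."""
    return oracle_accuracy(per_model_pairs, ground_truth) > accuracy_threshold


def reconstruct(weights, features):
    """Tied-weight linear autoencoder with one hidden unit (assumed architecture)."""
    hidden = sum(w * x for w, x in zip(weights, features))
    return [w * hidden for w in weights]


def reconstruction_error(weights, features):
    """Squared error between the features and their reconstruction.
    The same quantity is used here as the anomaly score (an assumption)."""
    rebuilt = reconstruct(weights, features)
    return sum((r - x) ** 2 for r, x in zip(rebuilt, features))


def update_weights(weights, features, learning_rate=0.01):
    """One gradient-descent step on the reconstruction error."""
    hidden = sum(w * x for w, x in zip(weights, features))
    residual = [w * hidden - x for w, x in zip(weights, features)]
    residual_dot_w = sum(r * w for r, w in zip(residual, weights))
    grads = [2 * (r * hidden + residual_dot_w * x) for r, x in zip(residual, features)]
    return [w - learning_rate * g for w, g in zip(weights, grads)]


def train(training_features, weights, epochs=200):
    """Repeatedly update the weights on features of the selected training images."""
    for _ in range(epochs):
        for features in training_features:
            weights = update_weights(weights, features)
    return weights


if __name__ == "__main__":
    ground_truth = {"total": "10.00", "date": "2021-01-05"}
    model_outputs = [
        {"total": "10.00", "date": "2021-01-05"},
        {"total": "10.00", "date": "2021-05-01"},
    ]
    print(is_training_image(model_outputs, ground_truth, accuracy_threshold=0.7))

    weights = train([[1.0, 1.0], [2.0, 2.0], [0.5, 0.5], [1.5, 1.5]], [0.5, 0.1])
    print(reconstruction_error(weights, [1.0, 1.0]))   # resembles the training data
    print(reconstruction_error(weights, [1.0, -1.0]))  # unlike the training data
```

In this sketch, a document whose features resemble the training data reconstructs with a small error, while one that departs from it reconstructs poorly and so receives a higher anomaly score.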