CPC G06V 30/413 (2022.01) [G06V 10/467 (2022.01); G06V 10/761 (2022.01); G06V 10/774 (2022.01); G06V 30/412 (2022.01)] | 15 Claims |
1. A method for determining authenticity of a document comprising the steps of:
storing images of authentic documents in an image database as subsets of images defined based on the similarity of text content between the images;
receiving, by an electronic device, an image of a document;
assigning a label to the received image;
obtaining a low dimensionality vector for each image in one of the subsets, wherein each image in the one subset is encoded into the respective low dimensionality vector using a trained machine learning model and is assigned the same label as the received image;
encoding the received image into a low dimensionality vector;
calculating a distance between the low dimensionality vector of the received image and each obtained low dimensionality vector, wherein each distance represents the similarity in appearance between the received image and a respective image in the one subset, and wherein the smaller the distance the greater the similarity between the images;
comparing each of the calculated distances against a threshold distance;
calculating a number of the calculated distances that are less than or equal to the threshold distance;
in response to determining the calculated number is at least equal to a required number, determining the document in the received image is authentic; and
in response to determining the calculated number is less than the required number, determining the received image requires manual review.
|
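The distance, threshold, and counting steps of claim 1 can be illustrated with a minimal sketch. It assumes the low dimensionality vectors have already been produced by the trained machine learning model, and it assumes Euclidean distance, which the claim does not specify. The names `decide_authenticity`, `received_vec`, `subset_vecs`, `threshold`, and `required` are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def decide_authenticity(
    received_vec: Sequence[float],
    subset_vecs: Sequence[Sequence[float]],
    threshold: float,
    required: int,
) -> str:
    """Return "authentic" or "manual_review" for a received image vector.

    Assumption: Euclidean distance stands in for the unspecified
    distance measure of the claim.
    """
    # One distance per image in the subset that shares the received image's label.
    distances = [math.dist(received_vec, vec) for vec in subset_vecs]

    # Count the distances that are less than or equal to the threshold distance.
    close_count = sum(1 for d in distances if d <= threshold)

    # At least the required number of close matches means authentic;
    # otherwise the received image goes to manual review.
    return "authentic" if close_count >= required else "manual_review"


# Illustrative values only: the distances are 1.0, 5.0 and 0.5.
subset = [[0.0, 1.0], [3.0, 4.0], [0.5, 0.0]]
result_two = decide_authenticity([0.0, 0.0], subset, threshold=1.0, required=2)
result_three = decide_authenticity([0.0, 0.0], subset, threshold=1.0, required=3)
```

With these illustrative values, two of the three distances fall within the threshold, so `result_two` is `"authentic"` and `result_three` is `"manual_review"`.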