CPC G06V 30/412 (2022.01) [G06F 18/214 (2023.01); G06F 40/284 (2020.01); G06V 30/414 (2022.01)] | 21 Claims |
1. A method comprising:
receiving a training data set comprising a plurality of document images, wherein each document image of the plurality of document images is associated with respective metadata identifying a document field containing a variable text;
generating, by processing the plurality of document images, a first heat map represented by a data structure comprising a plurality of heat map elements corresponding to a plurality of document image pixels, wherein each heat map element stores a counter of a number of document images in which the document field contains a document image pixel associated with the heat map element;
receiving an input document image; and
identifying, within the input document image, a candidate region comprising the document field, wherein the candidate region comprises a plurality of input document image pixels corresponding to heat map elements satisfying a threshold condition.
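The heat map counting and thresholding steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes that the metadata gives each field as an axis-aligned box with exclusive right and bottom edges, that all images share one pixel grid, that the threshold condition is "counter is at least a given value", and that the candidate region is the bounding box of the qualifying pixels. The names `build_heat_map` and `candidate_region` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed field representation: (left, top, right, bottom), right/bottom exclusive.
Box = Tuple[int, int, int, int]


def build_heat_map(field_boxes: List[Box], width: int, height: int) -> List[List[int]]:
    """Count, per pixel, the training images whose field box covers that pixel.

    field_boxes holds one box per training image (an assumption of this sketch).
    """
    heat = [[0] * width for _ in range(height)]
    for left, top, right, bottom in field_boxes:
        # Clip the box to the image bounds before incrementing the counters.
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                heat[y][x] += 1
    return heat


def candidate_region(heat: List[List[int]], threshold: int) -> Optional[Box]:
    """Return the bounding box of pixels whose counter is >= threshold, or None."""
    hits = [
        (x, y)
        for y, row in enumerate(heat)
        for x, count in enumerate(row)
        if count >= threshold
    ]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Three training images, each with one marked field on a 6 x 4 pixel grid.
boxes = [(1, 1, 4, 3), (2, 1, 5, 3), (2, 0, 4, 2)]
heat = build_heat_map(boxes, width=6, height=4)
region = candidate_region(heat, threshold=3)
```

In this sketch a higher threshold shrinks the candidate region toward the pixels that the field covers in most training images, while a lower threshold widens it.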
|
21. A method comprising:
receiving a training data set comprising a plurality of documents, wherein each document of the plurality of documents is associated with a plurality of user marked fields;
for a given field of the plurality of user marked fields in a given document of the plurality of documents, determining whether a particular combination, existing on the given document, of relative positions of additional one or more user marked fields relative to the given field is repeated on one or more additional documents;
responsive to determining that the particular combination is not repeated on any additional documents, designating the given field as being marked incorrectly; and
responsive to determining that the particular combination is repeated on one or more additional documents, determining whether a different combination of relative positions of the additional one or more user marked fields relative to the given field exists on two or more other documents, wherein:
responsive to determining that the different combination does not exist on two or more other documents, designating the given field as being marked correctly; and
responsive to determining that the different combination exists on two or more other documents, designating the given field as being marked inconsistently.
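The three-way designation recited in claim 21 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation. Each marked field is assumed to be reduced to a single (x, y) point, a "combination of relative positions" is assumed to be the set of coarse left/right and above/below relations of the other fields to the given field, and "exists on two or more other documents" is read as one single different combination shared by at least two other documents. The names `combination` and `classify_field` and the tolerance `tol` are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple

Point = Tuple[float, float]
Document = Dict[str, Point]  # assumed: field name -> (x, y) of the user marking
Combination = FrozenSet[Tuple[str, int, int]]


def _sign(delta: float, tol: float) -> int:
    """Quantize an offset to -1, 0 or +1, treating |delta| <= tol as aligned."""
    if delta > tol:
        return 1
    if delta < -tol:
        return -1
    return 0


def combination(doc: Document, field: str, tol: float = 5.0) -> Combination:
    """Coarse relative positions of every other marked field to the given field."""
    fx, fy = doc[field]
    return frozenset(
        (name, _sign(x - fx, tol), _sign(y - fy, tol))
        for name, (x, y) in doc.items()
        if name != field
    )


def classify_field(docs: List[Document], index: int, field: str, tol: float = 5.0) -> str:
    """Designate docs[index][field] as 'incorrect', 'correct' or 'inconsistent'."""
    target = combination(docs[index], field, tol)
    # Tally the combinations observed for the same field on the other documents.
    others = Counter(
        combination(d, field, tol)
        for i, d in enumerate(docs)
        if i != index and field in d
    )
    if others[target] == 0:
        return "incorrect"  # the combination is not repeated anywhere else
    if any(c != target and n >= 2 for c, n in others.items()):
        return "inconsistent"  # a competing layout is itself repeated
    return "correct"


a = {"total": (100, 200), "date": (100, 50), "vendor": (10, 50)}
b = {"total": (102, 210), "date": (101, 60), "vendor": (12, 55)}
c = {"total": (100, 40), "date": (100, 200), "vendor": (10, 200)}
d = {"total": (98, 45), "date": (99, 210), "vendor": (12, 205)}
e = {"total": (300, 300), "date": (10, 300), "vendor": (10, 10)}
```

With these sample documents, `classify_field([a, b], 0, "total")` yields `"correct"`, adding `c` and `d` makes the same field `"inconsistent"`, and the lone layout `e` among `a` and `b` is designated `"incorrect"`.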
|