US 11,861,925 B2
Methods and systems of field detection in a document
Stanislav Semenov, Moscow (RU); and Mikhail Lanin, Moscow (RU)
Assigned to ABBYY Development Inc., Raleigh, NC (US)
Filed by ABBYY Development Inc., Dover, DE (US)
Filed on Dec. 21, 2020, as Appl. No. 17/129,906.
Claims priority of application No. RU2020141790 (RU), filed on Dec. 17, 2020.
Prior Publication US 2022/0198182 A1, Jun. 23, 2022
Int. Cl. G06K 9/00 (2022.01); G06F 40/284 (2020.01); G06K 9/62 (2022.01); G06V 30/412 (2022.01); G06V 30/414 (2022.01); G06F 18/214 (2023.01)
CPC G06V 30/412 (2022.01) [G06F 18/214 (2023.01); G06F 40/284 (2020.01); G06V 30/414 (2022.01)]
21 Claims
 
1. A method comprising:
receiving a training data set comprising a plurality of document images, wherein each document image of the plurality of document images is associated with respective metadata identifying a document field containing a variable text;
generating, by processing the plurality of document images, a first heat map represented by a data structure comprising a plurality of heat map elements corresponding to a plurality of document image pixels, wherein each heat map element stores a counter of a number of document images in which the document field contains a document image pixel associated with the heat map element;
receiving an input document image; and
identifying, within the input document image, a candidate region comprising the document field, wherein the candidate region comprises a plurality of input document image pixels corresponding to heat map elements satisfying a threshold condition.
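
The heat map of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each training image's metadata reduces to one axis-aligned bounding box for the field, that all images share the same pixel grid, and that the threshold condition is a minimum counter value. The names `build_heat_map`, `candidate_region`, and `min_count` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed box layout: (left, top, right, bottom), right/bottom exclusive.
Box = Tuple[int, int, int, int]


def build_heat_map(field_boxes: List[Box], width: int, height: int) -> List[List[int]]:
    """Count, per pixel, how many training images have the field covering it.

    field_boxes holds one box per training image (an assumption made for
    this sketch; the claim only requires metadata identifying the field).
    """
    heat = [[0] * width for _ in range(height)]
    for left, top, right, bottom in field_boxes:
        # Clip each box to the image so out-of-range metadata cannot fail.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                heat[y][x] += 1
    return heat


def candidate_region(heat: List[List[int]], min_count: int) -> Optional[Box]:
    """Return the bounding box of pixels whose counter is >= min_count.

    The threshold condition is assumed to be a simple lower bound.
    Returns None when no heat map element satisfies it.
    """
    hits = [
        (x, y)
        for y, row in enumerate(heat)
        for x, count in enumerate(row)
        if count >= min_count
    ]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Three training images whose field boxes overlap only partially.
boxes = [(1, 1, 4, 3), (2, 1, 5, 3), (2, 2, 4, 4)]
heat_map = build_heat_map(boxes, width=6, height=5)
region = candidate_region(heat_map, min_count=3)
```

Raising `min_count` shrinks the candidate region toward the pixels on which the training images agree, and lowering it widens the search area in the input document image.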
 
21. A method comprising:
receiving a training data set comprising a plurality of documents, wherein each document of the plurality of documents is associated with a plurality of user marked fields;
for a given field of the plurality of user marked fields in a given document of the plurality of documents, determining whether a particular combination, existing on the given document, of relative positions of additional one or more user marked fields relative to the given field is repeated on one or more additional documents;
responsive to determining that the particular combination is not repeated on any additional documents, designating the given field as being marked incorrectly; and
responsive to determining that the particular combination is repeated on one or more additional documents, determining whether a different combination of relative positions of the additional one or more user marked fields relative to the given field exists on two or more other documents, wherein:
responsive to determining that the different combination does not exist on two or more other documents, designating the given field as being marked correctly; and
responsive to determining that the different combination exists on two or more other documents, designating the given field as being marked inconsistently.
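
The decision logic of claim 21 can be sketched as follows. This is one possible reading, not the patented implementation. It assumes that each document is a mapping from field name to a single (x, y) position, and that a "combination of relative positions" is the set of coarse directions (left/right, above/below) from the given field to every other marked field. The names `combination` and `classify_marking` and the three result labels are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple

Point = Tuple[int, int]
Document = Dict[str, Point]  # field name -> marked position (assumed layout)
Combination = FrozenSet[Tuple[str, int, int]]


def _sign(value: int) -> int:
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def combination(doc: Document, field: str) -> Combination:
    """Coarse relative positions of all other fields with respect to `field`."""
    fx, fy = doc[field]
    return frozenset(
        (name, _sign(x - fx), _sign(y - fy))
        for name, (x, y) in doc.items()
        if name != field
    )


def classify_marking(docs: List[Document], doc_index: int, field: str) -> str:
    """Label the marking of `field` in docs[doc_index]."""
    given = combination(docs[doc_index], field)
    others = [
        combination(doc, field)
        for i, doc in enumerate(docs)
        if i != doc_index and field in doc
    ]
    # Combination not repeated on any additional document.
    if given not in others:
        return "incorrect"
    # Combination repeated: look for one different combination that
    # appears on two or more other documents.
    different = Counter(c for c in others if c != given)
    if any(count >= 2 for count in different.values()):
        return "inconsistent"
    return "correct"


layout_a1 = {"total": (10, 10), "date": (0, 0), "vendor": (0, 10)}
layout_a2 = {"total": (12, 11), "date": (1, 1), "vendor": (2, 11)}
layout_b1 = {"total": (0, 0), "date": (5, 5), "vendor": (5, 0)}
layout_b2 = {"total": (1, 1), "date": (6, 7), "vendor": (4, 1)}

three_docs = [layout_a1, layout_a2, layout_b1]
four_docs = [layout_a1, layout_a2, layout_b1, layout_b2]
```

With `three_docs`, the "total" field of the first document matches the second document and no competing layout recurs, while the third document's layout is unique. Adding `layout_b2` makes the competing layout appear twice, which triggers the inconsistent branch for the first document.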