US 12,033,412 B2
Systems and methods for extracting information from a physical document
Rakesh Iyer, Santa Clara, CA (US); and Lisha Ruan, New York, NY (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Appl. No. 17/291,647
Filed by Google LLC, Mountain View, CA (US)
PCT Filed Jan. 28, 2019, PCT No. PCT/US2019/015335
§ 371(c)(1), (2) Date May 6, 2021,
PCT Pub. No. WO2020/096635, PCT Pub. Date May 14, 2020.
Claims priority of provisional application 62/756,262, filed on Nov. 6, 2018.
Prior Publication US 2021/0406451 A1, Dec. 30, 2021
Int. Cl. G06V 30/40 (2022.01); G06F 40/169 (2020.01); G06V 30/10 (2022.01)
CPC G06V 30/40 (2022.01) [G06F 40/169 (2020.01); G06V 30/10 (2022.01)] 19 Claims
 
1. A computer-implemented method for extracting information from documents, the method comprising:
obtaining, at a computing system comprising one or more processors, data representing one or more units of text extracted from an image of a document;
determining, by the computing system, one or more annotated values from the one or more units of text;
determining, by the computing system, a label for each annotated value of the one or more annotated values, wherein the label for each annotated value comprises a key that explains the annotated value, and wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value, a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document;
wherein determining, by the computing system, the label for each annotated value comprises:
determining, by the computing system based on the search, a set of one or more candidate labels for each annotated value, wherein preference is given to candidate labels that satisfy relative location characteristics, wherein the relative location characteristics comprise:
inclusion in a left-side region or a top-side region relative to the location associated with the annotated value in a coordinate space of the document when a language associated with the document is a Left-to-Right (LTR) language, or
inclusion in a right-side region or the top-side region relative to the location associated with the annotated value in the coordinate space of the document when the language associated with the document is a Right-to-Left (RTL) language; and
determining, by the computing system, a canonical label for each annotated value based at least in part on the set of one or more candidate labels associated with the annotated value; and
mapping, by the computing system, at least one annotated value from the one or more annotated values to an action that is presented to a user based at least in part on the label associated with the at least one annotated value.
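
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it only assumes that each unit of text arrives with a bounding box in a coordinate space where y grows downward. The names TextUnit, CANONICAL, ACTIONS, annotate, candidate_labels, canonical_label and map_actions, the two regular expressions, the sample label vocabulary, and the center-distance tie-breaker are all hypothetical and chosen for illustration.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class TextUnit:
    """One unit of extracted text with its bounding box (hypothetical shape)."""
    text: str
    x: float  # left edge
    y: float  # top edge; y grows downward
    w: float
    h: float

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


# Assumed vocabularies: raw label text -> canonical label -> user-facing action.
CANONICAL = {"due date": "due_date", "pay by": "due_date",
             "total": "amount_due", "amount due": "amount_due"}
ACTIONS = {"due_date": "create_calendar_reminder",
           "amount_due": "open_payment_flow"}

DATE_RE = re.compile(r"\d{1,2}/\d{1,2}/\d{4}")
MONEY_RE = re.compile(r"\$\d+(?:\.\d{2})?")


def annotate(units):
    """Treat units that look like a date or a currency amount as annotated values."""
    return [u for u in units
            if DATE_RE.fullmatch(u.text) or MONEY_RE.fullmatch(u.text)]


def candidate_labels(value, units, rtl=False):
    """Rank the other units as candidate labels for `value`.

    Candidates in the preferred region come first: left side or top side for
    LTR documents, right side or top side for RTL documents. Within each
    group, closer candidates (by center distance) rank higher.
    """
    ranked = []
    for u in units:
        if u is value:
            continue
        same_row = abs(u.y - value.y) < value.h
        x_overlap = u.x < value.x + value.w and value.x < u.x + u.w
        if rtl:
            beside = same_row and u.x >= value.x + value.w
        else:
            beside = same_row and u.x + u.w <= value.x
        above = x_overlap and u.y + u.h <= value.y
        preferred = beside or above
        distance = math.hypot(u.center[0] - value.center[0],
                              u.center[1] - value.center[1])
        ranked.append((not preferred, distance, u))
    ranked.sort(key=lambda item: (item[0], item[1]))
    return [u for _, _, u in ranked]


def canonical_label(candidates):
    """Return the canonical label of the best-ranked recognizable candidate."""
    for c in candidates:
        key = c.text.strip().rstrip(":").lower()
        if key in CANONICAL:
            return CANONICAL[key]
    return None


def map_actions(units, rtl=False):
    """Map each annotated value to (canonical label, action) when one is known."""
    result = {}
    for value in annotate(units):
        label = canonical_label(candidate_labels(value, units, rtl))
        if label in ACTIONS:
            result[value.text] = (label, ACTIONS[label])
    return result


if __name__ == "__main__":
    page = [
        TextUnit("Due date:", 10, 10, 60, 10),
        TextUnit("03/15/2019", 80, 10, 70, 10),
        TextUnit("Total", 10, 40, 40, 10),
        TextUnit("$42.00", 80, 40, 50, 10),
    ]
    print(map_actions(page))
```

In this sketch the relative location characteristics act as a ranking preference rather than a hard filter, which matches the claim wording "preference is given to": a candidate outside the preferred region can still be selected if no preferred candidate carries a recognizable label. On the sample page, the date is paired with the label to its left and the amount with "Total", because the nearer unit above the amount is not a known label.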