US 12,033,412 B2
Systems and methods for extracting information from a physical document
Rakesh Iyer, Santa Clara, CA (US); and Lisha Ruan, New York, NY (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Appl. No. 17/291,647
Filed by Google LLC, Mountain View, CA (US)
PCT Filed Jan. 28, 2019, PCT No. PCT/US2019/015335 § 371(c)(1), (2) Date May 6, 2021, PCT Pub. No. WO2020/096635, PCT Pub. Date May 14, 2020.
Claims priority of provisional application 62/756,262, filed on Nov. 6, 2018.
Prior Publication US 2021/0406451 A1, Dec. 30, 2021
Int. Cl. G06V 30/40 (2022.01); G06F 40/169 (2020.01); G06V 30/10 (2022.01)

CPC G06V 30/40 (2022.01) [G06F 40/169 (2020.01); G06V 30/10 (2022.01)]

19 Claims

1. A computer-implemented method for extracting information from documents, the method comprising:

obtaining, at a computing system comprising one or more processors, data representing one or more units of text extracted from an image of a document;

determining, by the computing system, one or more annotated values from the one or more units of text;

determining, by the computing system, a label for each annotated value of the one or more annotated values, wherein the label for each annotated value comprises a key that explains the annotated value, and wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value, a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document;

wherein determining, by the computing system, the label for each annotated value comprises:

determining, by the computing system based on the search, a set of one or more candidate labels for each annotated value, wherein preference is given to candidate labels that satisfy relative location characteristics, wherein the relative location characteristics comprise:

inclusion in a left-side region or a top-side region relative to the location associated with the annotated value in a coordinate space of the document when a language associated with the document is a Left-to-Right (LTR) language, or

inclusion in a right-side region or the top-side region relative to the location associated with the annotated value in the coordinate space of the document when the language associated with the document is a Right-to-Left (RTL) language; and

determining, by the computing system, a canonical label for each annotated value based at least in part on the set of one or more candidate labels associated with the annotated value; and

mapping, by the computing system, at least one annotated value from the one or more annotated values to an action that is presented to a user based at least in part on the label associated with the at least one annotated value.
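
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it only shows one way a location-based label search with an LTR/RTL preference might be arranged. The names `TextUnit`, `candidate_labels`, `canonical_label`, `SYNONYMS`, and `ACTIONS`, the distance and alignment thresholds, and the sample label and action strings are all hypothetical and do not appear in the claim. The sketch assumes a coordinate space in which y increases downward, so a smaller y means "above".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TextUnit:
    """One unit of text extracted from the document image (hypothetical type)."""
    text: str
    x: float  # left edge in document coordinates
    y: float  # top edge; y is assumed to increase downward


# Hypothetical lookup tables; the claim does not specify their contents.
SYNONYMS = {"amount due": "amount_due", "total due": "amount_due",
            "due date": "due_date"}
ACTIONS = {"amount_due": "pay_bill", "due_date": "create_reminder"}


def candidate_labels(value: TextUnit, units: List[TextUnit], rtl: bool = False,
                     max_dist: float = 200.0, tol: float = 10.0) -> List[TextUnit]:
    """Search near the value and rank candidates, preferred regions first."""
    scored = []
    for unit in units:
        if unit is value:
            continue
        dx, dy = unit.x - value.x, unit.y - value.y
        dist = (dx * dx + dy * dy) ** 0.5
        if dist > max_dist:
            continue
        # Side region: same row, to the left for LTR or to the right for RTL.
        on_side = abs(dy) < tol and (dx > 0 if rtl else dx < 0)
        # Top-side region: same column and above the value.
        above = abs(dx) < tol and dy < 0
        scored.append((not (on_side or above), dist, unit))
    scored.sort(key=lambda item: (item[0], item[1]))
    return [unit for _, _, unit in scored]


def canonical_label(candidates: List[TextUnit]) -> Optional[str]:
    """Return the canonical label of the best-ranked recognizable candidate."""
    for unit in candidates:
        key = unit.text.strip().rstrip(":").lower()
        if key in SYNONYMS:
            return SYNONYMS[key]
    return None


units = [TextUnit("Amount Due:", 10, 100),
         TextUnit("$42.00", 120, 100),
         TextUnit("Thank you", 120, 130)]
value = units[1]  # stands in for an annotated value such as a currency amount
label = canonical_label(candidate_labels(value, units))
print(label, ACTIONS.get(label))  # amount_due pay_bill
```

In this sketch "Thank you" is physically closer to the value than "Amount Due:", but it lies below the value, so the left-side candidate is ranked first for an LTR document. Passing `rtl=True` flips the side preference to the right-side region while keeping the top-side preference unchanged.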