US 11,914,567 B2
Text-based machine learning extraction of table data from a read-only document
Hongyang Yu, Wentworth Point (AU); Hanieh Borhanazad, Sydney (AU); and Sandip Mandlecha, Pune (IN)
Assigned to Coupa Software Incorporated, San Mateo, CA (US)
Filed by COUPA SOFTWARE INCORPORATED, San Mateo, CA (US)
Filed on Oct. 25, 2022, as Appl. No. 17/973,511.
Application 17/973,511 is a continuation of application No. 17/074,957, filed on Oct. 20, 2020, granted, now 11,500,843.
Claims priority of application No. 202011037847 (IN), filed on Sep. 2, 2020.
Prior Publication US 2023/0049389 A1, Feb. 16, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/22 (2019.01); G06F 16/93 (2019.01); G06N 3/04 (2023.01); G06K 9/62 (2022.01); G06V 30/412 (2022.01); G06V 30/414 (2022.01); G06F 18/213 (2023.01); G06N 3/045 (2023.01)
CPC G06F 16/2282 (2019.01) [G06F 16/93 (2019.01); G06F 18/213 (2023.01); G06N 3/045 (2023.01); G06V 30/412 (2022.01); G06V 30/414 (2022.01)] 20 Claims
1. A computer-implemented method, comprising:
extracting text rectangle data from a digital electronic document;
converting the text rectangle data to a feature map that indicates text rectangle-level numerical data and spatial locations of text rectangles in the document;
by at least one convolutional neural network, processing the text rectangle-level numerical data to produce table-level numerical data including a spatial location of a table portion of the document, probabilities of text rectangles belonging to row canonicals, and probabilities of text rectangles belonging to column canonicals;
formatting and storing the table-level numerical data in a searchable, editable data record;
wherein the method is performed by one or more computing devices.
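
As an illustration only, the second step of claim 1 (converting text rectangle data to a feature map that carries both rectangle-level numerical data and spatial locations) might be sketched as follows. The claim does not specify the grid resolution, the feature channels, or the rasterization rule, so `TextRect`, `rect_features`, `to_feature_map`, the three channels (occupancy, digit ratio, text length), and the 4x4 grid are all assumptions made for this sketch. The convolutional neural network that consumes the feature map is not shown.

```python
import math
from dataclasses import dataclass


@dataclass
class TextRect:
    """A text rectangle: bounding box in page units plus its text (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str


def rect_features(rect):
    """Assumed rectangle-level numerical data: [occupied, digit ratio, text length]."""
    n = len(rect.text)
    digits = sum(ch.isdigit() for ch in rect.text)
    return [1.0, digits / n if n else 0.0, float(n)]


def to_feature_map(rects, page_w, page_h, grid_w=4, grid_h=4):
    """Rasterize rectangles onto a grid_h x grid_w grid with 3 channels per cell.

    Each cell covered by a rectangle receives that rectangle's feature vector,
    so a cell's position in the grid encodes the rectangle's spatial location.
    """
    fmap = [[[0.0, 0.0, 0.0] for _ in range(grid_w)] for _ in range(grid_h)]
    for r in rects:
        feats = rect_features(r)
        # Treat each box as half-open so a right/bottom edge on a cell boundary
        # does not spill into the neighbouring cell.
        c0 = min(int(r.x0 / page_w * grid_w), grid_w - 1)
        c1 = max(c0, min(math.ceil(r.x1 / page_w * grid_w) - 1, grid_w - 1))
        r0 = min(int(r.y0 / page_h * grid_h), grid_h - 1)
        r1 = max(r0, min(math.ceil(r.y1 / page_h * grid_h) - 1, grid_h - 1))
        for row in range(r0, r1 + 1):
            for col in range(c0, c1 + 1):
                fmap[row][col] = list(feats)
    return fmap


# A made-up two-row, two-column table on a 100 x 40 page.
rects = [
    TextRect(0, 0, 50, 10, "Qty"),
    TextRect(50, 0, 100, 10, "Price"),
    TextRect(0, 10, 50, 20, "2"),
    TextRect(50, 10, 100, 20, "9.99"),
]
fmap = to_feature_map(rects, page_w=100, page_h=40)
```

In this sketch the header cells have a digit ratio of 0 and the body cells a high one, which is the kind of rectangle-level signal a downstream network could use when estimating table location and row and column membership.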