US 12,462,090 B2
Automatically extracting tabular data included within a source document
Badri Nath, Edison, NJ (US); Vijayendra Mysore Shamanna, Cupertino, CA (US); Shaik Kamran Moinuddin, Bangalore (IN); Henry Thomas Peter, Mountain House, CA (US); and Simha Sadasiva, San Jose, CA (US)
Assigned to Ushur, Inc., Santa Clara, CA (US)
Filed by Ushur, Inc., Santa Clara, CA (US)
Filed on Jul. 28, 2023, as Appl. No. 18/361,700.
Prior Publication US 2025/0036851 A1, Jan. 30, 2025
Int. Cl. G06F 40/103 (2020.01); G06F 16/3329 (2025.01); G06F 40/177 (2020.01); G06V 30/412 (2022.01); G06V 30/413 (2022.01)
CPC G06F 40/103 (2020.01) [G06F 16/3329 (2019.01); G06F 40/177 (2020.01); G06V 30/412 (2022.01); G06V 30/413 (2022.01)]
20 Claims
1. A computer-implemented method for extracting tabular data included in a source document, the method comprising:
receiving the source document as an input to a document classifier;
receiving a set of desired keywords provided by a business enterprise;
determining, by the document classifier and in response to receiving the source document, a type of the source document;
identifying, based on the determined type of the source document, a plurality of regions containing the tabular data in the source document, wherein the plurality of regions comprises at least a first region that includes one or more extracted headers and at least a second region that includes values corresponding to the one or more extracted headers;
augmenting the one or more extracted headers and the values with spatial words that describe spatial relationship between the extracted headers and the values;
using a natural language model to answer queries formulated using the spatial words and the augmented one or more extracted headers;
associating values with respective extracted headers using the answers to the queries to generate an output; and
formatting the output, wherein the formatted output presents values associated with one or more desired keywords from the set of desired keywords provided by the business enterprise.
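
The augmenting, querying, associating, and formatting steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the document classifier and region identification have already produced header and value cells with grid positions, and it replaces the natural language model with a trivial string matcher (toy_model) so the example runs on its own. All names here (Cell, spatial_word, augment, build_query, toy_model, extract) and the sample table are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    text: str
    row: int
    col: int


def spatial_word(header: Cell, value: Cell) -> str:
    """Return a word describing where `value` sits relative to `header`."""
    if value.col == header.col and value.row > header.row:
        return "below"
    if value.row == header.row and value.col > header.col:
        return "right of"
    return "near"  # no clear alignment; treated as unrelated in this sketch


def augment(headers: list, values: list) -> list:
    """Build one sentence per aligned (value, header) pair using spatial words."""
    sentences = []
    for h in headers:
        for v in values:
            word = spatial_word(h, v)
            if word != "near":
                sentences.append(f"{v.text} is {word} {h.text}.")
    return sentences


def build_query(header: Cell, word: str) -> str:
    """Formulate a query from a spatial word and a header."""
    return f"What is {word} {header.text}?"


def toy_model(query: str, context: list) -> list:
    """Assumed stand-in for a natural language model: plain suffix matching."""
    relation = query[len("What is "):-1]  # e.g. "below Premium"
    suffix = f" is {relation}."
    return [s[:-len(suffix)] for s in context if s.endswith(suffix)]


def extract(headers: list, values: list, desired_keywords: set) -> dict:
    """Associate values with headers and keep only the desired keywords."""
    context = augment(headers, values)
    output = {}
    for h in headers:
        if h.text not in desired_keywords:
            continue
        answers = []
        for word in ("below", "right of"):
            answers += toy_model(build_query(h, word), context)
        output[h.text] = answers
    return output


# Hypothetical two-column table: headers in row 0, values in rows 1 and 2.
headers = [Cell("Plan", 0, 0), Cell("Premium", 0, 1)]
values = [
    Cell("Gold", 1, 0), Cell("120.00", 1, 1),
    Cell("Silver", 2, 0), Cell("95.50", 2, 1),
]
result = extract(headers, values, desired_keywords={"Premium"})
print(result)
```

In a fuller system the sentences from augment would be passed as context to a question-answering model in place of toy_model, and the dictionary returned by extract would be rendered in whatever output format the business enterprise requires.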