US 12,315,280 B2
Systems and methods for detection and extraction of borderless checkbox tables
Mohamed Mahdi Alouane, Toronto (CA); Shyam Subramanian, Norwood, MA (US); and Hui Su, West Roxbury, MA (US)
Assigned to FMR LLC, Boston, MA (US)
Filed by FMR LLC, Boston, MA (US)
Filed on Aug. 31, 2022, as Appl. No. 17/900,077.
Prior Publication US 2024/0071119 A1, Feb. 29, 2024
Int. Cl. G06K 9/00 (2022.01); G06F 40/177 (2020.01); G06V 30/412 (2022.01); G06V 30/413 (2022.01); G06V 30/416 (2022.01)
CPC G06V 30/412 (2022.01) [G06F 40/177 (2020.01); G06V 30/413 (2022.01); G06V 30/416 (2022.01)]
20 Claims
 
1. A computerized method for extracting borderless checkbox tables from electronic documents, the method comprising:
detecting, by a server computing device, a plurality of checkboxes in a textual electronic document;
extracting, by the server computing device, a plurality of text blocks from the textual electronic document;
identifying, by the server computing device, one or more table headers corresponding to at least one borderless checkbox table in the textual electronic document based on the text blocks;
determining, by the server computing device, a table boundary corresponding to the at least one borderless checkbox table based on the table headers;
identifying, by the server computing device, a plurality of table rows and a plurality of table columns corresponding to the at least one borderless checkbox table based on the table boundary and the plurality of checkboxes;
identifying, by the server computing device, a plurality of table cells corresponding to the at least one borderless checkbox table based on the plurality of table rows and the plurality of table columns; and
generating, by the server computing device, a data structure comprising data representing the at least one borderless checkbox table based on at least the plurality of table cells and the plurality of checkboxes.
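
For illustration only, the steps recited in claim 1 can be sketched as a small geometric pipeline. This is a minimal sketch under stated assumptions, not the patented implementation: the claim does not say how checkboxes are detected, how headers are recognized, or how rows and columns are derived. The sketch assumes checkboxes and text blocks arrive with bounding boxes from an upstream detector or OCR step, that headers are matched against a caller-supplied vocabulary, that the table boundary reduces to the bottom edge of the header row and the left edge of the checkbox grid, and that rows are found by clustering checkbox centres. All names (`Box`, `Checkbox`, `TextBlock`, `cluster`, `extract_checkbox_table`, `header_words`, `tol`) are hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float  # left edge
    y: float  # top edge (y grows downward, as in image coordinates)
    w: float
    h: float

    @property
    def cx(self) -> float:
        return self.x + self.w / 2

    @property
    def cy(self) -> float:
        return self.y + self.h / 2


@dataclass(frozen=True)
class Checkbox:
    box: Box
    checked: bool


@dataclass(frozen=True)
class TextBlock:
    box: Box
    text: str


def cluster(values: list[float], tol: float) -> list[float]:
    """Group 1-D coordinates whose consecutive gaps are <= tol; return group means."""
    centers: list[float] = []
    group: list[float] = []
    for v in sorted(values):
        if group and v - group[-1] > tol:
            centers.append(sum(group) / len(group))
            group = []
        group.append(v)
    if group:
        centers.append(sum(group) / len(group))
    return centers


def extract_checkbox_table(checkboxes, text_blocks, header_words, tol=5.0):
    # Headers: text blocks matching an assumed header vocabulary, ordered left to right.
    headers = sorted(
        (t for t in text_blocks if t.text.strip().lower() in header_words),
        key=lambda t: t.box.cx,
    )
    if not headers:
        return []
    # Boundary (simplified): the table starts below the header row.
    top = max(h.box.y + h.box.h for h in headers)
    inside = [c for c in checkboxes if c.box.cy > top]
    if not inside:
        return []
    left = min(c.box.x for c in inside)
    # Rows: clustered checkbox y-centres. Columns: one per header.
    row_ys = cluster([c.box.cy for c in inside], tol)
    rows = [{"label": "", "cells": {h.text: None for h in headers}} for _ in row_ys]

    def nearest_row(y: float) -> int:
        return min(range(len(row_ys)), key=lambda i: abs(row_ys[i] - y))

    # Cells: each checkbox goes to its nearest row and nearest header column.
    for c in inside:
        col = min(headers, key=lambda h: abs(h.box.cx - c.box.cx))
        rows[nearest_row(c.box.cy)]["cells"][col.text] = c.checked
    # Row labels: non-header text left of the checkbox grid, aligned with a row.
    for t in text_blocks:
        if t not in headers and t.box.cy > top and t.box.x + t.box.w <= left:
            r = nearest_row(t.box.cy)
            if abs(row_ys[r] - t.box.cy) <= tol:
                rows[r]["label"] = (rows[r]["label"] + " " + t.text).strip()
    return rows
```

The returned list of dictionaries stands in for the claimed data structure: one entry per row, with a label and a mapping from column header to checkbox state (`None` where no checkbox was found). A production system would likely need more robust header detection and boundary logic than this vocabulary match and single threshold.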