| CPC G06V 30/412 (2022.01) [G06F 40/177 (2020.01); G06V 30/413 (2022.01); G06V 30/416 (2022.01)] | 20 Claims |

|
1. A computerized method for extracting borderless checkbox tables from electronic documents, the method comprising:
detecting, by a server computing device, a plurality of checkboxes in a textual electronic document;
extracting, by the server computing device, a plurality of text blocks from the textual electronic document;
identifying, by the server computing device, one or more table headers corresponding to at least one borderless checkbox table in the textual electronic document based on the text blocks;
determining, by the server computing device, a table boundary corresponding to the at least one borderless checkbox table based on the table headers;
identifying, by the server computing device, a plurality of table rows and a plurality of table columns corresponding to the at least one borderless checkbox table based on the table boundary and the plurality of checkboxes;
identifying, by the server computing device, a plurality of table cells corresponding to the at least one borderless checkbox table based on the plurality of table rows and the plurality of table columns; and
generating, by the server computing device, a data structure comprising data representing the at least one borderless checkbox table based on at least the plurality of table cells and the plurality of checkboxes.
|
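The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the claim does not say how headers, boundaries, rows, or columns are determined, so every heuristic here is an assumption made for illustration. The names `Checkbox`, `TextBlock`, `extract_checkbox_table`, `header_labels`, and `tol` are hypothetical. The sketch assumes checkboxes and text blocks have already been detected and reduced to centre coordinates (y growing downward), that headers are the text blocks matching a caller-supplied label set, that rows are found by grouping checkboxes with similar y values, and that each checkbox belongs to the column of the nearest header.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkbox:
    x: float       # centre x in page coordinates (assumed input format)
    y: float       # centre y; y is assumed to grow downward
    checked: bool


@dataclass(frozen=True)
class TextBlock:
    text: str
    x: float
    y: float


def extract_checkbox_table(checkboxes, text_blocks, header_labels, tol=5.0):
    """Illustrative pipeline: headers -> boundary -> rows/columns -> cells."""
    # Identify table headers (assumption: exact match against known labels).
    headers = sorted((b for b in text_blocks if b.text in header_labels),
                     key=lambda b: b.x)
    if not headers:
        return None

    # Determine the table boundary from the headers: the horizontal extent
    # comes from the outermost headers, the top edge from the header line.
    left, right = headers[0].x - tol, headers[-1].x + tol
    top = min(h.y for h in headers)
    inside = sorted((c for c in checkboxes
                     if left <= c.x <= right and c.y > top),
                    key=lambda c: c.y)
    bottom = inside[-1].y + tol if inside else top

    # Identify rows: checkboxes within `tol` of a row's first checkbox
    # (vertically) are treated as belonging to that row.
    row_groups = []
    for box in inside:
        if row_groups and box.y - row_groups[-1][0].y <= tol:
            row_groups[-1].append(box)
        else:
            row_groups.append([box])

    # Identify cells: one cell per (row, header column). A checkbox is
    # assigned to the column whose header is horizontally nearest.
    rows = []
    for group in row_groups:
        row_y = group[0].y
        cells = {h.text: None for h in headers}  # None = no checkbox found
        for box in group:
            nearest = min(headers, key=lambda h: abs(h.x - box.x))
            cells[nearest.text] = box.checked
        # Row label: text to the left of the boundary on the same line.
        label_blocks = sorted((b for b in text_blocks
                               if b.x < left and abs(b.y - row_y) <= tol),
                              key=lambda b: b.x)
        rows.append({"label": " ".join(b.text for b in label_blocks),
                     "cells": cells})

    # Generate the data structure representing the table.
    return {"headers": [h.text for h in headers],
            "boundary": (left, top, right, bottom),
            "rows": rows}
```

Calling `extract_checkbox_table(boxes, blocks, {"Yes", "No"})` on a form with "Yes" and "No" column headings would return a dictionary with one entry per detected row, each mapping a header to the checked state of its checkbox. A production system would likely need more robust header detection and handling of multi-line labels, which this sketch omits.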