US 11,734,939 B2
Vision-based cell structure recognition using hierarchical neural networks and cell boundaries to structure clustering
Xin Ru Wang, San Jose, CA (US); Douglas R. Burdick, San Jose, CA (US); and Xinyi Zheng, Ann Arbor, MI (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Nov. 18, 2021, as Appl. No. 17/529,618.
Application 17/529,618 is a continuation of application No. 16/847,792, filed on Apr. 14, 2020, granted, now 11,222,201.
Prior Publication US 2022/0076012 A1, Mar. 10, 2022
Int. Cl. G06V 30/412 (2022.01); G06T 7/10 (2017.01); G06V 30/416 (2022.01); G06V 30/24 (2022.01); G06V 30/413 (2022.01); G06N 3/045 (2023.01); G06V 10/762 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 10/44 (2022.01)
CPC G06V 30/412 (2022.01) [G06N 3/045 (2023.01); G06T 7/10 (2017.01); G06V 10/454 (2022.01); G06V 10/763 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 30/248 (2022.01); G06V 30/413 (2022.01); G06V 30/416 (2022.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06V 30/2528 (2022.01)]
14 Claims
1. A computer-implemented method for use with a given table in a document, the method comprising:
detecting a style of the given table using at least one style classification model, wherein the at least one style classification model comprises at least one deep neural network trained on multiple tables comprising multiple formatting attributes;
selecting, based at least in part on the detected style, a cell detection model appropriate for the detected style;
detecting cells within the given table using the selected cell detection model; and
outputting, to at least one user, information pertaining to the detected cells, the information comprising image coordinates of one or more bounding boxes associated with the detected cells; and
converting at least a portion of the one or more bounding boxes into a logical structure, wherein converting comprises aligning the at least a portion of the one or more bounding boxes to one or more text lines of the given table;
wherein the method is carried out by at least one computing device.
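
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The style classifier and the per-style cell detectors are passed in as plain callables that stand in for the trained neural networks, and every name here (TextLine, select_cell_detector, align_boxes_to_text_lines, recognize_table) is hypothetical. The claim does not state how bounding boxes are aligned to text lines; the sketch assumes a simple rule that assigns each box to the text line with which it has the greatest vertical overlap.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# (x0, y0, x1, y1) in image coordinates; an assumed convention, not from the claim.
Box = Tuple[float, float, float, float]
Detector = Callable[[object], List[Box]]


@dataclass
class TextLine:
    """Vertical extent of one text line of the table (hypothetical type)."""
    y0: float
    y1: float


def select_cell_detector(style: str, detectors: Dict[str, Detector]) -> Detector:
    """Pick the cell detection model registered for the detected style."""
    return detectors[style]


def vertical_overlap(box: Box, line: TextLine) -> float:
    """Length of the vertical intersection of a box and a text line (0 if disjoint)."""
    return max(0.0, min(box[3], line.y1) - max(box[1], line.y0))


def align_boxes_to_text_lines(boxes: List[Box], lines: List[TextLine]) -> List[List[Box]]:
    """Assumed alignment rule: put each box in the row of the text line it
    overlaps most, then order each row left to right. A box that overlaps
    no line falls into the first row."""
    if not lines:
        return []
    rows: List[List[Box]] = [[] for _ in lines]
    for box in boxes:
        best = max(range(len(lines)), key=lambda i: vertical_overlap(box, lines[i]))
        rows[best].append(box)
    return [sorted(row, key=lambda b: b[0]) for row in rows]


def recognize_table(image, classify_style: Callable[[object], str],
                    detectors: Dict[str, Detector], lines: List[TextLine]):
    """Detect style, select a detector for it, detect cells, and build rows."""
    style = classify_style(image)                        # detecting a style
    detector = select_cell_detector(style, detectors)    # selecting a model
    boxes = detector(image)                              # detecting cells
    grid = align_boxes_to_text_lines(boxes, lines)       # converting to structure
    return boxes, grid                                   # boxes are the output coordinates
```

With stub callables in place of the models, recognize_table(image, classify_style, detectors, lines) returns the raw bounding boxes together with a list of rows, one per text line, each ordered by left edge. A real system would also need column assignment and handling of cells that span several text lines, which this sketch leaves out.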