US 12,333,844 B2
Extracting document hierarchy using a multimodal, layer-wise link prediction neural network
Vlad Morariu, Potomac, MD (US); Puneet Mathur, College Park, MD (US); Rajiv Jain, Vienna, VA (US); Ashutosh Mehra, Noida (IN); Jiuxiang Gu, College Park, MD (US); Franck Dernoncourt, Sunnyvale, CA (US); Anandhavelu N, Kangayam (IN); Quan Tran, San Jose, CA (US); Verena Kaynig-Fittkau, Cambridge, MA (US); Nedim Lipka, Santa Clara, CA (US); and Ani Nenkova, Philadelphia, PA (US)
Assigned to Adobe Inc., San Jose, CA (US)
Filed by Adobe Inc., San Jose, CA (US)
Filed on Nov. 15, 2022, as Appl. No. 18/055,752.
Prior Publication US 2024/0161529 A1, May 16, 2024
Int. Cl. G06V 30/413 (2022.01); G06V 10/82 (2022.01)

CPC G06V 30/413 (2022.01) [G06V 10/82 (2022.01)]

20 Claims

1. A method comprising:

generating feature embeddings from visual elements of a digital document image;

generating a digital document hierarchy comprising layers of parent-child element relationships from the visual elements by, for a layer of the layers:

determining, from the visual elements, child visual elements and candidate parent visual elements for the child visual elements;

generating, from the feature embeddings utilizing a neural network, element classifications for the candidate parent visual elements and parent-child element link probabilities for the candidate parent visual elements and the child visual elements; and

selecting, from the candidate parent visual elements and based on the parent-child element link probabilities, parent visual elements for the child visual elements; and

utilizing the digital document hierarchy to generate an interactive digital document from the digital document image.
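
The layer-wise parent selection recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claimed neural network is replaced here by a plain dot-product scorer followed by a softmax, the element classification output is omitted, and every name (`select_parents`, `build_hierarchy`, `dot_score`, the element identifiers, the two-dimensional embeddings) is a hypothetical chosen for illustration only.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Embedding = Sequence[float]


def softmax(scores: List[float]) -> List[float]:
    """Turn raw link scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_score(child: Embedding, parent: Embedding) -> float:
    """Placeholder scorer (assumption): stands in for the neural network."""
    return sum(c * p for c, p in zip(child, parent))


def select_parents(
    children: List[str],
    candidates: List[str],
    embeddings: Dict[str, Embedding],
    score: Callable[[Embedding, Embedding], float] = dot_score,
) -> Dict[str, Tuple[str, float]]:
    """For each child, pick the candidate parent with the highest link probability."""
    links = {}
    for child in children:
        probs = softmax([score(embeddings[child], embeddings[p]) for p in candidates])
        best = max(range(len(candidates)), key=probs.__getitem__)
        links[child] = (candidates[best], probs[best])
    return links


def build_hierarchy(
    layers: List[Tuple[List[str], List[str]]],
    embeddings: Dict[str, Embedding],
) -> Dict[str, str]:
    """Process the layers in order and record one parent per child element."""
    hierarchy = {}
    for children, candidates in layers:
        for child, (parent, _prob) in select_parents(children, candidates, embeddings).items():
            hierarchy[child] = parent
    return hierarchy


# Hypothetical toy data: two layers, each a (children, candidate parents) pair.
embeddings = {
    "word_a": [1.0, 0.0],
    "word_b": [0.0, 1.0],
    "line_1": [2.0, 0.0],
    "line_2": [0.0, 2.0],
    "para_1": [1.0, 1.0],
}
layers = [
    (["word_a", "word_b"], ["line_1", "line_2"]),
    (["line_1", "line_2"], ["para_1"]),
]
hierarchy = build_hierarchy(layers, embeddings)
print(hierarchy)
```

In this toy run each child is linked to the candidate whose embedding scores highest against its own, and the per-layer results accumulate into a single child-to-parent map that a downstream step could use to assemble an interactive document.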