US 12,443,791 B2
Visual analysis for document import
Nathaniel McConathy, Lexington, KY (US)
Assigned to OPEN TEXT HOLDINGS, INC., Menlo Park, CA (US)
Filed by Open Text Holdings, Inc., Menlo Park, CA (US)
Filed on Oct. 3, 2023, as Appl. No. 18/480,282.
Prior Publication US 2025/0111138 A1, Apr. 3, 2025
Int. Cl. G06F 40/00 (2020.01); G06F 40/186 (2020.01); G06V 30/18 (2022.01); G06V 30/412 (2022.01); G06V 30/416 (2022.01)
CPC G06F 40/186 (2020.01) [G06V 30/18105 (2022.01); G06V 30/412 (2022.01); G06V 30/416 (2022.01)]
21 Claims
 
1. A computer-implemented method for automated visual analysis of documents to generate digital templates, the method comprising:
accessing a digital image of a document page;
accessing a background color definition;
tracking a current content state for analyzing the digital image, the current content state having a plurality of potential states comprising:
a first state indicating a collision with a non-background; and
a second state indicating no collision with the non-background;
testing a first plurality of test lines of pixels from the digital image against the background color definition to identify, from the first plurality of test lines of pixels, a first plurality of content state transition lines, each line in the first plurality of test lines of pixels extending in a first direction;
testing a second plurality of test lines of pixels from the digital image against the background color definition to identify, from the second plurality of test lines of pixels, a second plurality of content state transition lines, each line in the second plurality of test lines of pixels extending in a second direction;
identifying intersections between the first plurality of content state transition lines and the second plurality of content state transition lines;
determining an area of interest bounded by intersecting lines from the first plurality of content state transition lines and the second plurality of content state transition lines;
processing the area of interest to determine that the area of interest represents content; and
based on a determination that the area of interest represents content, storing the area of interest as a design element of a digital page template.
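
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not say how test lines are chosen, how a pixel is compared with the background color definition, or how an area of interest is judged to be content, so the sketch makes the following assumptions: the image is a list of rows of RGB tuples, the first and second directions are rows and columns, a pixel matches the background when every channel is within `tolerance` of it, and an area counts as content when it holds at least one non-background pixel. All names (`ContentState`, `is_background`, `transition_lines`, `find_design_elements`) are hypothetical.

```python
from enum import Enum


class ContentState(Enum):
    COLLISION = 1      # the test line hit at least one non-background pixel
    NO_COLLISION = 2   # the test line contained only background pixels


def is_background(pixel, background, tolerance=0):
    # Assumption: a per-channel tolerance stands in for the
    # "background color definition" of the claim.
    return all(abs(p - b) <= tolerance for p, b in zip(pixel, background))


def transition_lines(lines, background, tolerance=0):
    """Return the indices of test lines at which the content state changes."""
    state = ContentState.NO_COLLISION
    transitions = []
    for index, line in enumerate(lines):
        hit = any(not is_background(px, background, tolerance) for px in line)
        new_state = ContentState.COLLISION if hit else ContentState.NO_COLLISION
        if new_state is not state:
            transitions.append(index)
            state = new_state
    if state is ContentState.COLLISION:
        # Assumption: content touching the page edge is closed at the edge.
        transitions.append(len(lines))
    return transitions


def find_design_elements(image, background, tolerance=0):
    """Return bounding boxes of areas of interest that contain content."""
    rows = image                 # test lines extending in the first direction
    cols = list(zip(*image))     # test lines extending in the second direction
    row_edges = transition_lines(rows, background, tolerance)
    col_edges = transition_lines(cols, background, tolerance)

    elements = []
    # Adjacent pairs of transition lines intersect to bound candidate areas.
    for top, bottom in zip(row_edges, row_edges[1:]):
        for left, right in zip(col_edges, col_edges[1:]):
            area = [row[left:right] for row in image[top:bottom]]
            has_content = any(
                not is_background(px, background, tolerance)
                for area_row in area
                for px in area_row
            )
            if has_content:
                # Stored as a design element (bottom/right are exclusive).
                elements.append(
                    {"top": top, "left": left, "bottom": bottom, "right": right}
                )
    return elements
```

In this sketch the areas are cells of the grid formed by all transition lines, so a stored element can be slightly larger than the content it encloses. A fuller implementation might re-run the same analysis inside each cell to tighten the bounds; the claim text itself does not specify that step.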