US 12,236,183 B2
Methods and apparatus for selecting, highlighting and/or processing text included in a PDF document
Shayne Fitzgerald, Austin, TX (US); Charlie Davis, Tampa, FL (US); Harry Arnold Epperson, IV, Tampa, FL (US); Alexandra Debish, Tampa, FL (US); Kiril Vatev, Brooklyn, NY (US); Stephen C. Brooks, Sparta, NC (US); Zach Roach, Auburndale, FL (US); Kiefer Sivitz, Tampa, FL (US); and Cody Owens, Vancouver, WA (US)
Assigned to Accusoft Corporation, Tampa, FL (US)
Filed by Accusoft Corporation, Tampa, FL (US)
Filed on Jan. 26, 2024, as Appl. No. 18/424,718.
Application 18/424,718 is a continuation of application No. 17/896,695, filed on Aug. 26, 2022, abandoned.
Prior Publication US 2024/0169144 A1, May 23, 2024
Int. Cl. G06F 40/166 (2020.01); G06F 40/106 (2020.01); G06F 40/14 (2020.01); G06F 40/154 (2020.01)

CPC G06F 40/166 (2020.01) [G06F 40/106 (2020.01); G06F 40/14 (2020.01); G06F 40/154 (2020.01)]

14 Claims

1. A method of operating a device including a processor and display, the method comprising:

performing an extracted text element to a Document Object Model (DOM) element matching operation to identify individual text and DOM elements corresponding to a Portable Document Format (PDF) page of a document which match, said matching operation matching a first extracted text element to a first DOM element;

generating a first synthesized text element including information from said first extracted text element and said first DOM element;

storing said first synthesized text element in a data structure based on information indicating the position at which text included in the first synthesized text element is to be displayed in a rendered image of said PDF page, storing said first synthesized text element in a data structure including storing said first synthesized text element as a node in a k-dimensional (K-D) tree;

receiving information indicating a selection starting point in a rendered image of the PDF page and a selection stopping point in the rendered image of the PDF page; and

using the information in the first synthesized text element to determine which text was selected on said PDF page by a user selection operation, said step of using the information in the first synthesized text element to determine which text was selected on said PDF page including accessing the K-D tree to identify synthesized text elements having rendered image locations which fall fully or partially within a rectangular selection bounding box defined by said selection starting point and said selection stopping point.
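
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is one possible reading under stated assumptions, not the implementation described in the patent. The names `SynthesizedText`, `synthesize`, `build`, `query` and `select` are hypothetical, as is the rule of matching an extracted text element to a DOM element by exact text equality. The sketch also assumes that each synthesized element is keyed in a four-dimensional K-D tree by its rendered bounding box (x0, y0, x1, y1), so that "fully or partially within" the selection box becomes an orthogonal range query; boxes that merely touch the selection edge are counted as selected.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in rendered-image pixels


@dataclass(frozen=True)
class SynthesizedText:
    """Hypothetical merged record: extracted text, matched DOM element id, rendered box."""
    text: str
    dom_id: str
    box: Box


@dataclass
class Node:
    item: SynthesizedText
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def synthesize(extracted: List[Tuple[str, Box]],
               dom: List[Tuple[str, str]]) -> List[SynthesizedText]:
    """Match extracted (text, box) pairs to DOM (id, text) pairs.

    Assumption: a match is exact text equality; unmatched elements are skipped.
    """
    by_text = {text: dom_id for dom_id, text in dom}
    return [SynthesizedText(t, by_text[t], box) for t, box in extracted if t in by_text]


def build(items: List[SynthesizedText], depth: int = 0) -> Optional[Node]:
    """Build a 4-D K-D tree by splitting on the median of one box coordinate per level."""
    if not items:
        return None
    axis = depth % 4
    ordered = sorted(items, key=lambda it: it.box[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build(ordered[:mid], depth + 1),
                build(ordered[mid + 1:], depth + 1))


def query(node: Optional[Node], lo: Box, hi: Box, out: List[SynthesizedText]) -> None:
    """Collect every item whose box coordinates lie in [lo[i], hi[i]] on all four axes."""
    if node is None:
        return
    p = node.item.box
    if all(lo[i] <= p[i] <= hi[i] for i in range(4)):
        out.append(node.item)
    # The left subtree holds values <= the split value, the right subtree values >= it.
    if lo[node.axis] <= p[node.axis]:
        query(node.left, lo, hi, out)
    if hi[node.axis] >= p[node.axis]:
        query(node.right, lo, hi, out)


def select(root: Optional[Node], start: Tuple[float, float],
           stop: Tuple[float, float]) -> List[SynthesizedText]:
    """Return elements overlapping the rectangle spanned by the start and stop points."""
    sx0, sx1 = sorted((start[0], stop[0]))
    sy0, sy1 = sorted((start[1], stop[1]))
    inf = float("inf")
    out: List[SynthesizedText] = []
    # Overlap test: x0 <= sx1, y0 <= sy1, x1 >= sx0, y1 >= sy0.
    query(root, (-inf, -inf, sx0, sy0), (sx1, sy1, inf, inf), out)
    return sorted(out, key=lambda it: (it.box[1], it.box[0]))  # top-to-bottom, left-to-right


extracted = [("Hello", (10, 10, 50, 20)), ("world", (55, 10, 95, 20)),
             ("Footer", (10, 200, 60, 210))]
dom = [("span-0", "Hello"), ("span-1", "world"), ("span-2", "Footer")]
root = build(synthesize(extracted, dom))
selected = select(root, (40, 5), (70, 25))
print([s.text for s in selected])  # ['Hello', 'world']
```

Keying the tree on all four box coordinates, rather than on a single point per element, lets the tree itself answer the partial-overlap query without a separate pass to catch elements that straddle the selection boundary. Other keyings, such as a two-dimensional tree over box centers followed by an exact overlap filter, would serve the same purpose.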