US 12,248,747 B2
Device dependent rendering of PDF content
Erik Allan Juhl, København (DK); and Anders Peter Fugmann, Værløse (DK)
Assigned to ISSUU, INC., Palo Alto, CA (US)
Filed by ISSUU, INC., Palo Alto, CA (US)
Filed on Dec. 12, 2023, as Appl. No. 18/537,378.
Application 18/537,378 is a continuation of application No. 17/888,367, filed on Aug. 15, 2022, granted, now 11,842,141.
Application 17/888,367 is a continuation of application No. 17/099,441, filed on Nov. 16, 2020, granted, now 11,416,671, issued on Aug. 16, 2022.
Prior Publication US 2024/0119218 A1, Apr. 11, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 17/00 (2019.01); G06F 40/109 (2020.01); G06F 40/30 (2020.01)
CPC G06F 40/109 (2020.01) [G06F 40/30 (2020.01)]
21 Claims
 
1. A method of device-dependent display of an article from a PDF file that has multiple columns in at least parts of the article, the method including:
using a library to render the article from the PDF file, including rendering of a plurality of bounding boxes, positioned at on-page coordinates, that contain one or more images and multiple text blocks of glyphs, with font information for the glyphs;
setting a reading order of the article after the rendering, including pulling out text blocks spanning more than half of a width of a page and pulling out images, then reflowing the text blocks to produce the reading order;
merging the text blocks as they appear in the reading order into one or more paragraphs of text using the font information and using starting and ending positions of horizontally arranged text elements in the text blocks to delimit the paragraphs;
inferring semantic information about typographic roles of the paragraphs in the merged text blocks from at least the font information, including font name and font size distribution for sequences of the glyphs; and
causing display of the article in a device-dependent format, including the merged text blocks, using the semantic information and the reading order.
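
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation and is not taken from the specification. Everything in it is an assumption made for illustration: the `Line` and `Block` records standing in for the output of a PDF rendering library, the y-axis growing downward, the rule that wide blocks split the page into bands whose narrow blocks are read left column first, the 2-point tolerance, the indent and short-line tests used to delimit paragraphs, and the mapping of roles to HTML tags.

```python
import html
from collections import Counter
from dataclasses import dataclass

@dataclass
class Line:                      # hypothetical record for one horizontal run of glyphs
    text: str
    x0: float                    # starting position (left edge)
    x1: float                    # ending position (right edge)
    font: str
    size: float

@dataclass
class Block:                     # hypothetical bounding box from the renderer
    x0: float
    y0: float                    # top edge; y is assumed to grow downward
    x1: float
    lines: list
    is_image: bool = False

def reading_order(blocks, page_width):
    """Pull out images and wide text blocks, then reflow the narrow blocks."""
    images = [b for b in blocks if b.is_image]
    text = [b for b in blocks if not b.is_image]
    wide = sorted((b for b in text if b.x1 - b.x0 > page_width / 2), key=lambda b: b.y0)
    narrow = [b for b in text if b.x1 - b.x0 <= page_width / 2]
    ordered, top = [], float("-inf")
    limits = [w.y0 for w in wide] + [float("inf")]
    for i, limit in enumerate(limits):
        # Assumed rule: within a band, left column first, then top to bottom.
        band = [b for b in narrow if top <= b.y0 < limit]
        band.sort(key=lambda b: (b.x0 >= page_width / 2, b.y0))
        ordered.extend(band)
        if i < len(wide):
            ordered.append(wide[i])
        top = limit
    return ordered, images

def merge_paragraphs(ordered, tol=2.0):
    """Join lines into paragraphs using font changes and start/end positions."""
    paragraphs, prev, prev_short = [], None, False
    for block in ordered:
        for line in block.lines:
            indented = line.x0 > block.x0 + tol
            same_font = prev is not None and (prev.font, prev.size) == (line.font, line.size)
            if prev is None or not same_font or indented or prev_short:
                paragraphs.append({"text": line.text, "font": line.font, "size": line.size})
            else:
                paragraphs[-1]["text"] += " " + line.text
            prev_short = line.x1 < block.x1 - tol   # a short line ends a paragraph
            prev = line
    return paragraphs

def infer_roles(paragraphs):
    """Treat the font/size carrying the most characters as body text."""
    weight = Counter()
    for p in paragraphs:
        weight[(p["font"], p["size"])] += len(p["text"])
    (body_font, body_size), _ = weight.most_common(1)[0]
    for p in paragraphs:
        if p["size"] > body_size:
            p["role"] = "heading"
        elif p["size"] < body_size:
            p["role"] = "caption"
        elif p["font"] != body_font:
            p["role"] = "subheading"
        else:
            p["role"] = "body"
    return paragraphs

TAGS = {"heading": "h1", "subheading": "h2", "caption": "figcaption", "body": "p"}

def to_html(paragraphs):
    """Emit reflowable markup; the viewing device decides the final layout."""
    return "\n".join(
        f"<{TAGS[p['role']]}>{html.escape(p['text'])}</{TAGS[p['role']]}>" for p in paragraphs
    )

# Usage on a made-up two-column page that is 600 units wide.
title = Block(50, 50, 550, [Line("Title", 50, 200, "Serif-Bold", 24)])
left = Block(50, 100, 290, [Line("First paragraph starts", 60, 290, "Serif", 10),
                            Line("and continues across", 50, 290, "Serif", 10)])
right = Block(310, 100, 550, [Line("the column break.", 310, 400, "Serif", 10),
                              Line("Second paragraph here.", 320, 480, "Serif", 10)])
img = Block(100, 300, 500, [], is_image=True)

ordered, images = reading_order([right, img, left, title], page_width=600)
paragraphs = infer_roles(merge_paragraphs(ordered))
html_out = to_html(paragraphs)
```

In this example the title spans more than half the page width, so it is pulled out and placed ahead of the two columns. The sentence that breaks across the column boundary is merged into a single paragraph because its lines share a font and the last line of the left column reaches the column's right edge. The character-weighted font count marks the 10-point font as body text and the 24-point line as a heading.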