US 12,073,171 B2
System and method for extracting website characteristics
Anthony Christopher Orciuoli, Redwood City, CA (US); Jacob Kuramoto, Portland, OR (US); and Mark Vilrokx, San Mateo, CA (US)
Assigned to Oracle International Corporation, Redwood Shores, CA (US)
Filed by Oracle International Corporation, Redwood Shores, CA (US)
Filed on Mar. 18, 2021, as Appl. No. 17/205,166.
Application 17/205,166 is a continuation of application No. 15/969,532, filed on May 2, 2018, now Pat. No. 10,984,166.
Claims priority of provisional application 62/566,082, filed on Sep. 29, 2017.
Prior Publication US 2021/0200931 A1, Jul. 1, 2021
Int. Cl. G06F 40/103 (2020.01); G06F 16/957 (2019.01); G06F 16/958 (2019.01); G06T 11/00 (2006.01); H04L 67/02 (2022.01)
CPC G06F 40/103 (2020.01) [G06F 16/957 (2019.01); G06F 16/958 (2019.01); G06T 11/001 (2013.01); H04L 67/02 (2013.01)]
24 Claims
 
1. One or more non-transitory computer-readable media storing instructions, which when executed by one or more hardware processors, cause performance of operations comprising:
identifying a plurality of webpages within a website;
identifying a first thematic characteristic detected on the plurality of webpages based at least on determining that a first number of webpages on which a first website characteristic is detected is two or more;
identifying the first thematic characteristic as a logo at least by performing operations comprising:
identifying a plurality of images among the plurality of webpages;
comparing a first size of a first image among the plurality of images to a threshold;
based on determining the first size of the first image exceeds the threshold: selecting the first image as a candidate for the logo;
comparing a second size of a second image among the plurality of images to the threshold;
based on determining the second size of the second image does not exceed the threshold: rejecting the second image as a candidate for the logo; and
based on selecting the first image as the candidate for the logo and rejecting the second image as the candidate for the logo: identifying the first image as the logo.
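
The operations recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The names `Image`, `thematic_characteristics`, and `identify_logo` are hypothetical, "size" is assumed to mean pixel area (the claim does not define it), and returning a logo only when exactly one candidate survives is an assumption made to mirror the one-selected, one-rejected case in the claim.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """Hypothetical record for an image found on a webpage."""
    url: str
    width: int
    height: int

    @property
    def size(self) -> int:
        # Assumption: size is measured as pixel area.
        return self.width * self.height


def thematic_characteristics(pages: dict[str, set[str]]) -> set[str]:
    """Return the characteristics detected on two or more webpages.

    `pages` maps a page URL to the set of characteristics found on it.
    """
    counts: dict[str, int] = defaultdict(int)
    for characteristics in pages.values():
        for characteristic in characteristics:
            counts[characteristic] += 1
    return {c for c, n in counts.items() if n >= 2}


def identify_logo(images: list[Image], threshold: int) -> Image | None:
    """Select images whose size exceeds the threshold and reject the rest.

    Assumption: a logo is returned only when exactly one candidate
    remains; the claim does not say how to choose among several.
    """
    candidates = [img for img in images if img.size > threshold]
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    pages = {
        "/": {"logo.png", "hero.jpg"},
        "/about": {"logo.png", "team.jpg"},
    }
    print(thematic_characteristics(pages))

    images = [Image("logo.png", 200, 80), Image("icon.png", 16, 16)]
    print(identify_logo(images, threshold=1000))
```

In this example the first image exceeds the threshold and is selected as a candidate, the second does not and is rejected, and the first image is then identified as the logo.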