US 12,259,932 B2
Focused URL recrawl
Lei Zhang, San Jose, CA (US); Lin Xu, Saratoga, CA (US); Seokkyung Chung, San Jose, CA (US); and Xunhua Tong, Fremont, CA (US)
Assigned to Palo Alto Networks, Inc., Santa Clara, CA (US)
Filed by Palo Alto Networks, Inc., Santa Clara, CA (US)
Filed on Aug. 3, 2021, as Appl. No. 17/393,129.
Application 17/393,129 is a continuation of application No. 15/445,550, filed on Feb. 28, 2017, granted, now 11,216,513.
Prior Publication US 2021/0365503 A1, Nov. 25, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/951 (2019.01); G06F 16/28 (2019.01); G06F 16/335 (2019.01); G06F 16/353 (2025.01); G06F 16/955 (2019.01)
CPC G06F 16/951 (2019.01) [G06F 16/285 (2019.01); G06F 16/335 (2019.01); G06F 16/353 (2019.01); G06F 16/9566 (2019.01)] 19 Claims
 
1. A system, comprising:
a processor configured to:
in response to receiving a website misclassification report comprising a user-reported indication that access to a first URL, having an associated first domain, is erroneously blocked, wherein the first URL was previously assigned a categorization associated with at least a first subject matter topic based on content analysis performed by an original classification model, use a single page classifier to perform a recrawl-reclassification operation on the first URL using a current classification model that is an updated version of the original classification model, to determine at least one current subject matter topic and assign a current categorization for the first URL; and
determine that there is at least one discrepancy among at least two of: the first categorization, the current categorization, or information included in the misclassification report, and take a remedial action in response to the determination, wherein the remedial action includes at least one of: (1) initiating an escalation event when the first categorization and the result of the recrawl-reclassification operation are in agreement, or (2) assigning the current categorization to the URL when the current categorization is different from the first categorization and initiating a recrawl-reclassification operation on at least one additional domain; and
a memory coupled to the processor and configured to provide the processor with instructions.
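
The decision logic recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (`Report`, `Outcome`, `Action`, `handle_report`, `classify_page`, `find_related_domains`) is hypothetical and introduced only for illustration. The sketch assumes the single page classifier is supplied as a callable that fetches and classifies a URL with the current model, and that some separate callable chooses the additional domain(s) to recrawl; the claim does not say how those domains are selected.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Action(Enum):
    ESCALATE = "escalate"
    RECATEGORIZE = "recategorize_and_recrawl_additional_domains"


@dataclass
class Report:
    """Hypothetical misclassification report: the user says the URL is wrongly blocked."""
    url: str
    domain: str


@dataclass
class Outcome:
    action: Action
    categorization: str
    domains_to_recrawl: List[str] = field(default_factory=list)


def handle_report(
    report: Report,
    first_categorization: str,
    classify_page: Callable[[str], str],
    find_related_domains: Callable[[str], List[str]],
) -> Outcome:
    """Recrawl and reclassify the reported URL, then pick a remedial action."""
    # Recrawl-reclassification with the current (updated) model.
    current = classify_page(report.url)

    if current == first_categorization:
        # Original and current models agree, but the report disputes the
        # result: the discrepancy is with the report, so escalate.
        return Outcome(Action.ESCALATE, first_categorization)

    # The models disagree: adopt the current categorization and queue at
    # least one additional domain for recrawl-reclassification.
    # (How related domains are chosen is an assumption of this sketch.)
    return Outcome(Action.RECATEGORIZE, current, find_related_domains(report.domain))
```

In this sketch a report is always treated as disagreeing with the existing block, so the two branches correspond to remedial actions (1) and (2) of the claim. A fuller system would presumably also compare the category named in the report, if any, against both categorizations.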