US 12,348,560 B2
Detecting phishing PDFs with an image-based deep learning approach
Min Du, Santa Clara, CA (US); Hao Huang, San Jose, CA (US); Curtis Leland Carmony, Albuquerque, NM (US); Wenjun Hu, Santa Clara, CA (US); Daniel Raygoza, Seattle, WA (US); Tyler Pals Halfpop, Carmel, IN (US); Jeff White, Tampa, FL (US); and Esmid Idrizovic, Trumau (AT)
Assigned to Palo Alto Networks, Inc., Santa Clara, CA (US)
Filed by Palo Alto Networks, Inc., Santa Clara, CA (US)
Filed on May 2, 2022, as Appl. No. 17/734,956.
Claims priority of provisional application 63/334,574, filed on Apr. 25, 2022.
Prior Publication US 2023/0344867 A1, Oct. 26, 2023
Int. Cl. H04L 9/40 (2022.01); G06F 21/57 (2013.01)

CPC H04L 63/1483 (2013.01) [G06F 21/577 (2013.01)]

23 Claims

1. A system, comprising:

a processor configured to:

receive a Portable Document Format (PDF) document in response to a determination having been made that the PDF document includes at least one clickable link to a Uniform Resource Locator (URL);

determine a likelihood that the received PDF document represents a phishing threat, at least in part, using an image-based model that was previously trained, at least in part, using a plurality of images that were generated using one or more tools that collectively convert a set of PDF given document files to the plurality of images, wherein at least one given document file has a ground truth label of being a phishing PDF; and

provide as output a verdict for the PDF document based at least in part on the determined likelihood, wherein the verdict is usable by a security appliance to take a remedial action associated with the received PDF document; and

a memory coupled to the processor and configured to provide the processor with instructions.
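
For illustration only, the flow recited in claim 1 (gate on a clickable URL, score a rendered image with a trained model, emit a verdict) might be sketched as below. This is a minimal sketch under stated assumptions, not the patented implementation: the names `Verdict`, `has_clickable_url`, and `classify_pdf` are hypothetical, the `/URI` byte scan is a simplification that misses links inside compressed object streams, the 0.5 threshold is arbitrary, and the renderer and image-based model are supplied by the caller as plain callables.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Verdict:
    """Hypothetical output record a security appliance could act on."""
    label: str         # "phishing" or "benign"
    likelihood: float  # model score in [0, 1]


def has_clickable_url(pdf_bytes: bytes) -> bool:
    # Assumption: link annotations with URI actions carry the "/URI" key.
    # A raw byte scan only sees uncompressed objects; a real system would
    # parse the PDF properly.
    return b"/URI" in pdf_bytes


def classify_pdf(
    pdf_bytes: bytes,
    render_to_image: Callable[[bytes], Any],  # caller-supplied PDF-to-image tool
    score_image: Callable[[Any], float],      # caller-supplied trained image model
    threshold: float = 0.5,                   # assumed cut-off, not from the claim
) -> Optional[Verdict]:
    # Only documents already determined to contain a clickable URL are analyzed.
    if not has_clickable_url(pdf_bytes):
        return None
    image = render_to_image(pdf_bytes)
    likelihood = score_image(image)
    label = "phishing" if likelihood >= threshold else "benign"
    return Verdict(label=label, likelihood=likelihood)
```

In use, `render_to_image` would wrap whatever PDF-to-image tool is available and `score_image` would wrap the trained model's inference call; a `None` result means the document was never submitted for image-based analysis.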