US 12,277,231 B2
System and method for identifying cyberthreats from unstructured social media content
Daniel Clark Salo, Sunnyvale, CA (US)
Assigned to PROOFPOINT, INC., Sunnyvale, CA (US)
Filed by Proofpoint, Inc., Sunnyvale, CA (US)
Filed on Jan. 22, 2024, as Appl. No. 18/419,118.
Application 18/419,118 is a continuation of application No. 18/169,627, filed on Feb. 15, 2023, granted, now 11,934,535.
Application 18/169,627 is a continuation of application No. 16/823,090, filed on Mar. 18, 2020, granted, now 11,586,739, issued on Feb. 21, 2023.
Claims priority of provisional application 62/955,595, filed on Dec. 31, 2019.
Prior Publication US 2024/0184893 A1, Jun. 6, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/00 (2019.01); G06F 16/338 (2019.01); G06F 16/355 (2025.01); G06F 16/36 (2019.01); G06F 21/57 (2013.01)
CPC G06F 21/577 (2013.01) [G06F 16/338 (2019.01); G06F 16/355 (2019.01); G06F 16/36 (2019.01)] 20 Claims
1. A method, comprising:
monitoring, by a cyberthreat detection system, a target of interest in network communications to and from a digital medium, wherein the cyberthreat detection system comprises a rules database, wherein the monitoring comprises processing unstructured content in the network communications, and wherein the processing comprises:
determining, from the unstructured content, content items containing combinations of static keywords, dynamic keywords, or regular expressions that represent the target of interest;
clustering, based on the combinations of the static keywords, the dynamic keywords, or the regular expressions, the content items into clusters;
determining, from the clusters and utilizing vetted cybersecurity phrases, a cluster containing high precision phrases relating to the target of interest; and
updating the rules database utilizing the high precision phrases; and
classifying, utilizing classifier rules stored in the rules database, the unstructured content in the network communications to thereby identify which content items in the unstructured content that refer to the target of interest constitute cyberthreats.
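
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The keyword sets, the regular expression, the vetted phrases, the sample content items, and every function name are hypothetical. Two simplifications are assumed: content items are clustered by the exact combination of terms they match, and the cluster with the most vetted-phrase hits is treated as the one containing high precision phrases.

```python
import re
from collections import defaultdict

# Hypothetical inputs; none of these values come from the patent.
STATIC_KEYWORDS = {"examplebank"}
DYNAMIC_KEYWORDS = {"examplebank login"}
PATTERNS = [re.compile(r"examplebank\.\w+")]
VETTED_PHRASES = {"credential dump", "phishing kit"}


def match_signature(text):
    """Return the combination of keywords and patterns a content item matches."""
    lowered = text.lower()
    hits = {k for k in STATIC_KEYWORDS | DYNAMIC_KEYWORDS if k in lowered}
    hits |= {p.pattern for p in PATTERNS if p.search(lowered)}
    return frozenset(hits)


def cluster_items(items):
    """Group content items that share the same matched combination (assumed scheme)."""
    clusters = defaultdict(list)
    for text in items:
        signature = match_signature(text)
        if signature:  # items that never mention the target are dropped
            clusters[signature].append(text)
    return clusters


def high_precision_phrases(clusters):
    """Return the vetted phrases found in the cluster with the most vetted hits."""
    best = set()
    for texts in clusters.values():
        found = {p for p in VETTED_PHRASES for t in texts if p in t.lower()}
        if len(found) > len(best):
            best = found
    return best


def classify(items, rules):
    """Flag items that refer to the target and match at least one stored rule."""
    return [
        t for t in items
        if match_signature(t) and any(rule in t.lower() for rule in rules)
    ]


ITEMS = [
    "New phishing kit targets ExampleBank login pages",
    "Selling credential dump from examplebank.com, phishing kit included",
    "I love the ExampleBank mobile app",
    "Weather is nice today",
]

rules_db = set()  # stands in for the rules database
rules_db.update(high_precision_phrases(cluster_items(ITEMS)))
threats = classify(ITEMS, rules_db)
```

In this sketch the first two sample items are flagged, the third mentions the target but matches no rule, and the fourth never enters a cluster. A real system would presumably use a more robust clustering method and a persistent rules store. The claim does not specify either.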