US 11,895,137 B2
Phishing data item clustering and analysis
David Cohen, Mountain View, CA (US); Jason Ma, Mountain View, CA (US); Bing Jie Fu, Redwood City, CA (US); Ilya Nepomnyashchiy, Mountain View, CA (US); Steven Berler, Menlo Park, CA (US); Alex Smaliy, Palo Alto, CA (US); Jack Grossman, Albuquerque, NM (US); James Thompson, London (GB); Julia Boortz, Menlo Park, CA (US); Matthew Sprague, Palo Alto, CA (US); Parvathy Menon, San Jose, CA (US); Michael Kross, Palo Alto, CA (US); Michael Harris, Palo Alto, CA (US); and Adam Borochoff, New York, NY (US)
Assigned to Palantir Technologies Inc., Denver, CO (US)
Filed by Palantir Technologies Inc., Denver, CO (US)
Filed on Dec. 2, 2022, as Appl. No. 18/061,195.
Application 18/061,195 is a continuation of application No. 17/003,398, filed on Aug. 26, 2020, granted, now 11,546,364.
Application 17/003,398 is a continuation of application No. 15/961,431, filed on Apr. 24, 2018, granted, now 10,798,116, issued on Oct. 6, 2020.
Application 15/961,431 is a continuation of application No. 14/487,021, filed on Sep. 15, 2014, granted, now 9,998,485, issued on Jun. 12, 2018.
Application 14/487,021 is a continuation of application No. 14/473,920, filed on Aug. 29, 2014, granted, now 9,965,937, issued on May 8, 2018.
Claims priority of provisional application 62/020,876, filed on Jul. 3, 2014.
Prior Publication US 2023/0096596 A1, Mar. 30, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. H04L 29/06 (2006.01); G06F 21/00 (2013.01); H04L 9/40 (2022.01); G06Q 40/12 (2023.01); G06F 16/28 (2019.01); G06F 21/56 (2013.01); G06F 16/951 (2019.01)
CPC H04L 63/1425 (2013.01) [G06F 16/285 (2019.01); G06Q 40/12 (2013.12); H04L 63/145 (2013.01); H04L 63/1408 (2013.01)]
21 Claims
 
1. A computer-implemented method comprising:
by one or more hardware computer processors executing code:
communicating with one or more electronic data structures configured to store:
a data clustering strategy; and
a plurality of data items including at least:
a plurality of email data items, each of the plurality of email data items including at least a subject and a sender, each of the plurality of email data items potentially associated with phishing activity; and
a plurality of phishing-related data items related to a communications network of an organization, the plurality of phishing-related data items including at least one of: internal Internet Protocol addresses of the communications network, computerized devices of the communications network, users of particular computerized devices, organizational positions associated with users of particular computerized devices, or URLs and/or external domains visited by users of particular computerized devices;
accessing an email data item transmitted to one or more of the users of respective computerized devices within the network of the organization, the email data item including at least a subject and a sender, the email data item potentially associated with phishing activity;
designating the accessed email data item as a seed; and
generating a data item cluster based on the data clustering strategy by at least:
adding the seed to the data item cluster;
determining the subject and the sender associated with the seed;
identifying one or more of the plurality of email data items having a same subject as the determined subject or a same sender as the determined sender;
adding the identified one or more email data items to the data item cluster;
parsing one or more URLs from the email data items of the data item cluster;
adding the parsed URLs to the data item cluster;
identifying one or more users who are both recipients of at least one of the email data items of the data item cluster and visitors of one of the URLs of the data item cluster;
adding the identified one or more users, including data related to the one or more users, to the data item cluster;
identifying additional one or more data items associated with any data items of the data item cluster; and
adding, to the data item cluster, the additional one or more data items.
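
The cluster-generation steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (Email, Cluster, generate_cluster, visits, user_info, related_items) is hypothetical, as is the choice of in-memory dictionaries for the electronic data structures and of a simple regular expression for parsing URLs. It only shows one possible order of operations: seed, subject or sender match, URL parsing, recipient and visitor intersection, then related data items.

```python
import re
from dataclasses import dataclass, field

# Assumption: a deliberately simple URL pattern; real email parsing is more involved.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")


@dataclass(frozen=True)
class Email:
    """Hypothetical email data item with the fields the claim relies on."""
    email_id: str
    subject: str
    sender: str
    recipients: tuple
    body: str


@dataclass
class Cluster:
    """Hypothetical container for a data item cluster."""
    emails: dict = field(default_factory=dict)   # email_id -> Email
    urls: set = field(default_factory=set)       # URLs parsed from clustered emails
    users: dict = field(default_factory=dict)    # user -> data related to that user
    related: set = field(default_factory=set)    # additional associated data items


def generate_cluster(seed, emails, visits, user_info, related_items):
    """Build a cluster from a seed email.

    visits:        user -> set of URLs that user visited (assumed lookup table)
    user_info:     user -> dict of data related to the user (assumed lookup table)
    related_items: key (email id, URL, or user) -> iterable of associated items
    """
    cluster = Cluster()

    # Add the seed and determine its subject and sender.
    cluster.emails[seed.email_id] = seed
    subject, sender = seed.subject, seed.sender

    # Add emails having the same subject or the same sender as the seed.
    for email in emails:
        if email.subject == subject or email.sender == sender:
            cluster.emails[email.email_id] = email

    # Parse URLs from the clustered emails and add them to the cluster.
    for email in cluster.emails.values():
        cluster.urls.update(URL_PATTERN.findall(email.body))

    # Add users who both received a clustered email and visited a clustered URL.
    recipients = {r for e in cluster.emails.values() for r in e.recipients}
    for user in recipients:
        if visits.get(user, set()) & cluster.urls:
            cluster.users[user] = user_info.get(user, {})

    # Add any further data items associated with items already in the cluster.
    keys = set(cluster.emails) | cluster.urls | set(cluster.users)
    for key in keys:
        cluster.related.update(related_items.get(key, ()))

    return cluster
```

In this sketch a recipient who never visited a parsed URL is left out of the cluster, which mirrors the "both recipients ... and visitors" condition of the claim. The final loop is a single pass; whether the association step is repeated is left open here.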