| CPC G06N 20/00 (2019.01) [H04L 51/42 (2022.05)] | 19 Claims |

|
1. A method of labeling email, the method comprising:
receiving an unlabeled email;
identifying a feature and a second feature of the unlabeled email, wherein the feature and the second feature do not include personally identifiable information (“PII”) in a body of the unlabeled email;
receiving a labeled cluster comprising an email-category label and seed data;
assigning the email-category label to the unlabeled email based on a first derivative edge in an expansion graph thereby creating a labeled email, wherein the first derivative edge is a directional edge from the labeled cluster to the feature and the first derivative edge represents first inference logic that the email-category label associated with the labeled cluster is also associated with the unlabeled email;
including the labeled email in a training dataset;
assigning the email-category label to the second feature based on a clustering edge in the expansion graph, wherein the clustering edge is a directional edge from the feature to the second feature and the clustering edge represents second inference logic that a label associated with the feature is also associated with the second feature;
assigning the email-category label to a second unlabeled email based on a second derivative edge in the expansion graph thereby creating a second labeled email, wherein the second derivative edge is a directional edge from the second feature to the second unlabeled email and the second derivative edge represents third inference logic that the email-category label associated with the second feature is also associated with the second unlabeled email;
including the second labeled email in the training dataset;
training a machine learning model to classify email with the training dataset;
classifying a received email with the machine learning model as spam; and
quarantining the received email on a server without downloading to a local computer.
|
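The label-propagation steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the node names, the `propagate_labels` helper, and the breadth-first traversal are all assumptions made for illustration. The sketch also assumes an extra derivative edge from the first feature to the first unlabeled email, so that every assignment can be expressed as one directed hop. Training the model and quarantining are outside its scope.

```python
from collections import deque

def propagate_labels(edges, seed_labels):
    """Copy each seed label along directed edges (breadth-first).

    edges: iterable of (source, target, kind) tuples; kind is descriptive only.
    seed_labels: dict mapping already-labeled nodes to their label.
    """
    adjacency = {}
    for source, target, _kind in edges:
        adjacency.setdefault(source, []).append(target)

    labels = dict(seed_labels)
    queue = deque(seed_labels)  # start from the labeled cluster(s)
    while queue:
        node = queue.popleft()
        for target in adjacency.get(node, []):
            if target not in labels:  # first label to arrive wins
                labels[target] = labels[node]
                queue.append(target)
    return labels

# Hypothetical expansion graph; all node names are illustrative.
edges = [
    ("cluster:seed", "feature:A", "derivative"),  # labeled cluster -> feature
    ("feature:A", "email:1", "derivative"),       # assumed hop to the first email
    ("feature:A", "feature:B", "clustering"),     # feature -> second feature
    ("feature:B", "email:2", "derivative"),       # second feature -> second email
]

labels = propagate_labels(edges, {"cluster:seed": "spam"})

# Only email nodes go into the training dataset.
training_dataset = sorted(
    (node, label) for node, label in labels.items() if node.startswith("email:")
)
print(training_dataset)
```

In this sketch the features are opaque identifiers, which matches the claim's requirement that they exclude PII from the email body; how such features are actually extracted is not shown.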