CPC H04L 63/102 (2013.01) [G06N 20/00 (2019.01)] | 15 Claims |
1. A method, performed by a computer system, for automatically classifying user accounts in an entity's IT network, wherein the user accounts are classified using identity management key-value pairs from an identity management data structure, the method comprising:
training a statistical model to map individual identity management key-value pairs or sets of identity management key-value pairs to a probability of being associated with a service user account, wherein a key in the identity management key-value pair is a textual string that represents a field in a directory, maintained by an identity management system, comprising one or more accounts on the entity's IT network, wherein a value in the identity management key-value pair is a corresponding value to the field in the directory, and wherein the statistical model is trained using a set of inputs and a target variable and wherein training the model comprises:
parsing account data from an output text file stored in or hosted on the identity management system associated with user accounts manually classified as the service user accounts or human user accounts to obtain dynamically-specified identity management key-value pairs that are used as the inputs in the statistical model, and
setting the target variable in the statistical model to be whether the user account is a service user account;
using machine-learning-based modeling to automatically determine whether an unclassified user account is a service user account by performing the following:
identifying identity management key-value pairs, from the identity management system, associated with the unclassified user account,
representing the unclassified user account as an N-dimensional vector of the identity management key-value pairs, wherein N is the number of the identity management key-value pairs associated with the unclassified user account,
inputting the N-dimensional vector into the statistical model to calculate a probability that the unclassified user account is a service user account, and
in response to the probability exceeding a threshold, classifying the unclassified user account as a service user account;
using account classification results from the machine-learning-based modeling to construct and evaluate context-specific rules, wherein the context-specific rules identify one or more user accounts that are classified as service user account(s) but are known in the system to be human user account(s), wherein for the one or more user accounts that are classified as service user account(s) but are known in the system to be human user account(s), performing the following steps:
identifying a probability score associated with an equal error rate (EER), wherein the EER is the rate at which false positives equal false negatives,
setting the threshold to the probability score associated with the EER, and
in response to the probability exceeding the threshold, classifying the human user account(s) as service user account(s); and
using the context-specific rules to improve security analytics alert accuracy in an IT network.
|
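The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not name a model family or file layout, so the sketch assumes a Laplace-smoothed naive Bayes model over key-value pairs and an assumed export format of blank-line-separated records with `key: value` lines. All function names (`parse_records`, `train`, `service_probability`, `eer_threshold`), field names, and sample data are hypothetical. Each account is represented by the set of its N key-value pairs, which stands in for the N-dimensional vector.

```python
import math
from collections import Counter


def parse_records(text):
    """Parse blank-line-separated records of 'key: value' lines (assumed format)."""
    records = []
    for block in text.strip().split("\n\n"):
        pairs = set()
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                pairs.add((key.strip(), value.strip()))
        records.append(pairs)
    return records


def train(records, labels, alpha=1.0):
    """Fit smoothed per-pair log-likelihood ratios; a label of True means service account."""
    n_service = sum(labels)
    n_human = len(labels) - n_service
    service_counts, human_counts = Counter(), Counter()
    for pairs, is_service in zip(records, labels):
        (service_counts if is_service else human_counts).update(pairs)
    weights = {}
    for pair in set(service_counts) | set(human_counts):
        p_service = (service_counts[pair] + alpha) / (n_service + 2 * alpha)
        p_human = (human_counts[pair] + alpha) / (n_human + 2 * alpha)
        weights[pair] = math.log(p_service / p_human)
    prior = math.log((n_service + alpha) / (n_human + alpha))
    return prior, weights


def service_probability(model, pairs):
    """Map an account's key-value pairs to a probability of being a service account."""
    prior, weights = model
    # Pairs never seen in training contribute no evidence either way.
    log_odds = prior + sum(weights.get(pair, 0.0) for pair in pairs)
    return 1.0 / (1.0 + math.exp(-log_odds))


def eer_threshold(scores, labels):
    """Return the observed score at which false-positive and false-negative rates are closest."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best, best_gap = None, float("inf")
    for t in sorted(set(scores)):
        # An account is classified as a service account when its score exceeds t.
        fp = sum(1 for s, y in zip(scores, labels) if s > t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s <= t and y)
        gap = abs(fp / n_neg - fn / n_pos)
        if gap < best_gap:
            best, best_gap = t, gap
    return best


# Hypothetical manually classified accounts (True = service account).
SAMPLE = """\
objectClass: user
description: backup job
passwordNeverExpires: true

objectClass: user
department: finance
passwordNeverExpires: false

objectClass: user
description: backup job
passwordNeverExpires: true

objectClass: user
department: sales
passwordNeverExpires: false
"""
LABELS = [True, False, True, False]

records = parse_records(SAMPLE)
model = train(records, LABELS)
scores = [service_probability(model, r) for r in records]
threshold = eer_threshold(scores, LABELS)

# Classify a hypothetical unclassified account against the EER threshold.
unknown = {("objectClass", "user"), ("passwordNeverExpires", "true")}
is_service = service_probability(model, unknown) > threshold
```

In this sketch the threshold is picked from the scores of the labeled accounts themselves; a practical system would more plausibly compute it on held-out accounts, and the claim does not say which data the EER is measured on.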