| CPC G06Q 30/0205 (2013.01) [G06Q 20/401 (2013.01); G06Q 20/4014 (2013.01); G06Q 30/0204 (2013.01); G06Q 30/0229 (2013.01); G06Q 30/0269 (2013.01); G06Q 30/0631 (2013.01); H04L 67/306 (2013.01)] | 23 Claims |

|
1. A method comprising:
receiving a request, from a requesting entity, for a user identity at an identity management platform associated with an enterprise, the identity management platform maintaining an identity graph of a plurality of users, wherein the user identity platform includes a random forest classifier model;
receiving a set of training data including transaction data, pairs of known correlated user profile nodes, and pairs of known non-correlated nodes;
generating a set of values based on the training data to define account to transaction correlations for each of a plurality of pairs of store and transaction data collections, wherein the set of values are one or more of a binary, integer, numerical, or string value and each collection represents a separate user profile node;
training the random forest classifier model with the set of values to enable the random forest classifier to identify a likely correlation across two accounts in response to receiving an identification of two nodes or two sets of transaction data, wherein the random forest classifier model is configured to generate a plurality of parallel probability analyses and includes an aggregation layer configured to output a probability score representing a normalized likelihood of similarity between two nodes;
identifying, in response to the request, using at least the trained random forest classifier model, a user cluster within the identity graph associated with a user identifiable via the request, the user cluster including one or more user profile nodes, wherein the user cluster includes a plurality of edge connections including at least one identity edge connection linking between two user profile nodes of the user cluster, each of the user profile nodes being associated with a user account established with the enterprise and having a node confidence associated therewith;
identifying, using at least the trained random forest classifier model, at least one of the one or more user profile nodes based on whether a cluster edge confidence associated with the one or more user profile nodes included within the user cluster meets a threshold confidence level, the threshold confidence level being based, at least in part, on the request, and the cluster edge confidence being based in part on the node confidence; and
transmitting, to the requesting entity, an identification of the at least one user profile node that meets the threshold confidence level.
|
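The last three elements of claim 1 (scoring node pairs, comparing a cluster edge confidence against a request-dependent threshold, and returning the qualifying nodes) can be illustrated with a minimal sketch. It is not the patented implementation. The claim says only that an aggregation layer outputs a normalized probability score, that the edge confidence is based in part on the node confidence, and that the threshold is based at least in part on the request. The code assumes the rest: averaging per-tree outputs, scaling the pair score by the weaker node confidence, and a lookup table keyed by request purpose. All names (`ProfileNode`, `THRESHOLD_BY_PURPOSE`, `select_nodes`, and so on) are hypothetical, and the per-tree outputs are hand-written stand-ins for what a trained random forest would produce.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical mapping from request purpose to a required confidence level.
# The claim only says the threshold is based, at least in part, on the request.
THRESHOLD_BY_PURPOSE = {"marketing": 0.5, "payment_auth": 0.9}


@dataclass(frozen=True)
class ProfileNode:
    node_id: str
    node_confidence: float  # confidence in the node itself, in [0, 1]


def aggregate_votes(tree_probabilities):
    """Average per-tree probabilities into one normalized score in [0, 1].

    Averaging is an assumed aggregation rule, common for random forests.
    """
    return mean(tree_probabilities)


def edge_confidence(pair_score, a, b):
    """Assumed rule: scale the pair score by the weaker node confidence."""
    return pair_score * min(a.node_confidence, b.node_confidence)


def select_nodes(anchor, cluster, tree_outputs, purpose):
    """Return ids of cluster nodes whose edge confidence meets the threshold."""
    threshold = THRESHOLD_BY_PURPOSE[purpose]
    selected = []
    for node in cluster:
        score = aggregate_votes(tree_outputs[node.node_id])
        if edge_confidence(score, anchor, node) >= threshold:
            selected.append(node.node_id)
    return selected


anchor = ProfileNode("web_account", 1.0)
cluster = [ProfileNode("loyalty_card", 1.0), ProfileNode("guest_checkout", 0.5)]

# Stand-in per-tree outputs for each (anchor, node) pair; a trained forest
# would compute these from the account-to-transaction feature values.
tree_outputs = {
    "loyalty_card": [1.0, 1.0, 0.75, 1.0],
    "guest_checkout": [1.0, 1.0, 1.0, 1.0],
}

print(select_nodes(anchor, cluster, tree_outputs, "marketing"))
print(select_nodes(anchor, cluster, tree_outputs, "payment_auth"))
```

Under these assumptions, a low-stakes request returns both linked profiles, while a stricter request returns only the profile whose edge confidence survives the higher threshold. This shows how a single identity graph could answer different requesting entities with different levels of certainty.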