US 12,254,422 B2
Systems and methods for data correlation and artifact matching in identity management artificial intelligence systems
Mohamed M. Badawy, Round Rock, TX (US); Rajat Kabra, Austin, TX (US); and Jostine Fei Ho, Austin, TX (US)
Assigned to SAILPOINT TECHNOLOGIES, INC., Wilmington, DE (US)
Filed by SailPoint Technologies, Inc., Wilmington, DE (US)
Filed on Mar. 8, 2024, as Appl. No. 18/600,241.
Application 18/600,241 is a continuation of application No. 17/891,639, filed on Aug. 19, 2022, granted, now 11,966,858.
Application 17/891,639 is a continuation of application No. 16/814,291, filed on Mar. 10, 2020, granted, now 11,461,677, issued on Oct. 4, 2022.
Prior Publication US 2024/0211782 A1, Jun. 27, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. H04L 9/40 (2022.01); G06F 16/23 (2019.01); G06F 21/34 (2013.01); G06F 21/45 (2013.01); G06N 5/04 (2023.01); G06N 20/00 (2019.01)
CPC G06N 5/04 (2013.01) [G06F 16/2379 (2019.01); G06F 21/34 (2013.01); G06F 21/45 (2013.01); G06N 20/00 (2019.01); H04L 63/0815 (2013.01)]
21 Claims
 
1. An identity management system, comprising:
a processor;
a non-transitory, computer-readable storage medium, including computer instructions for:
obtaining identity management data associated with a plurality of source systems, the identity management data comprising data on a set of identity management artifacts, wherein the plurality of source systems include a non-authoritative data source and an authoritative data source and the identity management data comprises account data associated with accounts from the non-authoritative data source and identity data on identities associated with the authoritative data source;
determining a first set of identifiers associated with data of the non-authoritative data source;
determining a second set of identifiers associated with data of the authoritative data source;
forming a set of feature pairs specific to the non-authoritative data source and the authoritative data source, wherein each feature pair of the set of feature pairs comprises a first identifier from the first set of identifiers and a second identifier from the second set of identifiers, and the set of feature pairs is formed by correlating the first set of identifiers with the second set of identifiers;
generating feature values for each of the feature pairs for a set of account-identity pairs, where each account-identity pair comprises a first account of the accounts of the account data associated with the non-authoritative data source and a first identity of the identities of the identity data associated with the authoritative data source, and generating a feature value for a feature pair is based on a first value associated with the first identifier of the feature pair associated with the first account and a second value for the second identifier of the feature pair associated with the first identity; and
generating predictions for one or more account-identity pairs using a machine learning (ML) model, wherein a prediction for an account-identity pair is based on the feature values associated with that account-identity pair, and wherein, when a prediction is over a threshold for the account-identity pair, the account of the account-identity pair is associated with the identity of the account-identity pair.
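
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything in it is an assumption made for illustration: the column names (`login`, `mail`, `username`, `email`, `dept`), the sample records, the value-overlap rule used to correlate identifiers, the `difflib` string similarity used as the feature value, the small logistic regression standing in for the ML model, and the 0.8 threshold. The claim itself does not specify any of these choices.

```python
from difflib import SequenceMatcher
from math import exp

def form_feature_pairs(accounts, identities, min_overlap=0.5):
    """Correlate account columns with identity columns by shared values (assumed rule)."""
    acct_cols = list(next(iter(accounts.values())))
    iden_cols = list(next(iter(identities.values())))
    pairs = []
    for a_col in acct_cols:
        a_vals = {str(r[a_col]).lower() for r in accounts.values()}
        for i_col in iden_cols:
            i_vals = {str(r[i_col]).lower() for r in identities.values()}
            if len(a_vals & i_vals) / len(a_vals) >= min_overlap:
                pairs.append((a_col, i_col))
    return pairs

def feature_values(account, identity, pairs):
    """One similarity score in [0, 1] per feature pair for an account-identity pair."""
    return [SequenceMatcher(None, str(account[a]).lower(),
                            str(identity[i]).lower()).ratio()
            for a, i in pairs]

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train(rows, labels, lr=0.2, epochs=3000):
    """Fit logistic regression with full-batch gradient descent; returns (weights, bias)."""
    w, b = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            gw = [g + err * xi for g, xi in zip(gw, x)]
            gb += err
        w = [wi - lr * g for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def predict(model, x):
    """Probability that the account-identity pair described by x is a match."""
    w, b = model
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

def correlate(accounts, identities, pairs, model, threshold=0.8):
    """Associate each account with its best-scoring identity if over the threshold."""
    result = {}
    for a_id, acct in accounts.items():
        score, i_id = max((predict(model, feature_values(acct, iden, pairs)), i_id)
                          for i_id, iden in identities.items())
        if score > threshold:
            result[a_id] = i_id
    return result

# Hypothetical sample data: accounts from a non-authoritative source,
# identities from an authoritative source.
accounts = {
    "a1": {"login": "jdoe", "mail": "jdoe@example.com"},
    "a2": {"login": "asmith", "mail": "asmith@example.com"},
    "a3": {"login": "bwayne", "mail": "bwayne@example.com"},
}
identities = {
    "i1": {"username": "jdoe", "email": "jdoe@example.com", "dept": "eng"},
    "i2": {"username": "asmith", "email": "asmith@example.com", "dept": "ops"},
    "i3": {"username": "bwayne", "email": "bwayne@example.com", "dept": "eng"},
}
known = {"a1": "i1", "a2": "i2"}  # already-correlated accounts, used as labels

pairs = form_feature_pairs(accounts, identities)
rows, labels = [], []
for a_id, true_id in known.items():
    for i_id in known.values():
        rows.append(feature_values(accounts[a_id], identities[i_id], pairs))
        labels.append(1.0 if i_id == true_id else 0.0)
model = train(rows, labels)
matches = correlate({"a3": accounts["a3"]}, identities, pairs, model)
print(pairs, matches)
```

In this sketch the feature pairs are discovered once per pair of sources, then reused for every account-identity pair, which mirrors the claim's distinction between forming feature pairs and generating feature values. Picking only the best-scoring identity per account is a design choice of the sketch; the claim only requires that a prediction over the threshold leads to an association.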