US 12,282,859 B2
Method and system for predicting relevant network relationships
Nicholas Akbar Ablitt, Putney (GB); and James Byron Morris, Raleigh, NC (US)
Assigned to SAS INSTITUTE INC., Cary, NC (US)
Filed by SAS INSTITUTE INC., Cary, NC (US)
Filed on Jul. 25, 2024, as Appl. No. 18/783,592.
Application 18/783,592 is a continuation of application No. 18/777,760, filed on Jul. 19, 2024.
Claims priority of provisional application 63/529,621, filed on Jul. 28, 2023.
Prior Publication US 2025/0036968 A1, Jan. 30, 2025
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 5/01 (2023.01)
CPC G06N 5/01 (2023.01)
28 Claims
 
1. A computer-program product tangibly embodied in a non-transitory machine-readable storage medium, the computer-program product including system instructions operable to cause a computing device to:
obtain a first data set associated with a plurality of nodes to generate one or more sets of networks;
train a first model on the first data set using a first graph to predict relevant links between the plurality of nodes by executing operations comprising:
determine one or more features for one or more links between the plurality of nodes;
determine a target variable indicator for the one or more links between the plurality of nodes using the first graph by executing operations comprising:
determine a set of subgraphs from the first graph;
determine whether each of the one or more links between each node of the plurality of nodes connect within a single subgraph of the set of subgraphs from the first graph;
based on the determination of whether each of the one or more links between each node of the plurality of nodes connect within the single subgraph of the set of subgraphs from the first graph, label the one or more links as intra-community links in the single subgraph of the set of subgraphs from the first graph;
determine whether each of the one or more links between each node of the plurality of nodes connect between at least two subgraphs of the set of subgraphs from the first graph;
based on the determination of whether each of the one or more links between each node of the plurality of nodes connect between the at least two subgraphs of the set of subgraphs from the first graph, label the one or more links as inter-community links in the at least two subgraphs of the set of subgraphs from the first graph;
output the labeled one or more links as the intra-community links in the single subgraph of the set of subgraphs from the first graph; and
output the labeled one or more links as the inter-community links in the at least two subgraphs of the set of subgraphs from the first graph; and
based on the determination of the one or more features and the determination of the target variable indicator for the one or more links between the plurality of nodes using the first graph, train the first model to predict the relevant links of the one or more links between the plurality of nodes, wherein the relevant links comprise the intra-community links;
obtain the first data set or a second data set associated with the plurality of nodes;
for each node of the plurality of nodes from the first data set or the second data set, execute operations comprising:
determine the one or more features for the one or more links between the plurality of nodes;
based on the determination of the one or more features for the one or more links between the plurality of nodes, apply the trained first model to the one or more links between the plurality of nodes;
based on the application of the trained first model to the one or more links between the plurality of nodes, output the relevant links and non-relevant links of the one or more links between the plurality of nodes and output a trained model variable, wherein the non-relevant links comprise the inter-community links;
based on the output of the relevant links and the non-relevant links of the one or more links between the plurality of nodes, remove the non-relevant links of the one or more links between the plurality of nodes;
based on the output of the trained model variable, optimize the application of the trained first model to the one or more links between the plurality of nodes by automatically computing a first threshold for the trained model variable for one or more factors; and
based on the removal of the non-relevant links of the one or more links between the plurality of nodes, connect each node of the plurality of nodes with the relevant links to generate one or more first sets of networks; and
output the one or more first sets of generated networks in a first graphical user interface, as a first input to an automated analytical process, or as a first input to an investigative system.
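
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several parts are assumptions made only for illustration: the subgraphs are taken as already known through a hypothetical `community_of` mapping (the claim does not name a community detection method), each link carries a single made-up feature (a count of shared attributes), a one-feature logistic regression stands in for the unspecified "first model", and the threshold is chosen by maximizing accuracy on the labeled links, which is only one possible reading of "automatically computing a first threshold".

```python
import math
from collections import defaultdict

def label_links(edges, community_of):
    """Target variable: 1 = intra-community link, 0 = inter-community link."""
    return [1 if community_of[u] == community_of[v] else 0 for u, v in edges]

def train_logistic(features, labels, lr=0.1, epochs=5000):
    """Fit a one-feature logistic model with plain gradient descent (assumed model)."""
    w, b, n = 0.0, 0.0, len(features)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(features, labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def score(model, x):
    """Model output for one link; plays the role of the 'trained model variable'."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def pick_threshold(scores, labels):
    """Assumed criterion: the cut-off that maximizes accuracy on labeled links."""
    best_t, best_acc = 0.5, -1.0
    for t in sorted(set(scores)):
        acc = sum((s >= t) == bool(y) for s, y in zip(scores, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def build_networks(nodes, edges):
    """Connect nodes through the kept links; each connected component is a network."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, networks = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adj[node] - component)
        seen |= component
        networks.append(component)
    return networks

# Hypothetical toy data: two communities joined by one weak link (c, d).
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("c", "d")]
community_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
shared_attrs = [3, 4, 3, 4, 3, 1]  # made-up feature, one value per link

# Training phase: features plus intra/inter-community labels.
labels = label_links(edges, community_of)
model = train_logistic(shared_attrs, labels)

# Application phase: score links, compute a threshold, drop non-relevant links.
scores = [score(model, x) for x in shared_attrs]
threshold = pick_threshold(scores, labels)
kept = [e for e, s in zip(edges, scores) if s >= threshold]
networks = build_networks(nodes, kept)
print(networks)
```

In this toy run the model is applied to the same data it was trained on, which corresponds to the "first data set" branch of the claim. A second data set would reuse `model` and `threshold` with freshly computed features.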