US 12,456,063 B1
Topological order determination in causal graphs
Xilong Chen, Chapel Hill, NC (US); Sylvie Tchumtchoua Kabisa, Morrisville, NC (US); Dillon Frame, Chapel Hill, NC (US); Ming-Chun Chang, Cary, NC (US); Wanxi Gu, Beijing (CN); Gunce Eryuruk Walton, Raleigh, NC (US); David Bruce Elsheimer, Clayton, NC (US); and Chuan Xu, Morrisville, NC (US)
Assigned to SAS Institute Inc., Cary, NC (US)
Filed by SAS Institute Inc., Cary, NC (US)
Filed on Apr. 10, 2025, as Appl. No. 19/175,902.
Application 19/175,902 is a continuation of application No. 18/947,502, filed on Nov. 14, 2024, granted, now 12,314,874.
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 5/04 (2023.01); G06N 20/00 (2019.01)
CPC G06N 5/04 (2013.01) [G06N 20/00 (2019.01)]
27 Claims
 
1. A non-transitory computer-readable medium having computer-readable instructions stored thereon that when executed by a processor cause the processor to:
receive input data from one or more network devices as part of a data analytics project for analyzing the input data, wherein the input data is based upon input streaming data or input non-streaming data received from the one or more network devices, wherein the input data comprises a plurality of observation vectors, each of the plurality of observation vectors comprising variable values of a plurality of variables, and wherein each of the plurality of variables is associated with a unique variable index; and
determine at least one of a cause or effect relationship that each variable of the plurality of variables has with one or more other variables of the plurality of variables by generating, using machine learning, a topological order of a directed acyclic graph (DAG) by:
(A) creating a plurality of residual series vectors, each of the plurality of residual series vectors associated with one variable of the plurality of variables;
(B) calculating a normality statistic value for each of the plurality of residual series vectors to obtain a plurality of normality statistic values;
(C) calculating a mean squared error value for each of the plurality of residual series vectors;
(D) comparing each of the plurality of normality statistic values with a predefined critical value;
(E) for each value of the plurality of normality statistic values that is less than or equal to the predefined critical value, adding (a) the variable index of the variable of the plurality of variables associated with the value to an empty temporary order list; and (b) the mean squared error value of the variable of the plurality of variables associated with the value to an empty mean squared error list;
(F) counting a number of elements in the temporary order list;
(G) responsive to determining that the number of elements in the temporary order list is equal to zero, updating an order list based on the plurality of normality statistic values or responsive to determining that the number of elements in the temporary order list is not equal to zero, updating the order list based on at least one of the temporary order list or the mean squared error list;
(H) repeating (A) through (H) a plurality of times; and
(I) outputting the order list from (G) as the topological order of the DAG, wherein the topological order of the DAG is used to analyze the input data by one or more users as part of the data analytics project and transform, using the processor, the input data into output data that is meaningful for the consumption by a particular user of the one or more users, wherein the topological order of the DAG is indicative of the at least one of the cause or effect relationship that each variable of the plurality of variables has with the one or more other variables of the plurality of variables.
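
Steps (A) through (I) can be illustrated with a minimal sketch. The claim does not say how the residual series are created, which normality statistic is used, or how the order list is updated, so every one of those choices below is an assumption made only for illustration: residuals come from an ordinary least squares regression of each remaining variable on the other remaining variables, the normality statistic is a hand-computed Jarque-Bera value, the critical value defaults to 5.99, and each pass appends exactly one variable index to the order list. The function names (jarque_bera, residual_series, topological_order) are hypothetical.

```python
import numpy as np


def jarque_bera(x):
    """Jarque-Bera statistic (assumed choice of normality statistic)."""
    n = x.size
    c = x - x.mean()
    s2 = np.mean(c ** 2)
    if s2 == 0:
        return 0.0
    skew = np.mean(c ** 3) / s2 ** 1.5
    kurt = np.mean(c ** 4) / s2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def residual_series(data, target, predictors):
    """OLS residuals of one variable regressed on the others (assumed method)."""
    y = data[:, target]
    X = np.column_stack([np.ones(len(y))] + [data[:, p] for p in predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def topological_order(data, critical_value=5.99):
    """Return an ordering of column indices following steps (A) through (I)."""
    remaining = list(range(data.shape[1]))
    order = []
    while remaining:  # (H) repeat until every variable has been placed
        stats, mses = {}, {}
        for v in remaining:
            others = [u for u in remaining if u != v]
            r = residual_series(data, v, others)   # (A) residual series
            stats[v] = jarque_bera(r)              # (B) normality statistic
            mses[v] = float(np.mean(r ** 2))       # (C) mean squared error
        # (D), (E): keep variables whose statistic is at or below the critical value
        temp_order = [v for v in remaining if stats[v] <= critical_value]
        mse_list = [mses[v] for v in temp_order]
        # (F), (G): branch on whether the temporary order list is empty
        if len(temp_order) == 0:
            # Assumed rule: fall back to the smallest normality statistic.
            chosen = min(remaining, key=lambda v: stats[v])
        else:
            # Assumed rule: among the candidates, take the smallest MSE.
            chosen = temp_order[int(np.argmin(mse_list))]
        order.append(chosen)
        remaining.remove(chosen)
    return order  # (I) the order list is the output


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.uniform(-1, 1, 500)
    x1 = 0.8 * x0 + rng.uniform(-1, 1, 500)
    x2 = 0.5 * x1 + rng.uniform(-1, 1, 500)
    print(topological_order(np.column_stack([x0, x1, x2])))
```

The sketch takes the observation vectors as rows of a two-dimensional array and uses column positions as the variable indices. Whether the position of a variable in the returned list corresponds to an earlier or a later place in the causal ordering depends on the update rule, which the claim leaves open.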