US 11,720,579 B2
Continuous feature-independent determination of features for deviation analysis
Paul O'Hara, Dublin (IE); Malte Christian Kaufmann, Dublin (IE); Alan McShane, Raheny (IE); Anirban Banerjee, Kilcullen (IE); and Mark Ahern, Dublin (IE)
Assigned to BUSINESS OBJECTS SOFTWARE LTD, Dublin (IE)
Filed by BUSINESS OBJECTS SOFTWARE LTD., Dublin (IE)
Filed on Jul. 6, 2021, as Appl. No. 17/367,882.
Prior Publication US 2023/0010992 A1, Jan. 12, 2023
Int. Cl. G06F 16/2458 (2019.01); G06F 16/28 (2019.01); G06F 16/2457 (2019.01)
CPC G06F 16/2462 (2019.01) [G06F 16/2457 (2019.01); G06F 16/283 (2019.01)]
18 Claims
 
1. A system comprising:
a memory storing processor-executable program code; and
a processing unit to execute the processor-executable program code to cause the system to:
receive data including a plurality of discrete features, each of the plurality of discrete features associated with a plurality of discrete values;
determine, for each of the plurality of discrete features, statistics based on a number of occurrences of each discrete value of the discrete feature in the data and not based on any continuous feature value;
determine first summary statistics based on the statistics determined for each of the plurality of discrete features and not based on any continuous feature value;
determine, for each discrete feature, a dissimilarity between the first summary statistics and the statistics determined for the discrete feature;
determine N candidate discrete features of the plurality of discrete features associated with an N largest determined dissimilarities, the N candidate discrete features comprising less than all of the plurality of discrete features;
determine a first selected continuous feature of the received data;
determine, for each of the candidate discrete features, second summary statistics based on values of the first selected continuous feature associated with each discrete value of the candidate discrete feature;
determine a deviation score for each of the candidate discrete features based on the second summary statistics; and
transmit the candidate discrete features for display in association with the first selected continuous feature based on the determined deviation scores.
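
The two-stage flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not say which statistics, dissimilarity measure, or deviation score are used, so the choices below are assumptions made only for illustration: per-feature statistics are the number of distinct values and the Shannon entropy of the value counts, the first summary statistics are the component-wise mean of those statistics, dissimilarity is Euclidean distance, the second summary statistics are per-value means of the continuous feature, and the deviation score is the largest gap between a per-value mean and the overall mean, in units of the overall standard deviation. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from statistics import mean, pstdev


def count_stats(values):
    """Statistics from value counts only; no continuous feature is consulted.

    Assumed statistics: (number of distinct values, Shannon entropy in bits).
    """
    counts = Counter(values)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return (float(len(counts)), entropy)


def candidate_features(rows, discrete, n):
    """Return the n discrete features whose statistics differ most from the summary."""
    stats = {f: count_stats([r[f] for r in rows]) for f in discrete}
    # First summary statistics (assumption): component-wise mean across features.
    summary = tuple(mean(s[i] for s in stats.values()) for i in range(2))
    # Dissimilarity (assumption): Euclidean distance to the summary.
    dissim = {f: math.dist(s, summary) for f, s in stats.items()}
    return sorted(discrete, key=lambda f: dissim[f], reverse=True)[:n]


def deviation_score(rows, feature, continuous):
    """Score how far per-value means of `continuous` stray from the overall mean."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[feature]].append(r[continuous])
    overall = [r[continuous] for r in rows]
    mu, sigma = mean(overall), pstdev(overall)
    if sigma == 0:
        return 0.0
    # Second summary statistics (assumption): mean of the continuous feature per value.
    return max(abs(mean(v) - mu) for v in groups.values()) / sigma


def rank_for_display(rows, discrete, continuous, n):
    """Pick candidates without the continuous feature, then order them by deviation."""
    candidates = candidate_features(rows, discrete, n)
    scored = [(f, deviation_score(rows, f, continuous)) for f in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because `candidate_features` never reads a continuous value, its result could be computed once and reused when a different continuous feature is selected. Only `deviation_score` would need to be re-run for the N candidates.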