US 12,468,969 B2
Methods for correlated histogram clustering for machine learning
Randal Allen, Orlando, FL (US); and Brice Brosig, Denton, TX (US)
Assigned to INCUCOMM, INC., Addison, TX (US)
Filed by Incucomm, Inc., Addison, TX (US)
Filed on Jun. 21, 2022, as Appl. No. 17/808,093.
Claims priority of provisional application 63/202,667, filed on Jun. 21, 2021.
Prior Publication US 2023/0119704 A1, Apr. 20, 2023
Int. Cl. G06N 7/02 (2006.01)
CPC G06N 7/023 (2013.01)
20 Claims
 
1. A method operable with a real system, comprising:
generating correlated histogram clusters from monitored data from sensors of said real system, comprising:
1) generating first n histograms for a first n-dimensional data set D of n dimensions;
2) selecting a subset of said first n-dimensional data set D based on a frequency greater than a threshold to create a second n-dimensional data set D′ of said n dimensions;
3) generating second n histograms for said second n-dimensional data set D′ with optimal bin size;
4) identifying m modes for each dimension of said second n-histograms;
5) for the ith mode of said jth dimension, representing a mode-dimension mij, identifying an index p in said second n-dimensional data set D′ by finding a value in j dimensions of said second n-dimensional data set D′ closest to said mode-dimension mij, and setting a value Ci of a centroid C equal to said mode-dimension mij;
6) identifying an associated mode data value D′pk for another one of k dimensions, identifying a nearest mode from said second n histograms of kth dimension to said associated mode data value D′pk, and assigning a value Ck of said centroid C to said associated mode data value D′pk;
7) repeating step 6 for each of said k dimensions through said n dimensions, k≠i;
8) saving said centroid C and repeating steps 5-7 for all m modes of said jth dimension; and
9) repeating steps 5-8 for each of said j dimensions through said n dimensions to generate said correlated histogram clusters;
generating a refined representation of said real system from an initial representation of said real system using a machine learning system, said correlated histogram clusters being an input to said initial representation and representing a transformed and reduced data set of said monitored data to reduce data processing of said machine learning system; and
controlling an operation of said real system using said refined representation.
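
Steps 1-9 of claim 1 can be illustrated with a minimal Python sketch using NumPy. It is one possible reading of the claim language, not the patented implementation. The function names, the `first_bins` and `second_bins` parameters, and the local-maximum rule for finding modes are hypothetical. NumPy's "auto" bin estimator stands in for the "optimal bin size" of step 3, which the claim does not define. The sketch also assumes that "frequency greater than a threshold" means a sample is kept only if its bin count exceeds the threshold in every dimension, and that the centroid component for dimension k is set to the mode nearest the data value D′pk.

```python
import numpy as np


def select_frequent(D, threshold, bins):
    """Step 2 (assumed reading): keep rows whose bin count exceeds the
    threshold in every dimension."""
    keep = np.ones(len(D), dtype=bool)
    for j in range(D.shape[1]):
        counts, edges = np.histogram(D[:, j], bins=bins)
        # Bin index of each sample; clip so the maximum value falls in the
        # last bin, matching np.histogram's closed right edge.
        idx = np.clip(np.digitize(D[:, j], edges) - 1, 0, len(counts) - 1)
        keep &= counts[idx] > threshold
    return D[keep]


def find_modes(counts, edges):
    """Step 4 (assumed rule): a mode is the center of a bin whose count is a
    local maximum of the histogram."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    padded = np.concatenate(([0], counts, [0]))
    left, mid, right = padded[:-2], padded[1:-1], padded[2:]
    return centers[(mid > left) & (mid >= right)]


def correlated_histogram_clusters(D, threshold, first_bins=10, second_bins="auto"):
    """Return one centroid per (dimension, mode) pair, following steps 1-9."""
    D = np.asarray(D, dtype=float)
    # Steps 1-2: first histograms and frequency-based selection give D'.
    D_prime = select_frequent(D, threshold, first_bins)
    n = D_prime.shape[1]

    # Steps 3-4: second histograms of D' and the modes of each dimension.
    modes = []
    for j in range(n):
        counts, edges = np.histogram(D_prime[:, j], bins=second_bins)
        modes.append(find_modes(counts, edges))

    centroids = []
    for j in range(n):                       # step 9: every dimension j
        for m_ij in modes[j]:                # step 8: every mode of dimension j
            # Step 5: row p of D' whose j-th value is closest to the mode.
            p = np.argmin(np.abs(D_prime[:, j] - m_ij))
            C = np.empty(n)
            C[j] = m_ij
            for k in range(n):               # step 7: every other dimension k
                if k == j:
                    continue
                # Step 6: snap the associated value D'[p, k] to the nearest
                # mode of dimension k.
                d_pk = D_prime[p, k]
                C[k] = modes[k][np.argmin(np.abs(modes[k] - d_pk))]
            centroids.append(C)
    return np.array(centroids).reshape(-1, n)
```

Because the loops visit every mode of every dimension, the same centroid can be produced more than once; `np.unique(centroids, axis=0)` removes the repeats if a distinct set is wanted. The resulting centroids are the reduced data set that the claim then supplies to the machine learning system.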