| CPC G06N 7/023 (2013.01) | 20 Claims |

|
1. A method operable with a real system, comprising:
generating correlated histogram clusters from monitored data from sensors of said real system, comprising:
1) generating first n histograms for a first n-dimensional data set D of n dimensions;
2) selecting a subset of said first n-dimensional data set D based on a frequency greater than a threshold to create a second n-dimensional data set D′ of said n dimensions;
3) generating second n histograms for said second n-dimensional data set D′ with optimal bin size;
4) identifying m modes for each dimension of said second n histograms;
5) for the ith mode of said jth dimension, representing a mode-dimension mij, identifying an index p in said second n-dimensional data set D′ by finding a value in j dimensions of said second n-dimensional data set D′ closest to said mode-dimension mij, and setting a value Ci of a centroid C equal to said mode-dimension mij;
6) identifying an associated mode data value D′pk for another one of k dimensions, identifying a nearest mode from said second n histograms of kth dimension to said associated mode data value D′pk, and assigning a value Ck of said centroid C to said associated mode data value D′pk;
7) repeating step 6 for each of said k dimensions through said n dimensions, k≠j;
8) saving said centroid C and repeating steps 5-7 for all m modes of said jth dimension; and
9) repeating steps 5-8 for each of said j dimensions through said n dimensions to generate said correlated histogram clusters;
generating a refined representation of said real system from an initial representation of said real system using a machine learning system, said correlated histogram clusters being an input to said initial representation and representing a transformed and reduced data set of said monitored data to reduce data processing of said machine learning system; and
controlling an operation of said real system using said refined representation.
|
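As an illustration only, steps 1) through 9) of claim 1 can be sketched in Python with NumPy. This is one possible reading, not the claimed implementation, and it rests on several assumptions that the claim does not state: the function names `find_modes` and `correlated_histogram_clusters` and the parameter `coarse_bins` are hypothetical; the "frequency greater than a threshold" test is read as keeping a sample only when its histogram bin count exceeds the threshold in every dimension; the "optimal bin size" is stood in for by NumPy's Freedman-Diaconis rule (`bins="fd"`); a mode is taken to be a local maximum of the histogram; and the centroid value for dimension k is taken to be the mode of dimension k nearest to the data value D′pk. Removing duplicate centroids at the end is an added convenience.

```python
import numpy as np


def find_modes(values, bins):
    """Return the bin centers of the local maxima of a 1-D histogram."""
    counts, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    padded = np.concatenate(([0], counts, [0]))
    # A peak is strictly higher than its left neighbor and at least as high
    # as its right neighbor, so a flat plateau yields a single mode.
    is_peak = (padded[1:-1] > padded[:-2]) & (padded[1:-1] >= padded[2:])
    return centers[is_peak]


def correlated_histogram_clusters(D, threshold, coarse_bins=10):
    """Sketch of steps 1-9; returns an array of centroids, one per row."""
    D = np.asarray(D, dtype=float)
    n = D.shape[1]

    # Steps 1-2: histogram each dimension of D and keep only the samples that
    # fall in bins whose count exceeds the threshold (assumed: in all dimensions).
    keep = np.ones(len(D), dtype=bool)
    for j in range(n):
        counts, edges = np.histogram(D[:, j], bins=coarse_bins)
        bin_idx = np.digitize(D[:, j], edges[1:-1])  # values in 0..coarse_bins-1
        keep &= counts[bin_idx] > threshold
    D_prime = D[keep]
    if len(D_prime) == 0:
        return np.empty((0, n))

    # Steps 3-4: re-histogram D' (Freedman-Diaconis rule as a stand-in for the
    # "optimal bin size") and find the modes of every dimension.
    modes = [find_modes(D_prime[:, j], bins="fd") for j in range(n)]

    centroids = []
    for j in range(n):                       # step 9: every dimension j
        for m_ij in modes[j]:                # step 8: every mode of dimension j
            # Step 5: index p of the sample closest to the mode in dimension j.
            p = np.argmin(np.abs(D_prime[:, j] - m_ij))
            C = np.empty(n)
            C[j] = m_ij
            for k in range(n):               # steps 6-7: every other dimension k
                if k == j:
                    continue
                # Assumed reading: snap D'[p, k] to the nearest mode of dimension k.
                C[k] = modes[k][np.argmin(np.abs(modes[k] - D_prime[p, k]))]
            centroids.append(C)

    # Added convenience (not in the claim): drop duplicate centroids.
    return np.unique(np.round(np.array(centroids), 6), axis=0)
```

Under these assumptions, calling `correlated_histogram_clusters(samples, threshold=5)` on an array of shape (number of samples, n) returns a much smaller array of centroids. That reduced array corresponds to the transformed and reduced data set that the claim supplies as input to the initial representation.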