CPC G06F 16/1744 (2019.01) [G06F 16/9024 (2019.01); G06F 16/9038 (2019.01); G06F 16/906 (2019.01); G06N 7/00 (2013.01)] | 19 Claims |
1. A computer system for automated estimation of relationships among a plurality of data elements, the system comprising:
a data receiver configured to receive one or more input data sets including the plurality of data elements, the one or more input data sets including a set of data records C;
a classifier engine computer processor configured to:
establish one or more linkage relations among data records in the set of data records C, the one or more linkage relations maintained as an adjacency matrix data structure on a non-transitory computer memory;
extend the one or more linkage relations into equivalence relations, updating the adjacency matrix data structure; and
generate an output data set by transforming the set of data records C using at least the adjacency matrix data structure, the output data set compressed relative to the set of data records C;
wherein the equivalence relations include , and D1 and D2 are defined as subsets of data records in the set of data records C, and the one or more equivalence relations are used to partition the set of data records C into one or more partitions for generating the output data set, the output data set generated by transforming the set of data records C in accordance with the one or more partitions;
wherein the equivalence relations include the relation:
given two sets of data sets 𝒟1 and 𝒟2, a transitive equivalence relation ∼ on 𝒟1, and a set function ƒ: 𝒟1→𝒟2 such that D1∼D2 implies ƒ(D1)=ƒ(D2), then induced functions are provided showing equivalence classes of D1 can be linked to D2;
wherein the equivalence relations include the relation:
Let ∼1 and ∼2 be equivalence relations on X and Y, respectively, and let ƒ: X→Y be a function such that x ∼1 y ⇒ ƒ(x) ∼2 ƒ(y);
where ƒ is defined by ƒ([x])=[ƒ(x)]:
|
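The steps recited in claim 1 (linkage relations held in an adjacency matrix, extension into equivalence relations, partitioning, and a compressed output) can be illustrated with a minimal sketch. The claim does not name an algorithm, so everything below is an assumption made for illustration: the function names (`equivalence_closure`, `partition`, `compress`, `induced`) are hypothetical, Warshall's algorithm is swapped in as one possible way to extend the links, and keeping the first record of each partition is only one possible compressing transformation. The `induced` helper illustrates the condition that a function constant on each equivalence class yields a well-defined function on the classes.

```python
from typing import Callable, Hashable, List, Sequence, Tuple


def equivalence_closure(n: int, links: Sequence[Tuple[int, int]]) -> List[List[bool]]:
    """Extend pairwise links over n records into an equivalence relation.

    Returns an n x n boolean adjacency matrix that is reflexive,
    symmetric, and transitive (Warshall's algorithm; an assumed choice).
    """
    # Reflexive start: every record is related to itself.
    adj = [[i == j for j in range(n)] for i in range(n)]
    # Symmetric: record each link in both directions.
    for i, j in links:
        adj[i][j] = adj[j][i] = True
    # Transitive: if i~k and k~j then i~j.
    for k in range(n):
        for i in range(n):
            if adj[i][k]:
                for j in range(n):
                    if adj[k][j]:
                        adj[i][j] = True
    return adj


def partition(adj: List[List[bool]]) -> List[List[int]]:
    """Read the equivalence classes (partitions) off the closed matrix."""
    seen = set()
    classes = []
    for i in range(len(adj)):
        if i not in seen:
            cls = [j for j in range(len(adj)) if adj[i][j]]
            seen.update(cls)
            classes.append(cls)
    return classes


def compress(records: Sequence[str], classes: List[List[int]]) -> List[str]:
    """Keep one representative per partition (assumed: the first record)."""
    return [records[cls[0]] for cls in classes]


def induced(f: Callable[[str], Hashable], records: Sequence[str],
            classes: List[List[int]]) -> List[Hashable]:
    """Evaluate f on each class; valid only if f is constant on every class."""
    out = []
    for cls in classes:
        values = {f(records[i]) for i in cls}
        if len(values) != 1:
            raise ValueError("f is not constant on an equivalence class")
        out.append(values.pop())
    return out


# Hypothetical records and links, for illustration only.
records = ["Ann Lee", "ann lee", "A. Lee", "Bob Roy", "bob roy"]
links = [(0, 1), (1, 2), (3, 4)]

adj = equivalence_closure(len(records), links)
classes = partition(adj)
compressed = compress(records, classes)
surnames = induced(lambda r: r[-3:].lower(), records, classes)
```

With these sample inputs, records 0 and 2 are never linked directly, yet the closure places them in the same partition through record 1, and the five records reduce to two representatives. A dense matrix costs cubic time in the number of records under this approach; a union-find structure would be a common alternative when the matrix itself is not required.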