CPC G06F 16/285 (2019.01) [G06F 16/355 (2019.01); G06F 16/951 (2019.01)] | 18 Claims |
1. A method, said method comprising:
finding, by a processor of a computer system, an alignment between source centroids of a source domain and target centroids of a target domain over a cross-domain similarity graph, said cross-domain similarity graph comprising a first set of vertices corresponding to the source centroids and a second set of vertices corresponding to the target centroids, wherein a weight is assigned to each edge between one of the vertices in the first set of vertices corresponding to the source centroids and one of the vertices in the second set of vertices corresponding to the target centroids, said finding the alignment comprising determining a subset of the edges having a maximum sum of the weights as compared with a sum of the weights for all other subsets of the edges, subject to each vertex connected by the edges in the subset being spanned by at most one edge in the subset;
said processor calculating target clusterability as an average of a respective clusterability of at least one target data item comprised by the target domain;
said processor calculating target-side matchability as an average of a respective matchability of each target centroid of the target domain to source centroids of the source domain, wherein the source domain comprises at least one source data item;
said processor calculating source-side matchability as an average of a respective matchability of each source centroid of said source centroids to the target centroids;
said processor calculating source-target pair matchability as an average of the target-side matchability and the source-side matchability; and
said processor calculating cross-domain clusterability between the target domain and the source domain as a linear combination of the calculated target clusterability and the calculated source-target pair matchability, said cross-domain clusterability providing an improved clustering over conventional data clustering due to the alignment between the source centroids and the target centroids; and
said processor transferring the calculated cross-domain clusterability over a network to a first device selected from the group consisting of an output device of a computer system, a storage device of the computer system, a remote computer system coupled to the computer system, and a combination thereof.
|
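The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The function names, the `None` convention for a missing edge, and the weighting parameter `alpha` are assumptions introduced here for illustration; the claim states only that the final score is "a linear combination" and does not specify coefficients. The alignment is computed by exhaustive search with memoization, which is practical only for a small number of centroids.

```python
from functools import lru_cache


def max_weight_alignment(weights):
    """Maximum-weight matching on a bipartite centroid graph.

    weights[i][j] is the edge weight between source centroid i and
    target centroid j; None marks a missing edge (an assumed convention).
    Each vertex is spanned by at most one selected edge.
    Returns (total_weight, [(source_index, target_index), ...]).
    """
    n_src = len(weights)
    n_tgt = len(weights[0]) if weights else 0

    @lru_cache(maxsize=None)
    def best(i, used):
        # `used` is a bitmask of target centroids already matched.
        if i == n_src:
            return 0.0, ()
        # Option 1: leave source centroid i unmatched.
        top = best(i + 1, used)
        # Option 2: match source centroid i to a free target centroid j.
        for j in range(n_tgt):
            w = weights[i][j]
            if w is None or used & (1 << j):
                continue
            sub_total, sub_pairs = best(i + 1, used | (1 << j))
            candidate = (w + sub_total, ((i, j),) + sub_pairs)
            if candidate[0] > top[0]:
                top = candidate
        return top

    total, pairs = best(0, 0)
    return total, list(pairs)


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


def cross_domain_clusterability(item_clusterability,
                                target_matchability,
                                source_matchability,
                                alpha=0.5):
    """Combine the averaged scores as recited in the claim.

    alpha is a hypothetical weight; the claim does not fix the coefficients
    of the linear combination.
    """
    target_clusterability = mean(item_clusterability)
    target_side = mean(target_matchability)
    source_side = mean(source_matchability)
    pair_matchability = (target_side + source_side) / 2
    return alpha * target_clusterability + (1 - alpha) * pair_matchability
```

For example, with weights `[[0.9, 0.8], [0.85, 0.1]]` the search selects the edges (0, 1) and (1, 0) rather than the single heaviest edge (0, 0), because the pair has the larger total weight. For larger graphs a polynomial-time assignment algorithm would be the usual substitute for this exhaustive search.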