US 11,921,755 B2
Data clustering using analysis of multiple encoding techniques
Aaron K. Baughman, Cary, NC (US); Kavitha Hassan Yogaraj, Bangalore (IN); Sudeep Ghosh, Bengaluru (IN); and Shikhar Kwatra, San Jose, CA (US)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Oct. 20, 2021, as Appl. No. 17/506,315.
Prior Publication US 2023/0123240 A1, Apr. 20, 2023
Int. Cl. G06F 16/28 (2019.01); G06F 16/22 (2019.01); G06F 18/21 (2023.01); G06F 18/211 (2023.01); G06F 18/231 (2023.01); G06N 10/00 (2022.01)
CPC G06F 16/285 (2019.01) [G06F 16/2228 (2019.01); G06F 16/288 (2019.01); G06F 18/211 (2023.01); G06F 18/217 (2023.01); G06F 18/231 (2023.01); G06N 10/00 (2019.01)]
18 Claims
 
1. A method for hybrid classical-quantum clustering, the method comprising:
building a hierarchical data structure using a hybrid hierarchical clustering process, wherein the hierarchical data structure comprises a plurality of objects that span a plurality of levels from a lowest level of single-object clusters to a highest level comprising a final cluster of the clusters, wherein the hybrid hierarchical clustering process comprises an iteration of a level-building process comprising:
building, by a classical processor, a first parent level of a current uppermost level of the hierarchical data structure by clustering classically-encoded clusters of the current uppermost level,
wherein, upon completion of the first parent level, the first parent level becomes the current uppermost level of the hierarchical data structure;
identifying, by a quantum processor, a set of candidate clustering options for clustering quantum-encoded clusters of the current uppermost level for a second parent level,
wherein the identifying comprises forming each of the set of candidate clustering options by encoding, in parallel, data of the current uppermost level using respective different quantum encoding spaces,
wherein the forming each of the set of candidate clustering options comprises forming a first candidate clustering option in a basis encoding space, forming a second candidate clustering option in an amplitude encoding space, forming a third candidate clustering option in an angle encoding space, and forming a fourth candidate clustering option in a higher-order encoding space,
wherein the encoding of the data using respective different quantum encoding spaces comprises encoding the data as at least one of a superposition of a quantum space, a rotation of a quantum space, and a probability amplitude value of a quantum system wavefunction; and
building, by the classical processor, the second parent level based on a subset of the candidate clustering options;
wherein, upon completion of the second parent level, the second parent level becomes the current uppermost level of the hierarchical data structure;
determining, by the classical processor, whether to perform another iteration of at least a portion of the level-building process based at least in part on a comparison of the hierarchical data structure to an exit criterion; and
upon a determination to perform another iteration of at least a portion of the level-building process, performing another iteration of at least a portion of the level-building process.
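
The level-building iteration recited in claim 1 can be illustrated with a minimal, purely classical Python sketch. It is not the patented implementation: no quantum processor is involved, and the four encoding spaces are only imitated with ordinary lists of numbers. All function names are hypothetical. The claim does not define the higher-order encoding, so pairwise feature products stand in for it here as an assumption. The rule for choosing among candidate clustering options (lowest spread of the merged cluster) and the exit criterion (a single cluster remains) are likewise assumptions made for illustration.

```python
import math
from itertools import combinations

# Classical stand-ins for the four encoding spaces (all hypothetical).
def basis_encode(vec, threshold=0.5):
    # Binarize each feature, imitating a computational-basis bit string.
    return [1.0 if x > threshold else 0.0 for x in vec]

def amplitude_encode(vec):
    # Scale the vector to unit length, imitating probability amplitudes.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def angle_encode(vec):
    # Treat each feature as a rotation angle: x -> (cos x, sin x).
    out = []
    for x in vec:
        out.extend((math.cos(x), math.sin(x)))
    return out

def higher_order_encode(vec):
    # Assumption: original features plus pairwise products.
    return list(vec) + [a * b for a, b in combinations(vec, 2)]

ENCODERS = {
    "basis": basis_encode,
    "amplitude": amplitude_encode,
    "angle": angle_encode,
    "higher_order": higher_order_encode,
}

def centroid(cluster):
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))

def closest_pair(clusters, encode):
    # Indices of the two clusters whose encoded centroids are nearest.
    encoded = [encode(centroid(c)) for c in clusters]
    return min(combinations(range(len(clusters)), 2),
               key=lambda ij: math.dist(encoded[ij[0]], encoded[ij[1]]))

def merge(clusters, pair):
    # Build a parent level in which the chosen pair forms one cluster.
    i, j = pair
    rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
    return rest + [clusters[i] + clusters[j]]

def spread(clusters, pair):
    # Sum of squared distances to the centroid of the would-be merged cluster.
    merged = clusters[pair[0]] + clusters[pair[1]]
    c = centroid(merged)
    return sum(math.dist(p, c) ** 2 for p in merged)

def build_hierarchy(points):
    levels = [[[p] for p in points]]          # lowest level: single-object clusters
    while len(levels[-1]) > 1:                # assumed exit criterion
        # First parent level: classical clustering on unencoded centroids.
        levels.append(merge(levels[-1], closest_pair(levels[-1], list)))
        if len(levels[-1]) == 1:
            break
        # Candidate options, one per encoding space (simulated, run sequentially).
        candidates = [closest_pair(levels[-1], enc) for enc in ENCODERS.values()]
        # Second parent level: classical selection among the candidates.
        best = min(candidates, key=lambda pair: spread(levels[-1], pair))
        levels.append(merge(levels[-1], best))
    return levels

if __name__ == "__main__":
    pts = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9), (0.5, 0.5)]
    for depth, level in enumerate(build_hierarchy(pts)):
        print(depth, level)
```

Each pass through the loop adds up to two levels, alternating a classical merge with a merge chosen from the per-encoding candidates, which mirrors the first and second parent levels of the claim. In this sketch the candidates are computed one after another, whereas the claim recites encoding in parallel on a quantum processor.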