US 12,346,321 B2
Method and apparatus for querying similar vectors in a candidate vector set
Song Xu, Shanghai (CN); Lanxin Zhang, Shanghai (CN); Yufeng Qu, Shanghai (CN); and Chunyi Li, Shanghai (CN)
Assigned to MONTAGE TECHNOLOGY CO., LTD., Shanghai (CN)
Filed by MONTAGE TECHNOLOGY CO., LTD., Shanghai (CN)
Filed on Apr. 12, 2022, as Appl. No. 17/718,983.
Claims priority of application No. 202110393852.1 (CN), filed on Apr. 13, 2021.
Prior Publication US 2022/0327128 A1, Oct. 13, 2022
Int. Cl. G06F 16/2453 (2019.01)
CPC G06F 16/24542 (2019.01) [G06F 16/24539 (2019.01)]
8 Claims
OG exemplary drawing
 
1. A method for querying in a candidate vector set candidate vectors similar to object vectors, wherein the candidate vector set comprises a plurality of candidate vectors each being quantized as having a central vector portion and a residual vector portion, and the candidate vector set comprises a plurality of candidate vector subsets, the method comprising:
acquiring a set of object vectors;
querying, for each object vector of the set of object vectors, a first number of candidate vector subsets that are closest to the object vector;
generating and storing a plurality of common calculation results in a cache based on a set of central vector portions and a set of residual vector portions of candidate vectors of the first number of candidate vector subsets;
generating and storing pre-calculation results as a pre-calculation result table in the form of a look-up table based on the set of object vectors and the set of residual vector portions; and
determining, for each object vector of the set of object vectors, a second number of candidate vectors that are similar to the object vector among the candidate vectors in the corresponding first number of candidate vector subsets based on the stored pre-calculation results and common calculation results,
wherein generating the plurality of common calculation results is performed offline and generating the pre-calculation results is performed online;
wherein each of the central vector portions, the residual vector portions and the object vectors are divided into M segments using an IVF-PQ algorithm, wherein M is a natural number greater than 1, and a common calculation result is expressed as: (xq−Ci)^2+(pq_centroids(k,l))^2+2*(Ci|pq_centroids(k,l)), wherein xq denotes an object vector, and pq_centroids(k,l) denotes a residual vector portion, l denotes an l-th segment of a residual vector portion, l=1 . . . M, and k denotes a quantized value of the l-th segment of a residual vector portion; Ci denotes the i-th central vector portion associated with a candidate vector, and i is a natural number; and
wherein a pre-calculation result is expressed as: −2(xq|pq_centroids(k,l)).
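
The two expressions recited in claim 1 can be read as the terms of the expanded squared distance between an object vector xq and a candidate vector reconstructed as Ci plus its residual codewords. The following Python sketch is illustrative only and is not taken from the patent. The function names, the toy sizes D, M and K, and the random data are all hypothetical. The sketch also assumes that (xq−Ci)^2 is evaluated once per object vector and subset and kept apart from the query-independent table, whereas the claim groups that term into the common calculation result. The codebook is indexed as pq_centroids[l][k], the reverse of the claim's pq_centroids(k,l).

```python
import random

def dot(a, b):
    """Inner product, written (a|b) in the claim."""
    return sum(x * y for x, y in zip(a, b))

def split(v, m):
    """Split v into m equal segments (assumes len(v) is divisible by m)."""
    step = len(v) // m
    return [v[j * step:(j + 1) * step] for j in range(m)]

def common_table(c_i, pq_centroids):
    """Query-independent terms per (l, k): r^2 + 2*(Ci|r). Can be built offline."""
    c_segs = split(c_i, len(pq_centroids))
    return [[dot(r, r) + 2.0 * dot(c_seg, r) for r in book]
            for c_seg, book in zip(c_segs, pq_centroids)]

def precalc_table(xq, pq_centroids):
    """Query-dependent terms per (l, k): -2*(xq|r). Built online per object vector."""
    x_segs = split(xq, len(pq_centroids))
    return [[-2.0 * dot(x_seg, r) for r in book]
            for x_seg, book in zip(x_segs, pq_centroids)]

def lut_distance(xq, c_i, codes, common, precalc):
    """Squared distance from table look-ups plus the coarse term (xq - Ci)^2."""
    coarse = sum((x - c) ** 2 for x, c in zip(xq, c_i))
    return coarse + sum(common[l][k] + precalc[l][k] for l, k in enumerate(codes))

def brute_distance(xq, c_i, codes, pq_centroids):
    """Reference value: reconstruct the residual and compute |xq - Ci - r|^2."""
    residual = [v for l, k in enumerate(codes) for v in pq_centroids[l][k]]
    return sum((x - c - r) ** 2 for x, c, r in zip(xq, c_i, residual))

# Hypothetical toy sizes: D dimensions, M segments, K codewords per segment.
random.seed(0)
D, M, K = 8, 4, 16
pq_centroids = [[[random.uniform(-1, 1) for _ in range(D // M)]
                 for _ in range(K)] for _ in range(M)]
c_i = [random.uniform(-1, 1) for _ in range(D)]    # one central vector portion
xq = [random.uniform(-1, 1) for _ in range(D)]     # one object vector
codes = [random.randrange(K) for _ in range(M)]    # quantized values k, one per segment

common = common_table(c_i, pq_centroids)
precalc = precalc_table(xq, pq_centroids)
d_lut = lut_distance(xq, c_i, codes, common, precalc)
d_ref = brute_distance(xq, c_i, codes, pq_centroids)
assert abs(d_lut - d_ref) < 1e-9
```

Under these assumptions, scoring one candidate costs M look-ups in each table instead of a full D-dimensional subtraction and squaring, which is the practical motivation for caching both tables.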