US 12,259,863 B2
Retrieval apparatus, methods, and storage medium
Chao Xie, Redwood City, CA (US); Weizhi Xu, Shanghai (CN); Songlin Wu, Shanghai (CN); Mengzhao Wang, Shanghai (CN); and Xiaomeng Yi, Shanghai (CN)
Assigned to Zilliz Inc., Redwood City, CA (US)
Filed by ZILLIZ INC., Redwood City, CA (US)
Filed on Mar. 29, 2023, as Appl. No. 18/192,027.
Prior Publication US 2024/0330260 A1, Oct. 3, 2024
Int. Cl. G06F 16/00 (2019.01); G06F 16/22 (2019.01); G06F 16/2458 (2019.01)

CPC G06F 16/22 (2019.01) [G06F 16/2462 (2019.01)]

19 Claims

1. A retrieval apparatus comprising:

a memory including a first memory for storing a first graph index and original vectors, and

a second memory for storing product quantization (PQ) compressed vectors of the original vectors, wherein the PQ compressed vectors are obtained by compressing the original vectors through a product quantization (PQ) algorithm; and

a processor configured to:

obtain a retrieval request including a query vector;

according to the query vector, generate and execute a first access request corresponding to the first memory corresponding to the first graph index and index nodes in a candidate pool, wherein the candidate pool is used to store retrieval nodes in a current retrieval, the retrieval nodes are the index nodes corresponding to the original vectors that need to obtain vector data from the first memory, the first graph index is a graph index including all the index nodes and neighbor relationships of the index nodes, and each index node corresponds to one of the original vectors;

obtain and process data corresponding to the first access request from the first memory when there is no redundant data in a previous storage pool, and store first access request results in a result pool, wherein the redundant data includes data, acquired from the first memory, of index nodes that have not been used; and

wherein obtain and process data corresponding to the first access request from the first memory when there is no redundant data in a previous storage pool, comprises: obtaining original vectors of index nodes corresponding to the first access request and adjacency tables of the index nodes from the first memory; calculating Euclidean distances between the original vectors and the query vector;

process the redundant data when there is redundant data in the previous storage pool;

wherein process the redundant data when there is redundant data in the previous storage pool, comprises:

generating and executing a third access request corresponding to the first memory based on the redundant data and the query vector corresponding to index nodes in the previous storage pool;

calculating Euclidean distances between original vectors corresponding to the index nodes in the previous storage pool and the query vector, according to data obtained by the third access request corresponding to the first memory, and storing third access request results and corresponding index nodes in the result pool; and

output data in the result pool when the candidate pool does not include unreachable index nodes.
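
The retrieval flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`euclidean`, `pq_encode`, `graph_search`), the in-memory dictionaries standing in for the first memory, the pre-trained `codebooks` argument, and the `pool_size` stopping rule are all assumptions made for illustration. The sketch leaves out the handling of redundant data in the previous storage pool, and it only shows how PQ codes might be produced, not how they are used during the search.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pq_encode(vector, codebooks):
    """Compress a vector into one centroid id per sub-vector.

    Assumption: codebooks[m] is a pre-trained list of centroids for the
    m-th sub-vector, and len(vector) is divisible by len(codebooks).
    """
    sub_len = len(vector) // len(codebooks)
    code = []
    for m, centroids in enumerate(codebooks):
        sub = vector[m * sub_len:(m + 1) * sub_len]
        nearest = min(range(len(centroids)),
                      key=lambda c: euclidean(sub, centroids[c]))
        code.append(nearest)
    return code


def graph_search(query, vectors, adjacency, entry, k, pool_size):
    """Best-first search over a graph index.

    vectors:   node id -> original vector (stand-in for the first memory)
    adjacency: node id -> list of neighbor node ids (the adjacency tables)
    Returns up to k (distance, node id) pairs, nearest first.
    """
    visited = {entry}
    d0 = euclidean(vectors[entry], query)
    candidate_pool = [(d0, entry)]   # min-heap of nodes still to expand
    result_pool = [(d0, entry)]      # best nodes found so far, kept sorted
    while candidate_pool:
        dist, node = heapq.heappop(candidate_pool)
        # Hypothetical stopping rule: the closest remaining candidate is
        # farther than the worst entry of a full result pool.
        if len(result_pool) >= pool_size and dist > result_pool[-1][0]:
            break
        # Fetch the adjacency table and the neighbors' original vectors,
        # then compute exact Euclidean distances to the query.
        for nb in adjacency[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = euclidean(vectors[nb], query)
            heapq.heappush(candidate_pool, (d, nb))
            result_pool.append((d, nb))
        result_pool.sort()
        del result_pool[pool_size:]
    # Output the result pool once no candidate remains to be expanded.
    return result_pool[:k]
```

In this sketch the candidate pool is a min-heap, so the closest unexpanded node is always processed next, and the result pool is trimmed to `pool_size` entries after each expansion. A real system would read vectors and adjacency tables from storage in batches instead of from Python dictionaries.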