US 12,298,974 B2
Apparatus, method and storage medium for database query
Chao Xie, Redwood City, CA (US); Yu Xie, Shanghai (CN); Xiaofan Luan, Shanghai (CN); Rentong Guo, Shanghai (CN); and Xun Huang, Shanghai (CN)
Assigned to ZILLIZ INC., Redwood City, CA (US)
Filed by ZILLIZ INC., Redwood City, CA (US)
Filed on Mar. 29, 2023, as Appl. No. 18/192,381.
Prior Publication US 2024/0330286 A1, Oct. 3, 2024
Int. Cl. G06F 16/2453 (2019.01); G06F 11/34 (2006.01); G06F 16/242 (2019.01)

CPC G06F 16/24539 (2019.01) [G06F 11/3409 (2013.01); G06F 16/2425 (2019.01)]

15 Claims

1. An apparatus comprising:

a memory for storing data in a database; and

a processor configured to:

determine that a syntax tree corresponding to an SQL query statement comprises a preset subtree by parsing the SQL query statement, wherein the preset subtree is an approximate nearest neighbor search (ANNS) subtree used to indicate a query mode for querying vector data, and wherein the preset subtree is separate from a WHERE subtree of the syntax tree corresponding to the SQL query statement;

determine the query mode for querying vector data according to the preset subtree and the SQL query statement by detecting the preset subtree in the SQL query statement, wherein determining the query mode for querying vector data is based on a first query algorithm and query parameters indicated by the SQL query statement, and wherein the query parameters include a number of returned rows and a virtual column; and

query the data in the database via the first query algorithm based on the query mode to determine query results, wherein determining the query results comprises:

determining a first query result set based on sorted data in the database, the number of returned rows and the virtual column, wherein the sorted data in the database is based on sorting the data in the database based on the first query algorithm and a similarity between a query vector in the SQL query statement and the data in the database;

extracting target data from the data in the database in batches based on the first query result set; and

combining the first query result set and the target data to determine a second query result as the query results;

the processor is further configured to:

determine that a syntax tree corresponding to another SQL query statement does not comprise the preset subtree by parsing the other SQL query statement;

determine a second query mode for querying scalar data according to the other SQL query statement; and

query the data in the database via a second query algorithm based on the second query mode to obtain a third query result.
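
The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and everything named in it is an assumption made for illustration: the `Node` tree type, the `ANNS` and `WHERE` node kinds, the attribute names (`query_vector`, `limit`, `virtual_column`), the in-memory `table`, and the use of brute-force Euclidean distance as a stand-in for the "first query algorithm". The virtual column is assumed here to carry the computed distance. SQL parsing is omitted; the syntax trees are built by hand.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical syntax-tree node; not the patent's data structure."""
    kind: str                                   # e.g. "SELECT", "WHERE", "ANNS"
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def find_subtree(root, kind):
    """Depth-first search for the first node of the given kind, or None."""
    if root.kind == kind:
        return root
    for child in root.children:
        found = find_subtree(child, kind)
        if found is not None:
            return found
    return None


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def vector_query(table, anns, batch_size=2):
    """Vector query mode: rank by similarity, then fetch row data in batches."""
    query = anns.attrs["query_vector"]
    # Sort row ids by distance to the query vector (brute force, assumed).
    ranked = sorted((l2_distance(query, row["vec"]), rid)
                    for rid, row in table.items())
    # First query result set: (distance, id) pairs limited to the row count.
    first = ranked[: anns.attrs["limit"]]
    # Extract the target data for those ids in batches.
    ids = [rid for _, rid in first]
    fetched = {}
    for start in range(0, len(ids), batch_size):
        for rid in ids[start:start + batch_size]:
            fetched[rid] = table[rid]["title"]
    # Combine the first result set with the target data.
    vcol = anns.attrs["virtual_column"]
    return [{"id": rid, vcol: dist, "title": fetched[rid]}
            for dist, rid in first]


def scalar_query(table, where):
    """Scalar query mode: a plain equality filter (assumed second algorithm)."""
    col, val = where.attrs["column"], where.attrs["equals"]
    return [{"id": rid, "title": row["title"]}
            for rid, row in sorted(table.items()) if row[col] == val]


def execute(tree, table):
    """Choose the query mode by checking for an ANNS subtree."""
    anns = find_subtree(tree, "ANNS")
    if anns is not None:
        return vector_query(table, anns)
    return scalar_query(table, find_subtree(tree, "WHERE"))


# Toy data and hand-built trees (all hypothetical).
table = {
    1: {"vec": [0.0, 0.0], "title": "a", "year": 2020},
    2: {"vec": [1.0, 0.0], "title": "b", "year": 2021},
    3: {"vec": [3.0, 4.0], "title": "c", "year": 2020},
    4: {"vec": [0.0, 2.0], "title": "d", "year": 2021},
}
vector_tree = Node("SELECT", children=[
    Node("ANNS", {"query_vector": [0.0, 0.0], "limit": 3,
                  "virtual_column": "distance"}),
])
scalar_tree = Node("SELECT", children=[
    Node("WHERE", {"column": "year", "equals": 2020}),
])

print(execute(vector_tree, table))
print(execute(scalar_tree, table))
```

In this sketch the ANNS node is a sibling of any WHERE node rather than a child of it, mirroring the claim's requirement that the preset subtree be separate from the WHERE subtree. A statement whose tree lacks the ANNS node falls through to the scalar path.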