US 12,229,134 B2
System and method for efficient query processing
Venkatesh S. Gopal, Overland Park, KS (US); Brajesh Pandey, Shrewsbury, MA (US); and Nadiya Kochura, Bolton, MA (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Apr. 27, 2023, as Appl. No. 18/307,935.
Prior Publication US 2024/0362220 A1, Oct. 31, 2024
Int. Cl. G06F 16/24 (2019.01); G06F 11/34 (2006.01); G06F 16/22 (2019.01); G06F 16/2453 (2019.01); G06F 16/27 (2019.01)
CPC G06F 16/24542 (2019.01) [G06F 11/3409 (2013.01); G06F 16/2282 (2019.01); G06F 16/278 (2019.01)]
25 Claims
 
1. A method comprising:
receiving a query to be executed over a plurality of data records, wherein the plurality of data records comprises a first set of data records stored in a first database distributed across a plurality of partitions and a second set of data records stored in a second database, wherein the second database comprises a different database architecture from the first database;
generating an input vector based at least in part on values for a plurality of features based on the query, the first database, and the second database;
processing the input vector using a machine learning (ML) model to predict a cost for executing the query if one or more data records in the second set of data records are loaded to a first partition of the plurality of partitions of the first database, wherein the ML model is trained to predict the cost of the query, comprising:
generating one or more output query costs by using a plurality of training features as inputs; and
adjusting one or more parameters of the ML model to reduce a difference between the one or more output query costs and one or more historical query costs; and
selecting a plan for loading the one or more data records from the second set of data records to the first partition based on the predicted cost for executing the query.
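
The training and selection steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not specify a model type, so a plain linear model fitted by stochastic gradient descent is assumed here. The two-element feature vectors, the toy historical costs, the plan names, and the function names (train_cost_model, predict_cost, select_plan) are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# (training features, historical query cost); both are made-up toy values.
Sample = Tuple[Sequence[float], float]


def predict_cost(weights: Sequence[float], bias: float,
                 features: Sequence[float]) -> float:
    """Output query cost for one input vector (assumed linear model)."""
    return bias + sum(w * x for w, x in zip(weights, features))


def train_cost_model(samples: List[Sample], epochs: int = 2000,
                     lr: float = 0.1) -> Tuple[List[float], float]:
    """Adjust parameters to reduce the difference between output query
    costs and historical query costs (squared-error gradient steps)."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, historical_cost in samples:
            # Difference between the model's output cost and the recorded cost.
            error = predict_cost(weights, bias, features) - historical_cost
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def select_plan(weights: Sequence[float], bias: float,
                candidate_plans: Dict[str, Sequence[float]]) -> str:
    """Pick the loading plan whose input vector has the lowest predicted cost."""
    return min(candidate_plans,
               key=lambda name: predict_cost(weights, bias, candidate_plans[name]))


# Hypothetical features: (share of rows scanned in the first partition,
# share of rows fetched from the second database at query time).
HISTORY: List[Sample] = [
    ((0.0, 0.0), 1.0),
    ((1.0, 0.0), 3.0),
    ((0.0, 1.0), 4.0),
    ((1.0, 1.0), 6.0),
    ((0.5, 0.5), 3.5),
]

weights, bias = train_cost_model(HISTORY)

# Each candidate plan yields its own input vector for the same query.
plans = {
    "load_none": (0.2, 0.9),         # leave records in the second database
    "load_hot_records": (0.6, 0.1),  # load some records to the first partition
}
best = select_plan(weights, bias, plans)
print(best, {name: round(predict_cost(weights, bias, f), 3)
             for name, f in plans.items()})
```

In this toy setup the cross-database fetch feature carries the larger learned weight, so the plan that loads records into the first partition ends up with the lower predicted cost. A real system would derive the input vector from the query, the first database, and the second database as the claim recites, and could use any trainable model in place of the linear one assumed here.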