US 12,306,833 B2
Robust query execution plan selection using machine learning with predictive uncertainties
Seyed Mohammad Amin Kamali, Orleans (CA); Calisto Zuzarte, Pickering (CA); Vincent Corvinelli, Mississauga (CA); Brandon Lewis Frendo, Markham (CA); Vasiliki Kantere, Ottawa (CA); and Ning Wang, Ottawa (CA)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Aug. 10, 2023, as Appl. No. 18/447,465.
Claims priority of provisional application 63/510,914, filed on Jun. 29, 2023.
Prior Publication US 2025/0013641 A1, Jan. 9, 2025
Int. Cl. G06F 16/2453 (2019.01); G06F 11/34 (2006.01)
CPC G06F 16/24542 (2019.01) [G06F 11/3409 (2013.01)]
15 Claims
 
1. A computer-implemented method for robust query execution plan selection, the method comprising:
training a model to estimate, for an input comprising a query and one or more plans, an execution time for each plan and a respective uncertainty of the execution time for each plan;
inputting, to the model, a new query and a search space comprising a plurality of candidate plans;
given an estimated distribution for the execution time of each candidate plan characterized by the estimated execution time and respective uncertainty, computing a suboptimality risk for each candidate plan compared to the other candidate plans, wherein suboptimality risk is defined as a ratio of a cost of the respective execution plan to a cost of an optimal execution plan; and
selecting, responsive to an output from the model, a plan of the plurality of candidate plans, wherein the plan is selected according to a plan selection policy, wherein the plan selection policy comprises at least one of:
selecting a plan by assuming that candidate plans have higher costs proportional to an estimated standard deviation of their respective model uncertainty, data uncertainty, or total uncertainty; and
selecting a plan with minimum suboptimality risk using model uncertainty, data uncertainty, or a total uncertainty; and
executing the new query using the selected plan of the plurality of candidate plans rather than an alternative plan having a lower execution time estimate, thereby limiting a maximum suboptimality of the execution time.
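
As an illustration only, the two plan selection policies recited in claim 1 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation. It assumes the trained model has already produced a mean and a standard deviation for each candidate plan's execution time. The names `PlanEstimate`, `risk_weight`, `n_samples`, and all function names are hypothetical. Modeling execution times as independent lognormal variables and summarizing suboptimality risk as a Monte Carlo average of the ratio to the best sampled plan are choices made for this sketch, not details taken from the claim.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEstimate:
    """Hypothetical container for the model output of one candidate plan."""
    name: str
    mean_time: float  # estimated execution time
    std_time: float   # estimated std. deviation (model, data, or total uncertainty)


def select_by_penalized_cost(plans, risk_weight=1.0):
    """First policy (sketch): treat each plan as costlier in proportion to its std."""
    return min(plans, key=lambda p: p.mean_time + risk_weight * p.std_time)


def _sample_time(rng, plan):
    # Assumption: lognormal execution time, moment-matched to (mean, std),
    # which keeps every sampled time strictly positive.
    sigma_sq = math.log1p((plan.std_time / plan.mean_time) ** 2)
    mu = math.log(plan.mean_time) - sigma_sq / 2.0
    return rng.lognormvariate(mu, math.sqrt(sigma_sq))


def expected_suboptimality(plans, n_samples=10_000, seed=0):
    """Monte Carlo estimate, per plan, of E[time(plan) / time(best plan)]."""
    rng = random.Random(seed)
    totals = [0.0] * len(plans)
    for _ in range(n_samples):
        draws = [_sample_time(rng, p) for p in plans]
        best = min(draws)  # cost of the optimal plan in this sampled scenario
        for i, t in enumerate(draws):
            totals[i] += t / best  # suboptimality ratio, always >= 1
    return {p.name: total / n_samples for p, total in zip(plans, totals)}


def select_by_min_risk(plans, n_samples=10_000, seed=0):
    """Second policy (sketch): pick the plan with the smallest estimated risk."""
    risk = expected_suboptimality(plans, n_samples=n_samples, seed=seed)
    return min(plans, key=lambda p: risk[p.name])


# Illustrative inputs only; these numbers are made up for the example.
candidates = [
    PlanEstimate("A", mean_time=10.0, std_time=8.0),   # lowest estimate, high uncertainty
    PlanEstimate("B", mean_time=11.0, std_time=0.5),   # slightly slower, low uncertainty
]
print(select_by_penalized_cost(candidates, risk_weight=1.0).name)
print(expected_suboptimality(candidates))
```

With `risk_weight=1.0`, the first policy compares 10.0 + 8.0 = 18.0 for plan A against 11.0 + 0.5 = 11.5 for plan B and therefore selects B, even though A has the lower execution time estimate. Setting `risk_weight=0.0` reduces the policy to choosing the lowest estimate. The outcome of the second policy depends on the assumed distribution and on how the ratio is summarized (a mean here; a tail quantile would be another option), so it can disagree with the first policy on the same inputs.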