US 11,971,793 B2
Machine learning model-based dynamic prediction of estimated query execution time taking into account other, concurrently executing queries
Yuanzhe Bei, Cambridge, MA (US); and Zhihao Zheng, Cambridge, MA (US)
Assigned to Micro Focus LLC, Santa Clara, CA (US)
Filed by ENTIT Software LLC, Sanford, NC (US)
Filed on Mar. 5, 2019, as Appl. No. 16/292,990.
Prior Publication US 2020/0285642 A1, Sep. 10, 2020
Int. Cl. G06F 16/00 (2019.01); G06F 7/00 (2006.01); G06F 11/30 (2006.01); G06F 16/2453 (2019.01); G06F 16/901 (2019.01)
CPC G06F 11/3006 (2013.01) [G06F 16/24542 (2019.01); G06F 16/9027 (2019.01)]
20 Claims
 
1. A non-transitory computer-readable data storage medium storing program code executable by a computing system on which a database management system (DBMS) is running to:
monitor how many queries are concurrently being executed against a database by the DBMS, to maintain a count of the queries concurrently being executed;
monitor current physical resources utilization of the computing system as a whole and not on a per-query basis, such that the current physical resources utilization reflects all activity of the computing system, including the queries concurrently being executed as well as other activity of the computing system;
generate a query plan for a received query to be executed against the database of the DBMS, the query plan comprising a hierarchical tree of a plurality of operators that are executable in a bottom-up manner to execute the received query, wherein during generation of the query plan query-based statistics for the received query are generated for each operator of the plurality of operators of the hierarchical tree of the query plan with respect to the received query in isolation and without consideration of the queries concurrently being executed;
provide an input vector to a machine-learning model, the input vector including each of only three types of input features:
the current physical resources utilization of the computing system as a whole and not on a per-query basis, as a first type of input feature;
the count of the queries concurrently being executed, as a second type of input feature; and
the query-based statistics for each operator of the query plan and generated during generation of the query plan, as a third type of input feature;
receive as output from the machine-learning model an estimated execution time of the received query, the machine-learning model using each of the three types of input features included in the input vector provided to the machine-learning model to dynamically predict the estimated execution time; and
execute the received query against the database, by executing the operators of the query plan, based on the estimated execution time for the received query.
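
For illustration only, the data flow recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The names `Operator`, `bottom_up`, `build_input_vector`, `toy_model`, and `choose_pool` are hypothetical. The per-operator statistics (`est_rows`, `est_cost`), the two utilization figures, the zero-padding to a fixed operator count, the stand-in scoring function used in place of a trained machine-learning model, and the threshold-based choice of an execution pool are all assumptions that the claim does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Operator:
    """One node of a hypothetical query-plan tree."""
    name: str
    est_rows: float  # assumed statistic, estimated for the query in isolation
    est_cost: float  # assumed statistic, estimated for the query in isolation
    children: List["Operator"] = field(default_factory=list)


def bottom_up(op: Operator) -> List[Operator]:
    """Post-order traversal: children are listed before their parent."""
    ordered: List[Operator] = []
    for child in op.children:
        ordered.extend(bottom_up(child))
    ordered.append(op)
    return ordered


def build_input_vector(utilization: Sequence[float],
                       concurrent_count: int,
                       plan: Operator,
                       max_operators: int = 4) -> List[float]:
    """Concatenate the three feature types named in the claim.

    Zero-padding to max_operators is an assumption made here so that
    plans of different sizes yield vectors of the same length.
    """
    ops = bottom_up(plan)
    if len(ops) > max_operators:
        raise ValueError("plan has more operators than max_operators")
    vector = [float(u) for u in utilization]   # type 1: system-wide utilization
    vector.append(float(concurrent_count))     # type 2: concurrent query count
    for op in ops:                             # type 3: per-operator statistics
        vector.extend([op.est_rows, op.est_cost])
    vector.extend([0.0] * (2 * (max_operators - len(ops))))
    return vector


def toy_model(vector: Sequence[float], n_util: int = 2) -> float:
    """Stand-in for a trained model; returns an estimated time in seconds.

    The formula is arbitrary and only shows that all three feature
    types contribute to the prediction.
    """
    util = vector[:n_util]
    concurrent = vector[n_util]
    op_stats = vector[n_util + 1:]
    base = sum(op_stats[1::2])  # sum of the est_cost entries
    return base * (1.0 + concurrent) * (1.0 + max(util))


def choose_pool(estimated_seconds: float, threshold: float = 10.0) -> str:
    """Hypothetical way to act on the estimate: pick an execution pool."""
    return "short" if estimated_seconds < threshold else "long"


# Usage: a two-table join observed while three other queries are running.
plan = Operator("join", 100.0, 2.0, [
    Operator("scan_a", 1000.0, 1.0),
    Operator("scan_b", 500.0, 0.5),
])
vector = build_input_vector((0.5, 0.25), 3, plan)  # (CPU, memory) fractions
estimate = toy_model(vector)
pool = choose_pool(estimate)
```

In this sketch the utilization figures and the concurrent-query count are passed in as already-monitored values; how they are sampled, and how the estimate influences execution, are left open by the claim and are fixed here only to make the example runnable.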