US 12,189,649 B2
Scaling database query processing using additional processing clusters
Ippokratis Pandis, Menlo Park, CA (US); Naresh Chainani, Mountain View, CA (US); Sebastian Hillig, Berlin (DE); Christos Stavrakakis, Berlin (DE); Eric Ray Hotinger, Redmond, WA (US); Bruce William McGaughy, Mercer Island, WA (US); William Michael McCreedy, Berlin (DE); and Yan Leshinsky, Seattle, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Nov. 24, 2021, as Appl. No. 17/535,446.
Prior Publication US 2023/0161792 A1, May 25, 2023
Int. Cl. G06F 16/27 (2019.01); G06F 16/2453 (2019.01)
CPC G06F 16/27 (2019.01) [G06F 16/24542 (2019.01)]
20 Claims
 
1. A system, comprising:
a plurality of computing devices implementing different respective hosts of a database service offered by a provider network, wherein the database service comprises a primary processing cluster to perform database queries to a database hosted by the database service, wherein the primary processing cluster comprises a leader node and a plurality of compute nodes hosted at different ones of the respective hosts;
wherein the leader node is configured to:
receive a database query directed to the database;
determine to use respective additional processing clusters of the database service for individual ones of the compute nodes;
generate a plan to perform the database query at the primary processing cluster, wherein the plan includes one or more operations to instruct individual ones of the compute nodes to use the respective additional processing clusters of the database service to perform the one or more operations and return results of the one or more operations to the individual ones of the compute nodes, wherein the respective additional processing clusters respectively implement a same query processing engine, as is implemented by the primary processing cluster, at an additional leader node and a plurality of additional compute nodes that receive instructions from the additional leader node engine to perform the one or more operations;
respectively instruct the compute nodes to execute the plan to perform the database query, wherein the respective instructions cause individual ones of the compute nodes to send requests to the respective additional processing clusters to perform the one or more operations of the plan; and
return a result of the database query generated based on the performance of the one or more operations of the plan at the respective additional processing clusters.
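
The flow recited in claim 1 can be illustrated with a minimal sketch. It is a toy in-memory model, not the patented implementation: the class names (LeaderNode, ComputeNode, AdditionalCluster), the run_engine function, the dictionary-shaped plan, and the rule "offload whenever additional clusters are attached" are all hypothetical and chosen only for illustration. The claim does not specify how the leader node makes its determination, so that step is a placeholder here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Rows = List[int]


def run_engine(operation: str, rows: Rows) -> int:
    """Stand-in for the 'same query processing engine' used by every cluster.

    Hypothetical: only a SUM aggregate is supported in this toy model.
    """
    if operation != "sum":
        raise ValueError(f"unsupported operation: {operation}")
    return sum(rows)


@dataclass
class AdditionalCluster:
    """An additional processing cluster with its own leader and compute nodes."""
    name: str
    num_compute_nodes: int = 2

    def perform(self, operation: str, rows: Rows) -> int:
        # The additional leader node splits the work across its compute nodes...
        n = self.num_compute_nodes
        slices = [rows[i::n] for i in range(n)]
        partials = [run_engine(operation, s) for s in slices]
        # ...and combines the partial results before returning them.
        return run_engine(operation, partials)


@dataclass
class ComputeNode:
    """A compute node of the primary processing cluster."""
    node_id: int
    rows: Rows

    def execute(self, plan: dict, cluster: Optional[AdditionalCluster]) -> int:
        # If the plan says so, send the operation to this node's additional
        # cluster and take back its result; otherwise run it locally.
        if plan["offload"] and cluster is not None:
            return cluster.perform(plan["operation"], self.rows)
        return run_engine(plan["operation"], self.rows)


@dataclass
class LeaderNode:
    """Leader node of the primary processing cluster."""
    compute_nodes: List[ComputeNode]
    additional_clusters: Dict[int, AdditionalCluster]  # keyed by node_id

    def query(self, operation: str) -> int:
        # Determination step (placeholder rule, an assumption of this sketch).
        offload = bool(self.additional_clusters)
        # Plan generation: one operation, flagged for offload or not.
        plan = {"operation": operation, "offload": offload}
        # Instruct each compute node to execute the plan.
        partials = [
            node.execute(plan, self.additional_clusters.get(node.node_id))
            for node in self.compute_nodes
        ]
        # Return the final result built from the per-node results.
        return run_engine(operation, partials)


nodes = [ComputeNode(0, [1, 2, 3]), ComputeNode(1, [4, 5, 6])]
clusters = {0: AdditionalCluster("extra-0"), 1: AdditionalCluster("extra-1")}
print(LeaderNode(nodes, clusters).query("sum"))  # 21
```

In this sketch each primary compute node is paired with its own additional cluster, mirroring the claim's "respective additional processing clusters ... for individual ones of the compute nodes". The result is the same whether or not the work is offloaded, because both paths go through the same engine function.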