US 12,118,009 B2
Supporting query languages through distributed execution of query engines
Arindam Bhattacharjee, Fremont, CA (US); Sourav Pal, Foster City, CA (US); and Timothy Tully, Menlo Park, CA (US)
Assigned to Splunk Inc., San Francisco, CA (US)
Filed by Splunk Inc., San Francisco, CA (US)
Filed on Oct. 18, 2019, as Appl. No. 16/657,916.
Application 16/657,916 is a continuation in part of application No. 16/398,038, filed on Apr. 29, 2019, granted, now 11,580,107.
Application 16/398,038 is a continuation in part of application No. 16/147,165, filed on Sep. 28, 2018, granted, now 10,956,415.
Application 16/147,165 is a continuation in part of application No. 16/051,197, filed on Jul. 31, 2018, granted, now 11,663,227.
Application 16/051,197 is a continuation in part of application No. 15/665,159, filed on Jul. 31, 2017, granted, now 11,281,706.
Application 16/051,197 is a continuation in part of application No. 15/665,148, filed on Jul. 31, 2017, granted, now 10,726,009.
Application 16/051,197 is a continuation in part of application No. 15/665,187, filed on Jul. 31, 2017, granted, now 11,232,100.
Application 16/051,197 is a continuation in part of application No. 15/665,248, filed on Jul. 31, 2017, granted, now 11,163,758.
Application 16/051,197 is a continuation in part of application No. 15/665,197, filed on Jul. 31, 2017, granted, now 11,461,334.
Application 16/051,197 is a continuation in part of application No. 15/665,279, filed on Jul. 31, 2017, granted, now 11,416,528.
Application 16/051,197 is a continuation in part of application No. 15/665,302, filed on Jul. 31, 2017, granted, now 10,795,884.
Application 16/051,197 is a continuation in part of application No. 15/665,339, filed on Jul. 31, 2017, abandoned.
Prior Publication US 2020/0050612 A1, Feb. 13, 2020
Int. Cl. G06F 16/2458 (2019.01); G06F 16/2452 (2019.01)
CPC G06F 16/2471 (2019.01) [G06F 16/24526 (2019.01)]
30 Claims
 
1. A computer-implemented method, comprising:
receiving a query in a first query language to be applied to a set of data records;
parsing the query to identify multiple query stages;
generating, for each query stage of the multiple query stages, a sub-query in the first query language, wherein each sub-query is configured to cause each of multiple worker nodes to implement the query stage with respect to a subset of the set of data records obtained at the worker node, each sub-query representing a distinct executable query in the first query language that corresponds to a distinct query stage of the multiple query stages;
based on a determination that a first query stage of the multiple query stages corresponds to a first native operation and a determination that the multiple worker nodes are configured to execute the first native operation, generating one or more instructions to execute the first native operation;
based on a determination that a second query stage of the multiple query stages does not correspond to any native operation, determining not to generate one or more instructions to execute a native operation;
generating instructions for shuffling records between the multiple worker nodes at a point in time between at least two of the multiple query stages; and
communicating the instructions for shuffling records, the one or more instructions to execute the first native operation, and the sub-query corresponding to the second query stage of the multiple query stages, to the multiple worker nodes for concurrent implementation, wherein each worker node includes a distinct executor for processing sub-queries in the first query language.
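
The planning steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a pipe-delimited query syntax in which each stage is itself an executable sub-query, and every name in it (NATIVE_OPS, NEEDS_SHUFFLE, Instruction, parse_stages, plan) is hypothetical and chosen only for illustration. The sketch emits a native instruction when a stage maps to a native operation that the worker nodes support, falls back to the stage's sub-query otherwise, and places a shuffle instruction between stages where records must be regrouped.

```python
from dataclasses import dataclass

# Hypothetical mapping from query commands to native worker operations.
NATIVE_OPS = {"stats": "native_aggregate", "sort": "native_sort"}
# Hypothetical commands that need records regrouped across workers first.
NEEDS_SHUFFLE = {"stats", "sort"}


@dataclass(frozen=True)
class Instruction:
    kind: str     # "native", "subquery", or "shuffle"
    payload: str  # native operation name, sub-query text, or shuffle note


def parse_stages(query: str) -> list[str]:
    """Split a pipe-delimited query into stages.

    Assumed syntax; a naive split like this ignores pipes inside quotes.
    """
    return [stage.strip() for stage in query.split("|") if stage.strip()]


def plan(query: str, worker_native_ops: set[str]) -> list[Instruction]:
    """Build the ordered instruction list to send to every worker node."""
    instructions: list[Instruction] = []
    # Assumption: each stage's text is already a valid sub-query.
    for index, sub_query in enumerate(parse_stages(query)):
        command = sub_query.split()[0]
        # A shuffle is only ever placed between two stages, never first.
        if index > 0 and command in NEEDS_SHUFFLE:
            instructions.append(Instruction("shuffle", f"before stage {index}"))
        native = NATIVE_OPS.get(command)
        if native is not None and native in worker_native_ops:
            # Stage has a native operation that the workers can execute.
            instructions.append(Instruction("native", native))
        else:
            # No usable native operation: ship the sub-query itself.
            instructions.append(Instruction("subquery", sub_query))
    return instructions
```

For example, plan("search error | eval x=1 | stats count by host", {"native_aggregate"}) yields two sub-query instructions, then a shuffle instruction, then a native instruction for the aggregation stage. With an empty set of worker-supported operations, the last stage is sent as a sub-query instead.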