US 11,704,313 B1
Parallel branch operation using intermediary nodes
Asha Andrade, Saratoga, CA (US); Tingting Bao, San Jose, CA (US); Vanco Buca, San Jose, CA (US); Weichao Duan, Cupertino, CA (US); Anuradha Pariti, San Jose, CA (US); and Xiaowei Wang, Santa Clara, CA (US)
Assigned to Splunk Inc., San Francisco, CA (US)
Filed by Splunk Inc., San Francisco, CA (US)
Filed on Oct. 19, 2020, as Appl. No. 17/74,236.
Int. Cl. G06F 16/2453 (2019.01)
CPC G06F 16/24535 (2019.01) [G06F 16/24532 (2019.01); G06F 16/24537 (2019.01)]
20 Claims
 
1. A method, comprising:
receiving a query at a search head of a data intake and query system;
parsing the query;
based on parsing the query:
determining that the query includes a join command to join a first set of data and a second set of data, and
in response to determining that the query includes the join command, identifying a first portion of the query as a first subquery and a second portion of the query as a second subquery, wherein the first subquery corresponds to the first set of data and the second subquery corresponds to the second set of data;
identifying at least a first search node and a second search node of a plurality of search nodes instantiated within the data intake and query system based on determining that the query includes the join command;
generating instructions for a first intermediary node of the data intake and query system to combine partial results of the first subquery and the second subquery as at least part of a join operation;
generating instructions to concurrently transmit each of the first subquery and the second subquery to each of the first search node and the second search node;
executing the query, wherein executing the query includes communicating the instructions for the first intermediary node to the first intermediary node and concurrently transmitting each of the first subquery and the second subquery to each of the first search node and the second search node,
wherein each of the first search node and the second search node concurrently executes each of the first subquery and the second subquery,
wherein each of the first search node and the second search node identifies respective first partial results based on execution of the first subquery and respective second partial results based on execution of the second subquery,
wherein each of the first search node and the second search node provides the respective first partial results and the respective second partial results to the first intermediary node,
wherein the first intermediary node concurrently receives the respective first partial results and the respective second partial results from each of the first search node and the second search node, and combines the respective first partial results and the respective second partial results from each of the first search node and the second search node as at least part of the join operation, and communicates results of the join operation to at least one of a second intermediary node or the search head; and
receiving, at the search head, results of the query from at least one of the first intermediary node, the second intermediary node, or another intermediary node.
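The data flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything named here is a hypothetical stand-in: the toy query syntax `A JOIN B ON key`, the in-memory `NODE_DATA` table, and the functions `parse_query`, `run_subquery`, `intermediary_join`, and `execute`. The sketch also assumes an inner equi-join on a single field, which the claim does not specify, and it uses a thread pool to stand in for concurrent transmission to separate search nodes.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical local data held by each search node (assumption for illustration).
NODE_DATA = {
    "node1": {
        "orders": [{"uid": 1, "item": "disk"}, {"uid": 2, "item": "cable"}],
        "users": [{"uid": 1, "name": "ana"}],
    },
    "node2": {
        "orders": [{"uid": 3, "item": "fan"}],
        "users": [{"uid": 2, "name": "bo"}, {"uid": 4, "name": "cy"}],
    },
}


def parse_query(query):
    """Search head: detect a join and split the query into two subqueries.

    Returns (first_subquery, second_subquery, key), or None if no join is found.
    The 'A JOIN B ON key' syntax is a toy grammar, not a real query language.
    """
    m = re.fullmatch(r"\s*(\w+)\s+JOIN\s+(\w+)\s+ON\s+(\w+)\s*", query)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)


def run_subquery(node, subquery):
    """Search node: execute one subquery over local data (here, a dataset lookup)."""
    return list(NODE_DATA[node].get(subquery, []))


def intermediary_join(first_partials, second_partials, key):
    """Intermediary node: combine pooled partial results with a hash join on `key`."""
    index = {}
    for row in second_partials:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in first_partials:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


def execute(query, nodes):
    """Search head: parse, dispatch both subqueries to every node, collect results."""
    parsed = parse_query(query)
    if parsed is None:
        raise ValueError("this sketch only handles join queries")
    first_sq, second_sq, key = parsed
    # Both subqueries are submitted to every search node before any result is awaited.
    with ThreadPoolExecutor(max_workers=max(1, 2 * len(nodes))) as pool:
        first_futs = [pool.submit(run_subquery, n, first_sq) for n in nodes]
        second_futs = [pool.submit(run_subquery, n, second_sq) for n in nodes]
        first_partials = [r for f in first_futs for r in f.result()]
        second_partials = [r for f in second_futs for r in f.result()]
    # The intermediary node performs the join and returns results to the search head.
    return intermediary_join(first_partials, second_partials, key)


results = execute("orders JOIN users ON uid", ["node1", "node2"])
```

In this sketch the order with `uid` 2 sits on `node1` while its matching user sits on `node2`. The match is still found because the join runs at the intermediary node over the partial results pooled from both search nodes, rather than on either search node alone.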