US 12,013,852 B1
Unified data processing across streaming and indexed data sets
Joseph Gabriel Echeverria, San Francisco, CA (US); Arthur Foelsche, Montpelier, VT (US); Eric Sammer, San Francisco, CA (US); and Sarah Stanger, Aspinwall, PA (US)
Assigned to Splunk Inc., San Francisco, CA (US)
Filed by Splunk Inc., San Francisco, CA (US)
Filed on Mar. 27, 2023, as Appl. No. 18/190,815.
Application 18/190,815 is a continuation of application No. 17/175,518, filed on Feb. 12, 2021, granted, now 11,615,084.
Application 17/175,518 is a continuation of application No. 16/177,234, filed on Oct. 31, 2018, granted, now 10,936,585, issued on Mar. 2, 2021.
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/30 (2019.01); G05B 13/00 (2006.01); G06F 16/14 (2019.01); G06F 16/178 (2019.01); G06F 16/24 (2019.01); G06F 16/2453 (2019.01); G06F 16/2455 (2019.01); G06F 16/248 (2019.01); G06F 16/25 (2019.01); G06N 3/00 (2023.01); G06N 5/00 (2023.01)
CPC G06F 16/24532 (2019.01) [G05B 13/00 (2013.01); G06F 16/156 (2019.01); G06F 16/178 (2019.01); G06F 16/24 (2019.01); G06F 16/24556 (2019.01); G06F 16/24566 (2019.01); G06F 16/24568 (2019.01); G06F 16/248 (2019.01); G06F 16/258 (2019.01); G06N 3/00 (2013.01); G06N 5/00 (2013.01)] 20 Claims
 
1. A method comprising:
obtaining a query for data indexed at a data processing system, the query specifying criteria for identifying search results from data items previously indexed by an indexing subsystem of the data processing system;
converting the query into a data processing pipeline, the data processing pipeline specifying a series of nodes and interconnections between individual nodes within the series, wherein the individual nodes designate a transformation of data items within the data processing pipeline, wherein the interconnections designate a routing of messages through the data processing pipeline, and wherein the series of nodes and interconnections logically represent the query;
identifying a first portion of results by applying the data processing pipeline to a streaming data processing subsystem of the data processing system to cause the streaming data processing subsystem to process data items within a first subset of messages, from a stream of messages, obtained at the data processing system during a first past period of time;
identifying a second portion of results by executing the query against a second subset of data items, from the stream of messages, previously indexed by the indexing subsystem, the second subset of data items previously indexed by the indexing subsystem including data items obtained at the data processing system during a second past period of time; and
displaying search results including the first portion of results identified by applying the data processing pipeline to cause the streaming data processing subsystem to process data items within the first subset of messages obtained at the data processing system during the first past period of time and the second portion of results identified by executing the query against the second subset of data items, from the stream of messages, previously indexed by the indexing subsystem including the data items obtained at the data processing system during the second past period of time.
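The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a toy in-memory setup in which a query is a simple field-equality test, the "streaming data processing subsystem" is a buffer of recent messages, and the "indexing subsystem" is a list of previously indexed items. All names (`Node`, `Pipeline`, `query_to_pipeline`, `execute_on_index`, `unified_search`) and the query format are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Item = Dict[str, object]


@dataclass
class Node:
    """One pipeline node: a named transformation over data items."""
    name: str
    transform: Callable[[Iterable[Item]], Iterable[Item]]


@dataclass
class Pipeline:
    """A series of nodes plus the interconnections that route messages."""
    nodes: List[Node]
    edges: List[Tuple[str, str]]  # (from_node, to_node) pairs

    def apply(self, messages: Iterable[Item]) -> List[Item]:
        data: Iterable[Item] = messages
        for node in self.nodes:  # linear series: each edge links neighbours
            data = node.transform(data)
        return list(data)


def matches(query: Dict[str, object], item: Item) -> bool:
    # Hypothetical query format: {"field": <name>, "equals": <value>}.
    return item.get(query["field"]) == query["equals"]


def project(item: Item) -> Item:
    # Keep only the fields shown in search results.
    return {"ts": item["ts"], "msg": item["msg"]}


def query_to_pipeline(query: Dict[str, object]) -> Pipeline:
    """Convert the query into nodes and interconnections that represent it."""
    nodes = [
        Node("filter", lambda items: (i for i in items if matches(query, i))),
        Node("project", lambda items: (project(i) for i in items)),
    ]
    edges = [(a.name, b.name) for a, b in zip(nodes, nodes[1:])]
    return Pipeline(nodes, edges)


def execute_on_index(query: Dict[str, object], index: List[Item]) -> List[Item]:
    """Execute the query directly against previously indexed data items."""
    return [project(i) for i in index if matches(query, i)]


def unified_search(query, stream_buffer, index) -> List[Item]:
    first_portion = query_to_pipeline(query).apply(stream_buffer)
    second_portion = execute_on_index(query, index)
    return sorted(first_portion + second_portion, key=lambda r: r["ts"])


# Illustrative data: older items are already indexed, newer ones are not.
index = [
    {"ts": 1, "level": "error", "msg": "disk full"},
    {"ts": 2, "level": "info", "msg": "ok"},
]
stream_buffer = [
    {"ts": 3, "level": "error", "msg": "timeout"},
    {"ts": 4, "level": "info", "msg": "ok"},
]
query = {"field": "level", "equals": "error"}
results = unified_search(query, stream_buffer, index)
```

In this sketch the same query drives both paths: it is converted into a pipeline for the buffered messages and executed as-is against the indexed items, and the two portions are merged by timestamp before display. Merging by timestamp is a design choice of the sketch; the claim requires only that both portions be included in the displayed results.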