US 11,748,358 B2
Feedback on inferred sourcetypes
Adam Oliner, San Francisco, CA (US); Eric Sammer, San Francisco, CA (US); Kristal Curtis, San Francisco, CA (US); and Nghi Nguyen, Union City, CA (US)
Assigned to Splunk Inc., San Francisco, CA (US)
Filed by Splunk, Inc., San Francisco, CA (US)
Filed on Oct. 30, 2018, as Appl. No. 16/175,642.
Claims priority of provisional application 62/738,896, filed on Sep. 28, 2018.
Claims priority of provisional application 62/738,901, filed on Sep. 28, 2018.
Prior Publication US 2020/0104731 A1, Apr. 2, 2020
Int. Cl. G06F 16/245 (2019.01); G06F 16/2455 (2019.01); G06F 40/205 (2020.01); G06F 16/248 (2019.01); G06N 5/04 (2023.01)
CPC G06F 16/24568 (2019.01) [G06F 16/248 (2019.01); G06F 16/24564 (2019.01); G06F 40/205 (2020.01); G06N 5/04 (2013.01)] 28 Claims
1. A computer-implemented method, comprising:
generating a representation of a portion of machine data of a message of a data stream, wherein the data stream is accessed from an ingestion buffer of a data system, the portion of machine data generated by one or more components in an information technology environment;
predicting, based at least on applying the representation of the portion of machine data to an inference model, a sourcetype of the message, wherein predicting the sourcetype of the message using the inference model efficiently processes the message via one or more system components by reducing processing time of the message having an absence of an associated sourcetype or an inaccurate sourcetype;
based on the sourcetype of the message, selecting a set of extraction rules associated with the sourcetype for extraction of a set of values from the message, wherein each extraction rule defines criteria for identifying a sub-portion of text from the portion of machine data of the message to produce a value of the set of values, the value representing the sub-portion of text;
executing the extraction based at least on applying the set of extraction rules to the portion of machine data of the message to produce a result set that indicates the set of values identified using the set of extraction rules; and
based on the sourcetype and the set of values indicated by the result set, executing at least one action that includes routing, via a router, one or more messages associated with the sourcetype from the data stream to one or more endpoints associated with the sourcetype, wherein the at least one action is based at least in part on comparing a number of fields of the result set identified using the set of extraction rules to a number of fields associated with the sourcetype, and wherein at least one endpoint comprises a field-searchable data store and the message is stored in the field-searchable data store as an event that is accessed from the field-searchable data store responsive to a search query containing a criterion for a field being executed against the event in the field-searchable data store to cause comparison between the criterion and values extracted from the event by an extraction rule defining the field.
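The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The punctuation-count representation, the threshold rule standing in for the inference model, the regular expressions, the endpoint names, and every identifier below (`represent`, `predict_sourcetype`, `extract`, `route`, `EXTRACTION_RULES`, `ENDPOINTS`, `FALLBACK_ENDPOINT`, `min_ratio`) are hypothetical choices made for illustration only; the claim does not specify any of them.

```python
import re
from collections import Counter

# Hypothetical extraction rules: sourcetype -> {field name: regex with one capture group}.
EXTRACTION_RULES = {
    "access_log": {
        "client_ip": re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\s"),
        "status": re.compile(r'"\s(\d{3})\s'),
        "bytes": re.compile(r'"\s\d{3}\s(\d+)'),
    },
    "kv_log": {
        "level": re.compile(r"\blevel=(\w+)"),
        "user": re.compile(r"\buser=(\w+)"),
    },
}

# Hypothetical endpoints; a real system would route to indexes, queues, etc.
ENDPOINTS = {"access_log": "web_index", "kv_log": "app_index"}
FALLBACK_ENDPOINT = "review_queue"


def represent(text):
    """Build a representation of the machine data (assumed here: punctuation counts)."""
    return Counter(ch for ch in text if not ch.isalnum() and not ch.isspace())


def predict_sourcetype(rep):
    """Toy stand-in for the inference model: many '=' signs suggest key=value logs."""
    return "kv_log" if rep["="] >= 2 else "access_log"


def extract(text, rules):
    """Apply each extraction rule; keep only the fields whose pattern matched."""
    result = {}
    for field, pattern in rules.items():
        match = pattern.search(text)
        if match:
            result[field] = match.group(1)
    return result


def route(text, min_ratio=1.0):
    """Predict, extract, compare field counts, then choose an endpoint."""
    sourcetype = predict_sourcetype(represent(text))
    rules = EXTRACTION_RULES[sourcetype]
    values = extract(text, rules)
    # Compare the number of extracted fields to the number the sourcetype defines.
    if len(values) >= min_ratio * len(rules):
        endpoint = ENDPOINTS[sourcetype]
    else:
        endpoint = FALLBACK_ENDPOINT
    return sourcetype, values, endpoint


if __name__ == "__main__":
    print(route('10.0.0.1 - - [30/Oct/2018:12:00:00 +0000] "GET /index.html HTTP/1.1" 200 512'))
    print(route("ts=2018-10-30T12:00:00Z level=ERROR user=alice msg=timeout"))
    print(route("a=1 b=2"))
```

In this sketch, a message whose extracted field count falls short of the count defined for its predicted sourcetype is sent to a fallback endpoint rather than to the endpoint associated with the sourcetype. That is only one possible reading of how the field-count comparison could drive the routing action.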