US 12,321,396 B1
Generating and storing aggregate data slices in a remote shared storage system
Sai Krishna Sajja, Dublin, CA (US); Anish Shrigondekar, Sunnyvale, CA (US); and Igor Stojanovski, San Francisco, CA (US)
Filed by Splunk Inc., San Francisco, CA (US)
Filed on Jul. 31, 2020, as Appl. No. 16/945,578.
Int. Cl. G06F 16/906 (2019.01); G06F 16/901 (2019.01)
CPC G06F 16/906 (2019.01) [G06F 16/901 (2019.01)]
17 Claims
 
1. A method, comprising:
receiving, at an indexing node of a data intake and query system, a message payload from a remote message bus, wherein the message payload includes a first plurality of events, wherein each event of the first plurality of events includes raw machine data associated with a timestamp;
extracting the first plurality of events from the message payload;
adding a first set of events of the first plurality of events to a first editable data slice, wherein the first editable data slice is associated with a hot bucket, wherein the first set of events includes at least two events;
based on a slice rollover policy, converting the first editable data slice to a first non-editable data slice and adding the first non-editable data slice to a first aggregate data slice in volatile memory,
wherein the first aggregate data slice comprises a plurality of non-editable data slices associated with the hot bucket and each of the plurality of non-editable data slices comprises a second plurality of events, wherein the plurality of non-editable data slices includes the first non-editable data slice, wherein the first non-editable data slice includes the first set of events;
concurrent to at least one of adding the first set of events to the first editable data slice or adding the first non-editable data slice to the first aggregate data slice in volatile memory, modifying one or more files of the hot bucket using the first set of events, wherein the hot bucket includes:
at least one index file generated using the events of a plurality of aggregate data slices wherein the one or more files includes the at least one index file, and
the plurality of aggregate data slices, wherein the plurality of aggregate data slices comprises the first aggregate data slice;
based on an aggregate slice backup policy, determining whether a warm bucket corresponding to the hot bucket associated with the first aggregate data slice has been stored on a remote shared storage system, and based on a determination that a warm bucket corresponding to the hot bucket has not been stored on the remote shared storage system, communicating a first copy of the first aggregate data slice to the remote shared storage system for storage, wherein the first copy of the first aggregate data slice is communicated independent of the hot bucket; and
based on a bucket rollover policy, converting the hot bucket to a first warm bucket and communicating a copy of the first warm bucket to the remote shared storage system for storage, wherein the copy of the first warm bucket includes a second copy of the first aggregate data slice.
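
The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: every class name, threshold, and key layout below is hypothetical, the remote shared storage system is replaced by an in-memory stand-in, the three policies are reduced to simple count thresholds, and the index update runs sequentially rather than concurrently.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Event:
    timestamp: float
    raw: str  # raw machine data


@dataclass
class DataSlice:
    events: list = field(default_factory=list)
    editable: bool = True


@dataclass
class Bucket:
    bucket_id: str
    state: str = "hot"
    index: dict = field(default_factory=dict)       # stand-in for an index file
    aggregates: list = field(default_factory=list)  # each aggregate is a list of DataSlice


class RemoteStore:
    """Hypothetical in-memory stand-in for the remote shared storage system."""

    def __init__(self):
        self.aggregate_slices = {}  # (bucket_id, position) -> aggregate slice copy
        self.warm_buckets = {}      # bucket_id -> warm bucket copy

    def has_warm_bucket(self, bucket_id):
        return bucket_id in self.warm_buckets


class IndexingNode:
    def __init__(self, store, slice_max=2, aggregate_max=2, bucket_max=8):
        self.store = store
        self.slice_max = slice_max          # assumed slice rollover policy
        self.aggregate_max = aggregate_max  # assumed aggregate slice backup policy
        self.bucket_max = bucket_max        # assumed bucket rollover policy
        self.bucket_seq = 0
        self.bucket = Bucket("b0")
        self.editable = DataSlice()
        self.aggregate = []                 # aggregate slice held in memory
        self.event_count = 0

    def receive(self, payload):
        # Extract the events from the message payload and ingest each one.
        for item in payload["events"]:
            self._add_event(Event(item["timestamp"], item["raw"]))

    def _add_event(self, event):
        self.editable.events.append(event)
        # Update the hot bucket's index (sequential here, concurrent in the claim).
        for token in event.raw.split():
            self.bucket.index.setdefault(token, []).append(event.timestamp)
        self.event_count += 1
        if len(self.editable.events) >= self.slice_max:
            self._roll_slice()
        if self.event_count >= self.bucket_max:
            self._roll_bucket()

    def _roll_slice(self):
        # Convert the editable slice to non-editable and add it to the aggregate.
        self.editable.editable = False
        self.aggregate.append(self.editable)
        self.editable = DataSlice()
        if len(self.aggregate) >= self.aggregate_max:
            self._back_up_aggregate()

    def _back_up_aggregate(self):
        self.bucket.aggregates.append(self.aggregate)
        # Upload the aggregate on its own only if no warm bucket is stored yet.
        if not self.store.has_warm_bucket(self.bucket.bucket_id):
            key = (self.bucket.bucket_id, len(self.bucket.aggregates) - 1)
            self.store.aggregate_slices[key] = copy.deepcopy(self.aggregate)
        self.aggregate = []

    def _roll_bucket(self):
        # Flush any partial slice and aggregate, then convert hot to warm.
        if self.editable.events:
            self.editable.editable = False
            self.aggregate.append(self.editable)
            self.editable = DataSlice()
        if self.aggregate:
            self.bucket.aggregates.append(self.aggregate)
            self.aggregate = []
        self.bucket.state = "warm"
        # The warm bucket copy carries a second copy of each aggregate slice.
        self.store.warm_buckets[self.bucket.bucket_id] = copy.deepcopy(self.bucket)
        self.bucket_seq += 1
        self.bucket = Bucket(f"b{self.bucket_seq}")
        self.event_count = 0


store = RemoteStore()
node = IndexingNode(store)
node.receive({"events": [{"timestamp": float(t), "raw": f"host{t} error"} for t in range(4)]})
```

In this sketch the aggregate slice reaches the store before the bucket rolls, so events ingested into a hot bucket have a remote copy even if the node fails before the warm bucket is uploaded. That ordering is the point of the backup step in the claim; the specific thresholds are illustrative only.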