CPC G06F 16/2255 (2019.01) [G06F 16/2264 (2019.01); G06F 16/24556 (2019.01); G06F 16/2477 (2019.01); G06F 16/285 (2019.01)] | 20 Claims |
1. A method comprising:
performing, by a computing system having one or more hardware processors and associated memory:
receiving a plurality of events from a remote network and storing the events in an event log repository;
creating in memory a time-sliced approximate data structure (TSADS) for the events in the event log repository, wherein the TSADS includes a counts matrix and a statistics matrix and:
(a) the statistics matrix is used to store approximate statistics for different groups of timestamped datapoints in a plurality of time slices, and
(b) the counts matrix implements a count-min sketch to store approximate counts of datapoints in the different groups in the time slices;
receiving, via a query user interface, a query directed to the event log repository specifying to retrieve, from the TSADS, approximate statistics for a group of datapoints in the time slices, wherein the query specifies a group key of the group and a time range for retrieval;
responding to the query with approximate statistics for individual ones of the time slices, including:
selecting a set of cells in the count-min sketch in the counts matrix based on the group key and the time slice, wherein each cell in the set stores an approximate count of datapoints in the group in the time slice;
determining a first cell from the set that stores a best approximate count;
determining a best approximate statistic of the group in the time slice, wherein the best approximate statistic is retrieved from a second cell in the statistics matrix that corresponds to the first cell in the counts matrix; and
returning a time series of best approximate statistics of the group determined for each time slice, wherein the time series corresponds to the time range specified by the query.
|
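For illustration only, the steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the class name `TSADS`, the parameters `depth`, `width`, and `slice_seconds`, the use of SHA-256 to pick columns, and the choice of the mean as the stored statistic (kept as a running sum beside each count) are all hypothetical. The claim itself does not fix any of them. The sketch takes the "best approximate count" to be the minimum count among the selected cells, which is the usual count-min estimate.

```python
import hashlib
from collections import defaultdict


class TSADS:
    """Hypothetical time-sliced approximate data structure (illustrative only)."""

    def __init__(self, depth=4, width=64, slice_seconds=60):
        self.depth = depth                  # rows (hash functions) in each sketch
        self.width = width                  # columns in each sketch
        self.slice_seconds = slice_seconds  # length of one time slice
        # One depth x width matrix per time slice, created on first write.
        self.counts = defaultdict(lambda: [[0] * width for _ in range(depth)])
        self.sums = defaultdict(lambda: [[0.0] * width for _ in range(depth)])

    def _columns(self, group_key):
        # Assumption: one column per row, derived from a row-salted SHA-256 hash.
        cols = []
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{group_key}".encode()).digest()
            cols.append(int.from_bytes(digest[:8], "big") % self.width)
        return cols

    def add(self, group_key, timestamp, value):
        # Update the counts matrix and the matching statistics cells.
        s = int(timestamp // self.slice_seconds)
        counts, sums = self.counts[s], self.sums[s]
        for row, col in enumerate(self._columns(group_key)):
            counts[row][col] += 1
            sums[row][col] += value

    def query(self, group_key, start, end):
        # Return (slice_start, approx_count, approx_mean) for each time slice
        # that overlaps the requested time range.
        cols = self._columns(group_key)
        first = int(start // self.slice_seconds)
        last = int(end // self.slice_seconds)
        series = []
        for s in range(first, last + 1):
            slice_start = s * self.slice_seconds
            if s not in self.counts:
                series.append((slice_start, 0, None))
                continue
            counts, sums = self.counts[s], self.sums[s]
            # "First cell": the selected cell holding the smallest count.
            best_row = min(range(self.depth), key=lambda r: counts[r][cols[r]])
            best_count = counts[best_row][cols[best_row]]
            if best_count == 0:
                series.append((slice_start, 0, None))
                continue
            # "Second cell": same position in the statistics matrix.
            mean = sums[best_row][cols[best_row]] / best_count
            series.append((slice_start, best_count, mean))
        return series


if __name__ == "__main__":
    sketch = TSADS()
    sketch.add("api", 5, 10.0)
    sketch.add("api", 30, 20.0)
    sketch.add("api", 65, 40.0)
    print(sketch.query("api", 0, 119))
```

In this sketch the minimum-count cell is the one least affected by hash collisions with other groups, so its paired statistics cell is treated as the least contaminated estimate for the group in that time slice.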