US 12,229,107 B2
Compact probabilistic data structure for storing streamed log lines
Julian Reichinger, Sankt Johann Am Wimberg (AT)
Assigned to Dynatrace LLC, Waltham, MA (US)
Filed by Dynatrace LLC, Waltham, MA (US)
Filed on Mar. 9, 2023, as Appl. No. 18/119,331.
Claims priority of provisional application 63/437,865, filed on Jan. 9, 2023.
Prior Publication US 2024/0256513 A1, Aug. 1, 2024
Int. Cl. G06F 16/22 (2019.01); G06F 16/2457 (2019.01)
CPC G06F 16/2282 (2019.01) [G06F 16/24573 (2019.01)]
21 Claims
 
1. A computer-implemented method for storing log data generated in a distributed computing environment, comprising:
receiving a data element from a log line, where the data element is associated with a given computing source at which the log line was produced;
applying a hash function to the data element to generate a hash value;
updating a listing of computing entities with the given computing source, where entries in the listing of computing entities can identify more than one computing source and each entry in the listing of computing entities specifies a unique set of computing sources;
storing the hash value, along with an address, in a token map table of a probabilistic data structure, where the address maps the hash value to an entry in the listing of computing entities; and
encoding the addresses in the token map table and encoding the entries in the listing of computing entities, wherein the probabilistic data structure is stored in a file format having three sequential sections, where a first section of the file format contains a version number and information describing encoding steps applied to the data stored in the probabilistic data structure, a second section of the file format contains header information, and a third section of the file format contains the data stored in the probabilistic data structure.
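
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation; it only shows one way a token map table could point hashed data elements at shared, unique sets of computing sources, and how the result could be laid out in three sequential sections. Everything named here is an assumption made for illustration: the class CompactLogIndex, its methods, the use of a 32-bit BLAKE2b hash, integer source identifiers, and the byte layout (including the hypothetical encoding flag 0, meaning "no encoding applied").

```python
import hashlib
import struct


class CompactLogIndex:
    """Illustrative sketch: token hashes mapped to shared sets of sources."""

    def __init__(self):
        self.entity_sets = []   # listing of computing entities (unique source sets)
        self.set_address = {}   # source set -> its address (index) in entity_sets
        self.token_map = {}     # token hash -> address into entity_sets

    @staticmethod
    def hash_token(token: str) -> int:
        # Assumed 32-bit hash. Distinct tokens may collide, which is what
        # makes lookups probabilistic (false positives are possible).
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=4).digest()
        return int.from_bytes(digest, "big")

    def _address_of(self, sources: frozenset) -> int:
        # Each unique set of sources is stored once and shared by address.
        addr = self.set_address.get(sources)
        if addr is None:
            addr = len(self.entity_sets)
            self.entity_sets.append(sources)
            self.set_address[sources] = addr
        return addr

    def add(self, token: str, source: int) -> None:
        # Hash the data element, extend its source set with the given source,
        # and store the hash with the address of the resulting set.
        # Sets that are no longer referenced are not reclaimed in this sketch.
        h = self.hash_token(token)
        addr = self.token_map.get(h)
        current = self.entity_sets[addr] if addr is not None else frozenset()
        self.token_map[h] = self._address_of(current | {source})

    def sources_for(self, token: str) -> frozenset:
        # Returns the sources that may have produced the token.
        addr = self.token_map.get(self.hash_token(token))
        return self.entity_sets[addr] if addr is not None else frozenset()

    def to_bytes(self) -> bytes:
        # Section 1: version number and a hypothetical encoding flag (0 = none).
        section1 = struct.pack("<HB", 1, 0)
        # Section 3: the data, token map rows followed by the source sets.
        rows = b"".join(
            struct.pack("<II", h, a) for h, a in sorted(self.token_map.items())
        )
        sets = b"".join(
            struct.pack("<I", len(s))
            + b"".join(struct.pack("<I", x) for x in sorted(s))
            for s in self.entity_sets
        )
        # Section 2: header with row count, set count, and token map size in bytes.
        section2 = struct.pack(
            "<III", len(self.token_map), len(self.entity_sets), len(rows)
        )
        return section1 + section2 + rows + sets
```

In this sketch, two data elements seen at the same sources end up with the same address, so the set of sources is stored only once. For example, after calling add("error", 1), add("error", 2), add("timeout", 1) and add("timeout", 2) on a fresh index, both tokens resolve to the single entry holding sources 1 and 2.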