US 12,436,989 B2
Fast clustering of log messages
Esteban Pérez Wohlfeil, Linz (AT)
Assigned to Dynatrace LLC, Boston, MA (US)
Filed by Dynatrace LLC, Waltham, MA (US)
Filed on Nov. 22, 2023, as Appl. No. 18/517,930.
Claims priority of provisional application 63/433,160, filed on Dec. 16, 2022.
Prior Publication US 2024/0202227 A1, Jun. 20, 2024
Int. Cl. G06F 16/35 (2025.01); G06F 16/353 (2025.01)
CPC G06F 16/353 (2019.01)
22 Claims
 
1. A computer-implemented method for analyzing log messages, comprising:
receiving, by a computer processor, a plurality of log messages from a list of log messages;
for each log message in the log messages, generating, by the computer processor, a fingerprint for the log message and appending the fingerprint to the log message, where the fingerprint is comprised of punctuation characters in the log message;
sorting, by the computer processor, the log messages in the list of log messages according to the punctuation characters comprising a fingerprint for a given log message; and
clustering, by the computer processor, log messages in the sorted list of log messages into a final set of clusters, where each cluster in the final set of clusters includes one or more log messages and the log messages in each cluster have fingerprints similar to other log messages in the cluster;
wherein clustering the sorted list of log messages further comprises:
clustering log messages in the sorted list of log messages into a first set of clusters, where the log messages in each cluster of the first set of clusters have identical fingerprints; and
computing a similarity metric between a fingerprint from a given cluster of the first set of clusters and a fingerprint from each of one or more subsequent clusters in the first set of clusters, and combining log messages from a particular cluster in the one or more subsequent clusters into the given cluster when the similarity metric for the particular cluster is less than a similarity threshold.
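
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Several choices are assumptions made only for illustration: the punctuation set is limited to ASCII `string.punctuation`, the similarity metric is taken to be Levenshtein edit distance (the claim does not name a metric, only that merging occurs when the metric is less than a threshold), and the names `fingerprint`, `edit_distance`, `cluster_logs`, `threshold`, and `lookahead` are hypothetical.

```python
import string
from itertools import groupby

# Assumption: only ASCII punctuation counts toward the fingerprint.
PUNCTUATION = set(string.punctuation)


def fingerprint(message: str) -> str:
    """Return the punctuation characters of a message, in order of appearance."""
    return "".join(ch for ch in message if ch in PUNCTUATION)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as a stand-in for the similarity metric."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def cluster_logs(messages, threshold=2, lookahead=3):
    """Cluster messages by punctuation fingerprint.

    threshold and lookahead are hypothetical tuning parameters: a subsequent
    cluster is merged when its distance is below threshold, and only the next
    lookahead clusters in sorted order are compared.
    """
    # Generate a fingerprint per message and sort by fingerprint.
    tagged = sorted(((fingerprint(m), m) for m in messages), key=lambda t: t[0])

    # First set of clusters: identical fingerprints are adjacent after sorting.
    first = [(fp, [m for _, m in grp])
             for fp, grp in groupby(tagged, key=lambda t: t[0])]

    # Final set of clusters: fold similar subsequent clusters into the given one.
    final = []
    merged = [False] * len(first)
    for i, (fp, msgs) in enumerate(first):
        if merged[i]:
            continue
        cluster = list(msgs)
        for j in range(i + 1, min(i + 1 + lookahead, len(first))):
            if not merged[j] and edit_distance(fp, first[j][0]) < threshold:
                cluster.extend(first[j][1])
                merged[j] = True
        final.append((fp, cluster))
    return final
```

Because sorting places identical fingerprints next to each other, the first set of clusters falls out of a single pass over the sorted list, and the merge step only needs to compare one representative fingerprint per cluster rather than every pair of messages.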