| CPC G06F 16/353 (2019.01) | 22 Claims |

|
1. A computer-implemented method for analyzing log messages, comprising:
receiving, by a computer processor, a plurality of log messages from a list of log messages;
for each log message in the log messages, generating, by the computer processor, a fingerprint for the log message and appending the fingerprint to the log message, where the fingerprint is comprised of punctuation characters in the log message;
sorting, by the computer processor, the log messages in the list of log messages according to the punctuation characters comprising a fingerprint for a given log message; and
clustering, by the computer processor, log messages in the sorted list of log messages into a final set of clusters, where each cluster in the final set of clusters includes one or more log messages and the log messages in each cluster have fingerprints similar to other log messages in the cluster;
wherein clustering the sorted list of log messages further comprises:
clustering log messages in the sorted list of log messages into a first set of clusters, where log messages in each cluster of the first set of clusters have identical fingerprints; and
computing a similarity metric between a fingerprint from a given cluster of the first set of clusters and a fingerprint from each of one or more subsequent clusters in the first set of clusters, and combining log messages from a particular cluster in the one or more subsequent clusters into the given cluster when the similarity metric for the particular cluster is less than a similarity threshold.
|
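The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. Several details are assumptions made only for illustration: the helper names `fingerprint`, `distance`, and `cluster_logs` are hypothetical, punctuation is taken to mean Python's `string.punctuation`, the similarity metric is taken to be a distance derived from `difflib.SequenceMatcher` (so that a smaller value means more similar, matching the "less than a similarity threshold" wording), the threshold value 0.2 is arbitrary, and each cluster is compared against every later unmerged cluster. The fingerprint is kept alongside each message as a tuple rather than appended to the message text.

```python
import string
from difflib import SequenceMatcher

PUNCTUATION = set(string.punctuation)  # assumption: ASCII punctuation only


def fingerprint(message: str) -> str:
    """Return the punctuation characters of a message, in original order."""
    return "".join(ch for ch in message if ch in PUNCTUATION)


def distance(fp_a: str, fp_b: str) -> float:
    """Assumed metric: 0.0 for identical fingerprints, up to 1.0 for disjoint."""
    return 1.0 - SequenceMatcher(None, fp_a, fp_b, autojunk=False).ratio()


def cluster_logs(messages, threshold=0.2):
    """Cluster messages by fingerprint; returns a list of (fingerprint, messages)."""
    # Generate a fingerprint per message, then sort by fingerprint.
    tagged = sorted(((fingerprint(m), m) for m in messages), key=lambda t: t[0])

    # First set of clusters: adjacent messages with identical fingerprints.
    first = []
    for fp, msg in tagged:
        if first and first[-1][0] == fp:
            first[-1][1].append(msg)
        else:
            first.append((fp, [msg]))

    # Final set: fold a subsequent cluster into the given cluster when the
    # distance between their fingerprints is below the threshold.
    merged = [False] * len(first)
    final = []
    for i, (fp, msgs) in enumerate(first):
        if merged[i]:
            continue
        group = list(msgs)
        for j in range(i + 1, len(first)):
            if not merged[j] and distance(fp, first[j][0]) < threshold:
                group.extend(first[j][1])
                merged[j] = True
        final.append((fp, group))
    return final
```

Because the list is sorted before the first pass, messages with identical fingerprints are adjacent, so the first set of clusters can be built in a single scan. The second pass in this sketch is quadratic in the number of first-pass clusters; limiting the comparison to a small window of subsequent clusters would be another reasonable reading of "one or more subsequent clusters".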