US 12,455,804 B2
Low-cost and zero-shot online log parsing method based on large language model
Chen Zhi, Hangzhou (CN); Liye Cheng, Hangzhou (CN); Meilin Liu, Hangzhou (CN); Xuhong Zhang, Hangzhou (CN); Xinkui Zhao, Hangzhou (CN); Shuiguang Deng, Hangzhou (CN); and Jianwei Yin, Zhejiang (CN)
Assigned to ZHEJIANG UNIVERSITY, Hangzhou (CN)
Appl. No. 18/857,168
Filed by ZHEJIANG UNIVERSITY, Zhejiang (CN)
PCT Filed Mar. 26, 2024, PCT No. PCT/CN2024/083762
§ 371(c)(1), (2) Date Oct. 15, 2024,
PCT Pub. No. WO2025/077116, PCT Pub. Date Apr. 17, 2025.
Prior Publication US 2025/0117307 A1, Apr. 10, 2025
Int. Cl. G06F 17/00 (2019.01); G06F 11/34 (2006.01); G06F 40/186 (2020.01); G06F 40/205 (2020.01); G06F 40/30 (2020.01)
CPC G06F 11/3476 (2013.01) [G06F 40/186 (2020.01); G06F 40/205 (2020.01); G06F 40/30 (2020.01)] 8 Claims
 
1. A computer-implemented low-cost and zero-shot online log parsing method based on a large language model stored in a non-transitory computer-readable medium, comprising the following steps:
(S10) firstly, using different regular expressions to extract content of a log for different log sources, then, using a pre-defined rule to replace variables in the content of the log with wildcards, and finally, detecting whether the log contains only one word, and when the log contains only one word, directly adding the word to a log template database without parsing by the large language model;
(S20-S30) firstly, querying the log template database, converting parsed log templates into regular expressions, and performing regular expression matching with new incoming logs; and when the matching is successful, updating log samples corresponding to the log templates, otherwise invoking the large language model to parse the log to generate a new template,
wherein the method for invoking the large language model to parse the log to generate a new template is implemented by: filling the content of the log obtained previously into a prompt word, and then invoking interfaces of different large language models to conduct a dialogue to acquire a result of parsing the log by the large language model, and extracting a log template therefrom;
(S40) for the log template obtained by invoking the interfaces of the large language models, firstly determining whether the template is capable of performing regular expression matching with an original log, and when the template is incapable of performing regular expression matching with the original log, performing correction;
(S50) when a new template is generated, finding similar templates through clustering, and when similarity exceeds a set threshold, merging the templates;
(S60) performing frequency analysis on a sample of the log template, and when an occurrence frequency of words at certain positions exceeds a threshold, treating the words as a constant part of the template and splitting the template based on the words; and
(S70) after the template is obtained, performing post-processing to ensure that the obtained template conforms to the specification, and then storing the template in the database.
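
The control flow of steps S10 through S40 can be illustrated with a minimal Python sketch. It is not the patented implementation: the masking rules in `MASK_RULES`, the `<*>` wildcard token, the `TemplateStore` class, the `ask_llm` callable, and the fallback used when the returned template fails the S40 check are all assumptions made for illustration, since the claim does not specify them. Steps S50 through S70 (merging, splitting, post-processing) are omitted.

```python
import re
from typing import Callable, Dict, List, Optional

WILDCARD = "<*>"  # assumed wildcard token; the claim only says "wildcards"

# Hypothetical pre-defined masking rules for step S10.
MASK_RULES = [
    re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"),  # IPv4 addresses
    re.compile(r"\b0x[0-9a-fA-F]+\b"),           # hexadecimal literals
    re.compile(r"\b\d+\b"),                      # plain integers
]


def mask_variables(content: str) -> str:
    """S10: replace obvious variables in the log content with wildcards."""
    for rule in MASK_RULES:
        content = rule.sub(WILDCARD, content)
    return content


def template_to_regex(template: str) -> "re.Pattern[str]":
    """S20: turn a template into a regex; each wildcard matches any text."""
    parts = [re.escape(part) for part in template.split(WILDCARD)]
    return re.compile("(.*?)".join(parts))


class TemplateStore:
    """In-memory stand-in for the log template database."""

    def __init__(self) -> None:
        self.samples: Dict[str, List[str]] = {}

    def match(self, content: str) -> Optional[str]:
        """Return the first stored template whose regex matches the content."""
        for template in self.samples:
            if template_to_regex(template).fullmatch(content):
                return template
        return None

    def add(self, template: str, log: str) -> None:
        """Record a log sample under its template, creating it if needed."""
        self.samples.setdefault(template, []).append(log)


def parse_log(content: str, store: TemplateStore,
              ask_llm: Callable[[str], str]) -> str:
    """Return the template for one log line, calling the LLM only on a miss."""
    masked = mask_variables(content)

    # S10: a one-word log is stored directly, without LLM parsing.
    if len(masked.split()) == 1:
        store.add(masked, content)
        return masked

    # S20: try the templates already in the database.
    hit = store.match(masked)
    if hit is not None:
        store.add(hit, content)  # update the samples for this template
        return hit

    # S30: no match, so ask the (caller-supplied) LLM for a new template.
    template = ask_llm(masked)

    # S40: the new template must match the original log. The fallback to
    # the masked content is an assumed, simplistic form of "correction".
    if not template_to_regex(template).fullmatch(content):
        template = masked

    store.add(template, content)
    return template
```

In this sketch, `ask_llm` is any function that takes the masked log content and returns a template string, so a real model client or a test stub can be passed in. Because matched logs are served from the store, two logs that differ only in masked variables trigger at most one LLM call.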