CPC G06F 16/3329 (2019.01) [G06F 3/0482 (2013.01); G06F 3/14 (2013.01); G06F 9/451 (2018.02); G06F 16/24522 (2019.01); G06F 16/258 (2019.01); G06F 16/334 (2019.01); G06F 16/3322 (2019.01); G06F 16/35 (2019.01); G06F 16/90344 (2019.01); G06F 18/2323 (2023.01); G06F 40/10 (2020.01); G06F 40/126 (2020.01); G06F 40/146 (2020.01); G06F 40/177 (2020.01); G06V 30/1983 (2022.01); H04L 67/01 (2022.05)] | 20 Claims |
1. A method of generating regular expressions using a longest common subsequence (LCS) algorithm, the method comprising:
receiving, by a regular expression generator comprising one or more processors, input data identifying a plurality of character sequences;
converting, by the regular expression generator, each of the plurality of character sequences into a corresponding set of regular expression codes, resulting in a plurality of sets of regular expression codes;
performing, by the regular expression generator, a plurality of executions of the longest common subsequence (LCS) algorithm and capturing a plurality of outputs of the LCS algorithm, wherein the LCS algorithm is performed on every unique two-set combination of the plurality of sets of regular expression codes;
storing, by the regular expression generator, data defining a fully-connected graph, the data comprising:
a plurality of nodes, wherein each node of the fully-connected graph corresponds to one of the plurality of sets of regular expression codes; and
a plurality of edges connecting each unique pair of the plurality of nodes, wherein an edge length between each of the unique pairs of nodes is defined by an output of the LCS algorithm executed on the regular expression codes corresponding to the unique pair of nodes;
determining, by the regular expression generator, a minimum spanning tree for the fully-connected graph; and
traversing, by the regular expression generator, the minimum spanning tree for the fully-connected graph, to determine an order for identifying a first longest common subsequence within the plurality of character sequences.
|
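The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions the claim does not specify: the one-letter code scheme in `to_codes` (N for digit, L for letter, Z for whitespace, P for anything else) is hypothetical; the edge length is assumed to be the longer code-set length minus the LCS length, so that similar sets sit close together; Prim's algorithm is one of several ways to find a minimum spanning tree; and the depth-first traversal from node 0 is just one possible traversal order. All function names are illustrative.

```python
from itertools import combinations


def to_codes(text):
    """Convert a character sequence into regular expression codes.
    The code letters are a hypothetical scheme, not taken from the claim."""
    codes = []
    for ch in text:
        if ch.isdigit():
            codes.append("N")
        elif ch.isalpha():
            codes.append("L")
        elif ch.isspace():
            codes.append("Z")
        else:
            codes.append("P")
    return "".join(codes)


def lcs(a, b):
    """Return one longest common subsequence of a and b (dynamic programming)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Walk back through the table to recover the subsequence itself.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def build_graph(code_sets):
    """Run LCS on every unique pair of code sets; return {(i, j): edge length}.
    Assumed edge length: longer set length minus LCS length."""
    edges = {}
    for i, j in combinations(range(len(code_sets)), 2):
        common = lcs(code_sets[i], code_sets[j])
        edges[(i, j)] = max(len(code_sets[i]), len(code_sets[j])) - len(common)
    return edges


def minimum_spanning_tree(n, edges):
    """Prim's algorithm on the fully-connected graph, starting from node 0."""
    in_tree, tree = {0}, []
    while len(in_tree) < n:
        # Pick the shortest edge with exactly one endpoint already in the tree.
        _, i, j = min(
            (w, i, j)
            for (i, j), w in edges.items()
            if (i in in_tree) != (j in in_tree)
        )
        tree.append((i, j))
        in_tree.update((i, j))
    return tree


def traversal_order(n, tree, start=0):
    """Depth-first traversal of the spanning tree; returns node visit order."""
    adj = {k: [] for k in range(n)}
    for i, j in tree:
        adj[i].append(j)
        adj[j].append(i)
    order, stack, seen = [], [start], {start}
    while stack:
        node = stack.pop()
        order.append(node)
        for nxt in sorted(adj[node], reverse=True):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return order


# Usage: three sample inputs, two of which share a date layout.
samples = ["2019-01-15", "2020-12-31", "Jan 15, 2019"]
code_sets = [to_codes(s) for s in samples]
edges = build_graph(code_sets)
tree = minimum_spanning_tree(len(code_sets), edges)
order = traversal_order(len(code_sets), tree)
```

Under these assumptions, the traversal visits the most similar code sets next to each other, so a subsequent step could fold the LCS results together in that order, handling near-identical inputs before dissimilar ones.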