US 11,755,630 B2
Regular expression generation using longest common subsequence algorithm on combinations of regular expression codes
Michael Malak, Denver, CO (US); Luis E. Rivas, Denver, CO (US); and Mark L. Kreider, Arvada, CO (US)
Assigned to Oracle International Corporation, Redwood Shores, CA (US)
Filed by Oracle International Corporation, Redwood Shores, CA (US)
Filed on Apr. 1, 2022, as Appl. No. 17/711,907.
Application 17/711,907 is a continuation of application No. 16/438,321, filed on Jun. 11, 2019, granted, now 11,321,368.
Claims priority of provisional application 62/749,001, filed on Oct. 22, 2018.
Claims priority of provisional application 62/684,498, filed on Jun. 13, 2018.
Prior Publication US 2022/0261426 A1, Aug. 18, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/30 (2019.01); G06F 16/332 (2019.01); G06F 16/35 (2019.01); G06F 16/25 (2019.01); G06F 16/903 (2019.01); G06F 9/451 (2018.01); G06F 16/33 (2019.01); G06F 16/2452 (2019.01); G06F 3/0482 (2013.01); G06F 3/14 (2006.01); G06F 40/10 (2020.01); G06F 40/126 (2020.01); G06F 40/146 (2020.01); G06F 40/177 (2020.01); G06V 30/196 (2022.01); G06F 18/2323 (2023.01); H04L 67/01 (2022.01)
CPC G06F 16/3329 (2019.01) [G06F 3/0482 (2013.01); G06F 3/14 (2013.01); G06F 9/451 (2018.02); G06F 16/24522 (2019.01); G06F 16/258 (2019.01); G06F 16/334 (2019.01); G06F 16/3322 (2019.01); G06F 16/35 (2019.01); G06F 16/90344 (2019.01); G06F 18/2323 (2023.01); G06F 40/10 (2020.01); G06F 40/126 (2020.01); G06F 40/146 (2020.01); G06F 40/177 (2020.01); G06V 30/1983 (2022.01); H04L 67/01 (2022.05)]
20 Claims
1. A method of generating regular expressions using a longest common subsequence (LCS) algorithm, the method comprising:
receiving, by a regular expression generator comprising one or more processors, input data identifying a plurality of character sequences;
converting, by the regular expression generator, each of the plurality of character sequences into a corresponding set of regular expression codes, resulting in a plurality of sets of regular expression codes;
performing, by the regular expression generator, a plurality of executions of the longest common subsequence (LCS) algorithm and capturing a plurality of outputs of the LCS algorithm, wherein the LCS algorithm is performed on every unique two-set combination of the plurality of sets of regular expression codes;
storing, by the regular expression generator, data defining a fully-connected graph, the data comprising:
a plurality of nodes, wherein each node of the fully-connected graph corresponds to one of the plurality of sets of regular expression codes; and
a plurality of edges connecting each unique pair of the plurality of nodes, wherein an edge length between each of the unique pairs of nodes is defined by an output of the LCS algorithm executed on the regular expression codes corresponding to the unique pair of nodes;
determining, by the regular expression generator, a minimum spanning tree for the fully-connected graph; and
traversing, by the regular expression generator, the minimum spanning tree for the fully-connected graph, to determine an order for identifying a first longest common subsequence within the plurality of character sequences.
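
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the character-class codes produced by the hypothetical `to_codes` helper ("N" for digits, "L" for letters, "Z" for whitespace, other characters kept as literals), the edge length formula (combined code count minus twice the LCS length, so that more similar sequences sit closer together), the use of Prim's algorithm for the minimum spanning tree, and a depth-first walk as the traversal. The claim itself states only that the edge length is defined by an output of the LCS algorithm.

```python
from itertools import combinations


def to_codes(text):
    """Map each character to a coarse class code (hypothetical scheme)."""
    codes = []
    for ch in text:
        if ch.isdigit():
            codes.append("N")   # digit
        elif ch.isalpha():
            codes.append("L")   # letter
        elif ch.isspace():
            codes.append("Z")   # whitespace
        else:
            codes.append(ch)    # keep punctuation as a literal
    return codes


def lcs_length(a, b):
    """Classic dynamic-programming LCS length over two code lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def build_graph(code_sets):
    """Edge length for every unique pair of nodes (assumed distance formula)."""
    weights = {}
    for i, j in combinations(range(len(code_sets)), 2):
        common = lcs_length(code_sets[i], code_sets[j])
        weights[(i, j)] = len(code_sets[i]) + len(code_sets[j]) - 2 * common
    return weights


def minimum_spanning_tree(n, weights):
    """Prim's algorithm on the complete graph; returns tree adjacency lists."""
    tree = {i: [] for i in range(n)}
    visited = {0}
    while len(visited) < n:
        # Pick the shortest edge that leaves the visited set.
        u, v = min(
            ((a, b) for a in visited for b in range(n) if b not in visited),
            key=lambda e: weights[(min(e), max(e))],
        )
        tree[u].append(v)
        tree[v].append(u)
        visited.add(v)
    return tree


def traversal_order(tree, start=0):
    """Depth-first walk of the tree, giving an order for processing nodes."""
    order, stack, seen = [], [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        stack.extend(reversed(tree[node]))
    return order


if __name__ == "__main__":
    samples = ["2018-06-13", "2019-06-11", "Oct 22"]
    code_sets = [to_codes(s) for s in samples]
    tree = minimum_spanning_tree(len(code_sets), build_graph(code_sets))
    print(traversal_order(tree))
```

In this sketch the two date-like strings map to identical code lists, so the edge between them has length zero and the traversal visits them consecutively before the dissimilar third string. A different code scheme or edge length formula would change the resulting order.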