US 12,443,597 B2
Utilizing regular expression embeddings for named entity recognition systems
Jeremy Edward Goodsitt, Champaign, IL (US); Austin Grant Walters, Savoy, IL (US); Reza Farivar, Champaign, IL (US); Mark Louis Watson, Sedona, AZ (US); Anh Truong, Champaign, IL (US); Galen Rafferty, Mahomet, IL (US); and Vincent Pham, Champaign, IL (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Feb. 26, 2024, as Appl. No. 18/587,716.
Application 18/587,716 is a division of application No. 17/022,594, filed on Sep. 16, 2020, granted, now 11,914,583.
Application 17/022,594 is a continuation of application No. 16/549,786, filed on Aug. 23, 2019, granted, now 10,803,057, issued on Oct. 13, 2020.
Prior Publication US 2024/0193158 A1, Jun. 13, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 7/00 (2006.01); G06F 16/242 (2019.01); G06F 16/3329 (2025.01); G06F 16/35 (2019.01); G06N 20/00 (2019.01); G06F 16/36 (2019.01)
CPC G06F 16/243 (2019.01) [G06F 16/3329 (2019.01); G06F 16/35 (2019.01); G06N 20/00 (2019.01); G06F 16/374 (2019.01)]
20 Claims
 
1. A device, comprising:
a memory; and
a processor coupled with the memory to:
determine a frequency of occurrences of a first regex pattern of a regex list in a dataset;
create a vector, the vector specifying a vector position and a regex value for at least the first regex pattern;
adjust the regex value corresponding to the first regex pattern by addition of a predefined value for each occurrence of the first regex pattern in the dataset;
detect a set of false matches between the first regex pattern of the regex list and the dataset;
adjust, based on the set of false matches, the regex list via a machine learning model or a classification model; and
provide the vector to at least one entity recognition system.
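
The counting and vector-building steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it only assumes that the dataset is a list of strings, that the "predefined value" is a fixed increment, and that each regex pattern's vector position is its index in the regex list. The names build_regex_vector, find_false_matches, and is_valid are hypothetical. The validator passed to find_false_matches is a hand-written stand-in for whatever signal identifies a false match; the claim's adjustment of the regex list via a machine learning or classification model is not shown.

```python
import re
from typing import Callable, List


def build_regex_vector(regex_list: List[str], dataset: List[str],
                       increment: float = 1.0) -> List[float]:
    """Return one value per regex pattern, indexed by its position in regex_list.

    Each value starts at 0 and grows by `increment` (the assumed
    "predefined value") for every occurrence of the pattern in the dataset.
    """
    vector = [0.0] * len(regex_list)
    for position, pattern in enumerate(regex_list):
        compiled = re.compile(pattern)
        for text in dataset:
            # Count every non-overlapping occurrence in this dataset entry.
            occurrences = sum(1 for _ in compiled.finditer(text))
            vector[position] += increment * occurrences
    return vector


def find_false_matches(pattern: str, dataset: List[str],
                       is_valid: Callable[[str], bool]) -> List[str]:
    """Collect matched strings that the (hypothetical) validator rejects."""
    compiled = re.compile(pattern)
    return [
        match.group(0)
        for text in dataset
        for match in compiled.finditer(text)
        if not is_valid(match.group(0))
    ]


# Illustrative data only; these patterns and strings are assumptions.
regex_list = [r"\b\d{3}-\d{2}-\d{4}\b", r"\b\d{5}\b"]
dataset = [
    "SSN 123-45-6789 on file",
    "ZIP 61820, alt ZIP 61801",
    "ref 000-00-0000",
]

vector = build_regex_vector(regex_list, dataset)
false_matches = find_false_matches(
    regex_list[0], dataset, is_valid=lambda s: not s.startswith("000")
)
```

In this sketch the resulting vector is what would be handed to an entity recognition system as an extra feature, and the list of false matches is the kind of input a downstream model could use when deciding how to revise the regex list.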