US 11,914,583 B2
Utilizing regular expression embeddings for named entity recognition systems
Jeremy Edward Goodsitt, Champaign, IL (US); Austin Grant Walters, Savoy, IL (US); Reza Farivar, Champaign, IL (US); Mark Louis Watson, Sedona, AZ (US); Anh Truong, Champaign, IL (US); Galen Rafferty, Mahomet, IL (US); and Vincent Pham, Champaign, IL (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Sep. 16, 2020, as Appl. No. 17/022,594.
Application 17/022,594 is a continuation of application No. 16/549,786, filed on Aug. 23, 2019, granted, now 10,803,057.
Prior Publication US 2021/0056099 A1, Feb. 25, 2021
Int. Cl. G06F 7/00 (2006.01); G06F 16/242 (2019.01); G06F 16/332 (2019.01); G06F 16/35 (2019.01); G06N 20/00 (2019.01); G06F 16/36 (2019.01)
CPC G06F 16/243 (2019.01) [G06F 16/3329 (2019.01); G06F 16/35 (2019.01); G06N 20/00 (2019.01); G06F 16/374 (2019.01)]
12 Claims
 
1. A method comprising:
determining, via one or more processors, whether a dataset comprises unstructured text;
determining, via the one or more processors, that at least a portion of the unstructured text corresponds to a regex pattern of a regex list, wherein the regex pattern comprises at least one metacharacter, the at least one metacharacter associated with a non-literal meaning;
replacing, via the one or more processors, the portion of the unstructured text with an encoding that represents the regex pattern to generate a modified dataset; and
providing at least the modified dataset to at least one entity recognition system.
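
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the regex list, the encoding tokens (`<SSN>`, `<EMAIL>`), the unstructured-text heuristic, and every function name below are assumptions introduced only for illustration, and the entity recognition system is represented by an arbitrary callable.

```python
import re
from typing import Callable, Dict, List

# Hypothetical regex list mapping an encoding token to a pattern.
# Each pattern uses metacharacters (\d, \b, {n}, +) with non-literal meaning.
REGEX_LIST: Dict[str, "re.Pattern[str]"] = {
    "<SSN>": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "<EMAIL>": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}


def is_unstructured_text(dataset: object) -> bool:
    """Toy heuristic (an assumption): a list of free-form strings."""
    return isinstance(dataset, list) and all(isinstance(x, str) for x in dataset)


def encode_regex_matches(dataset: List[str]) -> List[str]:
    """Replace every portion matching a listed pattern with its encoding token."""
    modified = []
    for text in dataset:
        for token, pattern in REGEX_LIST.items():
            text = pattern.sub(token, text)
        modified.append(text)
    return modified


def run_pipeline(dataset: object, ner_system: Callable[[List[str]], object]) -> object:
    """Check the dataset, encode regex matches, and pass the result on."""
    if not is_unstructured_text(dataset):
        # Datasets that are not unstructured text are forwarded unchanged.
        return ner_system(dataset)
    return ner_system(encode_regex_matches(dataset))


if __name__ == "__main__":
    sample = ["My SSN is 123-45-6789.", "Mail jane.doe@example.com now"]
    # The identity function stands in for a real entity recognition system.
    print(run_pipeline(sample, lambda rows: rows))
```

In this sketch the downstream system sees a uniform token in place of each matched span, so it does not have to learn the many surface forms of, say, an email address.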