US 11,681,717 B2
Algorithm for the non-exact matching of large datasets
Kaiyu Pan, Johns Creek, GA (US); Richard J. Diekema, Jr., South Portland, ME (US); and Mark G. Kane, Bayside, NY (US)
Assigned to Bottomline Technologies, Inc., Portsmouth, NH (US)
Filed by Bottomline Technologies, Inc., Portsmouth, NH (US)
Filed on Oct. 6, 2022, as Appl. No. 17/961,208.
Application 17/961,208 is a continuation of application No. 17/580,204, filed on Jan. 20, 2022, granted, now 11,475,027, issued on Oct. 18, 2022.
Application 17/580,204 is a continuation of application No. 17/325,911, filed on May 20, 2021, granted, now 11,238,053, issued on Feb. 1, 2022.
Application 17/325,911 is a continuation of application No. 16/455,811, filed on Jun. 28, 2019, granted, now 11,042,555, issued on Jun. 22, 2021.
Prior Publication US 2023/0051025 A1, Feb. 16, 2023
Int. Cl. G06F 17/00 (2019.01); G06F 16/2458 (2019.01); G06F 16/903 (2019.01); G06F 17/18 (2006.01); G06F 40/109 (2020.01); G06F 18/22 (2023.01); G06F 40/166 (2020.01)
CPC G06F 16/2468 (2019.01) [G06F 16/90348 (2019.01); G06F 17/18 (2013.01); G06F 18/22 (2023.01); G06F 40/109 (2020.01); G06F 40/166 (2020.01)]
20 Claims
 
1. A method operating on one or more processors comprising:
receiving a target from a remote computing device;
converting all capital letters in the target to their corresponding lower case letter;
collapsing all repeated letters in the target into a single letter;
removing all accents and punctuation from the target;
converting all geographical locations and corporation designations in the target into common terms;
determining a maximum length by adding a number of characters in the target as transformed with a search distance parameter;
determining a minimum length by subtracting the number of characters in the target as transformed with the search distance parameter;
forming a subset of a search list containing only search list members where a number of search list member characters is between the minimum length and the maximum length;
calculating a score of each subset member against the target as transformed using a Levenshtein distance algorithm; and
sending an indication of whether the target is located in the search list by determining if the score is above a threshold.
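
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made for illustration only: the `COMMON_TERMS` table and its entries, the function names, the default `search_distance` and `threshold` values, and the conversion of the Levenshtein distance into a similarity score (`1 - distance / longer_length`) so that a higher score means a closer match. The claim does not specify that conversion. The sketch also applies the common-term mapping before collapsing repeated letters so that the table keys can be written in plain spelling, and it normalizes the search list members in the same way as the target, which the claim does not state.

```python
import re
import unicodedata

# Hypothetical table of location and corporate designations (illustrative only).
COMMON_TERMS = {
    "incorporated": "inc",
    "corporation": "corp",
    "limited": "ltd",
    "united states": "us",
}


def normalize(text: str) -> str:
    """Apply the transformations listed in the claim to one string."""
    text = text.lower()
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove punctuation (anything that is not a word character or whitespace).
    text = re.sub(r"[^\w\s]", "", text)
    # Map designations to common terms (assumed table, see above).
    for long_form, short_form in COMMON_TERMS.items():
        text = re.sub(rf"\b{re.escape(long_form)}\b", short_form, text)
    # Collapse runs of the same letter into a single letter.
    return re.sub(r"([a-z])\1+", r"\1", text)


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping two rows in memory."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def is_in_list(target, search_list, search_distance=2, threshold=0.8):
    """Return True if any length-filtered member scores above the threshold."""
    t = normalize(target)
    min_len = len(t) - search_distance
    max_len = len(t) + search_distance
    # Only members whose length falls inside the window are scored.
    subset = [m for m in map(normalize, search_list)
              if min_len <= len(m) <= max_len]
    for member in subset:
        # Assumed similarity score in [0, 1]; 1.0 means identical strings.
        score = 1 - levenshtein(t, member) / max(len(t), len(member), 1)
        if score > threshold:
            return True
    return False
```

The length window is what keeps the method practical on large lists: a member whose length differs from the target's by more than the search distance must be more than that many edits away, so it can be skipped without running the more expensive distance calculation.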