US 12,093,321 B2
Methods and systems for similarity searching encrypted data strings
Carolyn Phillips, Chicago, IL (US); and Venkataseshagiri Chintala, Henrico, VA (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Sep. 22, 2021, as Appl. No. 17/481,393.
Prior Publication US 2023/0086508 A1, Mar. 23, 2023
Int. Cl. G06F 16/903 (2019.01); G06F 16/93 (2019.01); G06F 21/60 (2013.01); G06F 21/62 (2013.01)
CPC G06F 16/90344 (2019.01) [G06F 16/93 (2019.01); G06F 21/602 (2013.01); G06F 21/6245 (2013.01)]
20 Claims
 
1. A method of similarity searching encrypted data strings, the method comprising:
receiving a plurality of data strings to be encrypted;
obtaining a set of reference strings;
determining a respective set of edit distances between each data string of the plurality of data strings and the set of reference strings;
converting each respective set of edit distances into a document of tokens;
encrypting the plurality of data strings;
associating each of the documents of tokens with a corresponding encrypted data string of the plurality of encrypted data strings;
storing the plurality of encrypted data strings and the associated plurality of documents of tokens in a memory;
receiving a search request to search the plurality of encrypted data strings;
determining a search set of edit distances between the search request and the set of reference strings;
converting the search set of edit distances into a search document of tokens;
comparing the search document of tokens with the plurality of documents of tokens stored in the memory to determine which of the plurality of documents of tokens are above a predetermined similarity threshold when compared to the search document of tokens; and
returning, as a search result, the data strings that are associated with the documents of tokens of the plurality of documents of tokens that are above the predetermined similarity threshold when compared to the search document of tokens.
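The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions the claim does not specify: the edit distance is taken to be Levenshtein distance, each token is a hypothetical "r<reference index>_d<distance>" string, similarity is measured as Jaccard overlap between token sets, and placeholder_encrypt is a base64 stand-in that provides no confidentiality and would be replaced by a real cipher in practice. All function names are illustrative.

```python
import base64


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def to_token_document(text: str, references: list[str]) -> list[str]:
    """Turn the distances to each reference string into a list of tokens.

    Assumption: one token per reference, encoding the exact distance.
    """
    return [f"r{i}_d{edit_distance(text, ref)}"
            for i, ref in enumerate(references)]


def placeholder_encrypt(text: str) -> bytes:
    # ASSUMPTION: stand-in only. base64 is an encoding, NOT encryption.
    return base64.b64encode(text.encode("utf-8"))


def placeholder_decrypt(blob: bytes) -> str:
    return base64.b64decode(blob).decode("utf-8")


def jaccard(doc_a: list[str], doc_b: list[str]) -> float:
    """Assumed similarity measure: overlap of the two token sets."""
    a, b = set(doc_a), set(doc_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def build_index(data_strings: list[str], references: list[str]):
    """Store each encrypted string next to its plaintext-free token document."""
    return [(placeholder_encrypt(s), to_token_document(s, references))
            for s in data_strings]


def search(index, query: str, references: list[str], threshold: float):
    """Return stored ciphertexts whose token documents exceed the threshold."""
    query_doc = to_token_document(query, references)
    return [blob for blob, doc in index
            if jaccard(query_doc, doc) > threshold]


if __name__ == "__main__":
    refs = ["smith", "jones", "brown"]  # hypothetical reference strings
    index = build_index(["smyth", "zzzzzzzzzzzz"], refs)
    hits = search(index, "smith", refs, threshold=0.4)
    print([placeholder_decrypt(h) for h in hits])
```

In this sketch the comparison touches only the token documents, so the stored strings stay encrypted until a match is returned. Exact-distance tokens are a coarse choice: a real system would likely need more reference strings, or a different token encoding, for the similarity score to discriminate well.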