US 12,353,481 B2
Generating probabilistic data structures for lookup tables in computer memory for multi-token searching
Subramanian Viswanathan, San Ramon, CA (US); Anand Balasubramanian, Bangalore (IN); Milap Shah, Bangalore (IN); and Tianchen Cai, Milpitas, CA (US)
Assigned to OneTrust, LLC, Atlanta, GA (US)
Filed by OneTrust LLC, Atlanta, GA (US)
Filed on Aug. 15, 2023, as Appl. No. 18/450,080.
Prior Publication US 2025/0061154 A1, Feb. 20, 2025
Int. Cl. G06F 16/00 (2019.01); G06F 16/901 (2019.01); G06F 16/906 (2019.01); G06F 16/93 (2019.01)
CPC G06F 16/906 (2019.01) [G06F 16/9017 (2019.01); G06F 16/93 (2019.01)]
20 Claims
 
1. A computer-implemented method comprising:
determining, by processing hardware, a probabilistic data structure comprising a bit vector with sets of bit values mapped to a plurality of items in a lookup list, the plurality of items including one or more multi-token items;
determining, by the processing hardware, a maximum number of tokens of the one or more multi-token items in the lookup list;
determining, by the processing hardware and for a selected token from text content in a digital document, a set of sequential tokens including the selected token based on the maximum number of tokens;
generating, by the processing hardware, classifications for the text content in the digital document of a digital data repository by iteratively:
comparing the set of sequential tokens to the sets of bit values mapped to the plurality of items in the lookup list;
reducing a number of tokens in the set of sequential tokens for a subsequent comparison; and
providing, for display within a graphical user interface of a client device, indications of the classifications of the text content in the digital document relative to the plurality of items in the lookup list.
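
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a simple Bloom filter as the probabilistic data structure, whitespace tokenization with lowercasing, and a window that begins at the selected token. The names `BloomFilter`, `build_filter`, and `classify`, the filter size, and the hash scheme are all hypothetical choices made for illustration. The display step is not modeled.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit vector plus k hash positions per item (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive k bit positions by hashing the item with k different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


def build_filter(lookup_list):
    """Map every lookup item into the filter and record the longest item in tokens."""
    bloom = BloomFilter()
    max_tokens = 0
    for item in lookup_list:
        tokens = item.lower().split()
        bloom.add(" ".join(tokens))
        max_tokens = max(max_tokens, len(tokens))
    return bloom, max_tokens


def classify(text, bloom, max_tokens):
    """Return (token_index, phrase) pairs for phrases that probably match a lookup item."""
    tokens = text.lower().split()
    matches = []
    i = 0
    while i < len(tokens):
        matched = 0
        # Start with the widest window at the selected token, then drop one
        # token per iteration until a comparison succeeds or none remain.
        for n in range(min(max_tokens, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if bloom.might_contain(candidate):
                matches.append((i, candidate))
                matched = n
                break
        # Assumption: skip past a matched phrase rather than rescanning its tokens.
        i += matched if matched else 1
    return matches


lookup = ["New York City", "San Francisco", "Paris"]
bloom, max_tokens = build_filter(lookup)
matches = classify("She moved from New York City to San Francisco last year", bloom, max_tokens)
```

Checking the widest window first means a multi-token item such as "New York City" is tested before any shorter prefix of it, so the window size never needs to exceed the longest item in the lookup list. Because a Bloom filter can report false positives, a production system would likely confirm each hit against the actual list; that confirmation is omitted here.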