US 12,277,148 B2
Entity linking and filtering using efficient search tree and machine learning representations
Sundeep Gullapudi, Singapore (SG); Rajesh Vellore Arumugam, Singapore (SG); Matthias Frank, Heidelberg (DE); and Wei Xia, Singapore (SG)
Assigned to SAP SE, Walldorf (DE)
Filed by SAP SE, Walldorf (DE)
Filed on Apr. 19, 2022, as Appl. No. 17/723,586.
Prior Publication US 2023/0334070 A1, Oct. 19, 2023
Int. Cl. G06F 16/31 (2019.01); G06F 16/33 (2019.01); G06F 16/332 (2019.01); G06F 16/3332 (2025.01); G06F 40/284 (2020.01)
CPC G06F 16/322 (2019.01) [G06F 16/332 (2019.01); G06F 16/3334 (2019.01); G06F 40/284 (2020.01)]
14 Claims
 
1. A computer-implemented method for matching a query item to one or more target items using machine learning (ML) models, the method being executed by one or more processors and comprising:
receiving query item text associated with a query item that is to be matched to one or more target items of a superset of target items, the query item text comprising one or more query item tokens;
prior to using a second ML model during inference to match the query item to one or more target items of the superset of target items, providing a set of target items from the superset of target items by:
for a first query item token of the query item text:
determining, by a first ML model, a first query item token embedding,
comparing the first query item token embedding to target item token embeddings of target item tokens included within a search space to identify at least one target item token that is sufficiently similar to the first query item token, and
associating the first query item token with a revised search space within a tracker, the tracker comprising an array data structure that is initialized with a set of null values, wherein associating the first query item token with the revised search space within the tracker comprises replacing a null value with a search space index indicating where the first query item token was found in the search space, and wherein the revised search space is provided in a queue of search spaces, a length of the queue being defined by a window parameter;
determining a set of matched item tokens based on the tracker, the set of matched item tokens indicating one of a match and a partial match between a query item token and a target item token, and further based on the length between the items being within a window size of the window parameter;
defining the set of target items from the set of matched item tokens, a number of target items in the set of target items being less than a number of target items in the superset of target items;
determining that no target item token represented in the revised search space is sufficiently similar to a second query item token, and in response, comparing a second query item token embedding to target item token embeddings of target item tokens included within an alternative search space present in the queue; and
executing inference to match the query item to one or more target items in the set of target items by processing the query item and target items of the set of target items through the second ML model that outputs inference results, the inference results indicating a match between the query item and at least one target item in the set of target items.
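The filtering steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it (`embed`, `cosine`, `filter_targets`, `match`, `TARGETS`) is hypothetical. It assumes character-bigram counts as a stand-in for the first ML model, cosine similarity with a fixed threshold as the "sufficiently similar" test, a flat token table whose positions serve as search space indices, and token-set Jaccard overlap as a stand-in for the second ML model. The window parameter is used here only to bound the queue length; the claim's additional window-size condition on matched tokens is not modelled.

```python
from collections import Counter, deque
from math import sqrt


def embed(token):
    """Toy stand-in for the first ML model: character-bigram counts."""
    padded = f"#{token.lower()}#"
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine(a, b):
    """Cosine similarity between two bigram-count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def filter_targets(query_text, targets, window=2, threshold=0.6):
    """Return (tracker, candidate item ids) for one query item."""
    query_tokens = query_text.split()
    # Flat token table: position -> (target token, owning target item id).
    table = [(tok, item_id) for item_id, text in targets.items() for tok in text.split()]
    tracker = [None] * len(query_tokens)  # array initialised with null values
    # Queue of search spaces (lists of table positions), bounded by the window.
    queue = deque([list(range(len(table)))], maxlen=window)

    for pos, q_tok in enumerate(query_tokens):
        q_vec = embed(q_tok)
        # Try the newest (revised) space first, then older alternatives still queued.
        for space in reversed(list(queue)):
            scored = [(cosine(q_vec, embed(table[i][0])), i) for i in space]
            hits = [(s, i) for s, i in scored if s >= threshold]
            if not hits:
                continue  # nothing similar here: fall back to an alternative space
            # Replace the null with the index of the best hit (first one on ties).
            tracker[pos] = max(hits, key=lambda h: h[0])[1]
            items = {table[i][1] for _, i in hits}
            # Revised search space: tokens of the items that produced a hit.
            queue.append([i for i in space if table[i][1] in items])
            break

    candidates = {table[i][1] for i in tracker if i is not None}
    return tracker, candidates


def match(query_text, targets, **kwargs):
    """Rank only the filtered candidates with a stand-in second model."""
    _, candidates = filter_targets(query_text, targets, **kwargs)
    q = set(query_text.lower().split())

    def jaccard(text):
        t = set(text.lower().split())
        return len(q & t) / len(q | t)

    return sorted(candidates, key=lambda c: jaccard(targets[c]), reverse=True)


TARGETS = {
    "A": "Acme Corp Berlin",
    "B": "Acme Corporation Paris",
    "C": "Globex Inc London",
}
print(filter_targets("Acme Corp London", TARGETS, window=3))
print(match("Acme Corp London", TARGETS, window=3))
```

In this toy data, the third query token finds no match in the revised search space and is resolved from an older search space only if that space is still in the queue. With a smaller window the older space has already been evicted, and the tracker entry for that token stays null.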