US 12,235,914 B2
Systems and methods for improving search result personalization and contextualization using machine learning models
Jingbo Liu, Princeton, NJ (US); Jun Zhao, Jersey City, NJ (US); Zheng Yan, Short Hills, NJ (US); Weiqi Tong, Long Island City, NY (US); and Nitin Shailesh Baliga, San Jose, CA (US)
Assigned to WALMART APOLLO, LLC, Bentonville, AR (US)
Filed by Walmart Apollo, LLC, Bentonville, AR (US)
Filed on Jan. 30, 2022, as Appl. No. 17/588,334.
Prior Publication US 2023/0244727 A1, Aug. 3, 2023
Int. Cl. G06F 16/9535 (2019.01); G06N 20/20 (2019.01)
CPC G06F 16/9535 (2019.01) [G06N 20/20 (2019.01)]
20 Claims
 
1. A system comprising:
one or more processors; and
one or more non-transitory computer-readable media storing computing instructions that, when executed on the one or more processors, cause the one or more processors to perform operations comprising:
in response to receiving search queries at a search engine, storing search event data and ranking features in one or more databases, wherein the ranking features are stored separately from the search event data;
prior to generating a training dataset, supplementing the search event data with ranking feature values of the ranking features, wherein the ranking feature values were previously utilized to generate previous search results when the search queries were submitted;
generating, using the search event data supplemented with the ranking feature values of the ranking features, the training dataset comprising training event samples;
executing a hybrid labeling procedure that assigns labels to the training event samples based, at least in part, on individual engagement information associated with the training event samples, wherein executing the hybrid labeling procedure comprises:
generating, via a deep learning model, respective relevance scores for the search queries that are tail queries;
applying a first set of labels of the labels to a first portion of the training event samples determined to have positive engagement, the first set of labels being assigned to the first portion of the training event samples based on engagement activity types;
applying a second set of labels of the labels to a second portion of the training event samples that have negative engagement, the second set of labels being assigned to the second portion of the training event samples based on aggregated engagement information for items across global users, wherein the aggregated engagement information is based on a frequency of engagement by the global users on a global scale, and wherein the second set of labels are not eliminated and are assigned lower values than the first set of labels in search results; and
adjusting a subset of the first set of labels and the second set of labels that are associated with the tail queries based on the respective relevance scores associated with the tail queries; and
training a personalized ranking model to rank the search results using the training event samples and the labels.
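
The supplementing and hybrid labeling operations recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (EventSample, supplement, hybrid_label, ENGAGEMENT_LABELS, MAX_NEGATIVE_LABEL), the specific label values, the lookup key of (query, item), and the multiplicative tail-query adjustment are assumptions chosen for illustration. The deep learning model is represented only by a precomputed mapping of tail queries to relevance scores.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical label values per engagement activity type (not taken from the claim).
ENGAGEMENT_LABELS = {"order": 4.0, "add_to_cart": 3.0, "click": 2.0}
# Assumed ceiling for negative-engagement labels; kept below every positive label.
MAX_NEGATIVE_LABEL = 1.0


@dataclass
class EventSample:
    query: str
    item_id: str
    engagement: Optional[str]  # None means the item was shown but not engaged with
    ranking_features: dict = field(default_factory=dict)
    label: float = 0.0


def supplement(events, feature_store):
    """Join separately stored ranking feature values onto search events.

    feature_store maps (query, item_id) to the feature values that were used
    when the original results were generated (assumed key layout).
    """
    return [
        EventSample(
            query=e["query"],
            item_id=e["item_id"],
            engagement=e.get("engagement"),
            ranking_features=feature_store.get((e["query"], e["item_id"]), {}),
        )
        for e in events
    ]


def hybrid_label(samples, global_rate, tail_relevance):
    """Assign labels to training event samples.

    global_rate: item_id -> engagement frequency in [0, 1] across all users.
    tail_relevance: query -> relevance score in [0, 1], present for tail queries only.
    """
    for s in samples:
        if s.engagement in ENGAGEMENT_LABELS:
            # Positive engagement: label depends on the engagement activity type.
            s.label = ENGAGEMENT_LABELS[s.engagement]
        else:
            # Negative engagement: keep the sample, but give it a lower label
            # scaled by aggregated engagement across global users.
            s.label = MAX_NEGATIVE_LABEL * global_rate.get(s.item_id, 0.0)
        if s.query in tail_relevance:
            # Tail query: adjust the label by the model's relevance score.
            s.label *= tail_relevance[s.query]
    return samples
```

In this sketch, samples with negative engagement are retained and receive values in the range 0 to MAX_NEGATIVE_LABEL, so they stay below every positive label before adjustment. Because every sample for a given tail query is scaled by the same factor, the relative order of labels within that query is preserved. The labeled samples, together with their ranking_features, would then serve as input to whatever learning-to-rank trainer is used for the personalized ranking model.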