US 12,147,503 B2
Debiasing training data based upon information seeking behaviors
Donghyun Kim, Mountain View, CA (US); Liuqing Li, Blacksburg, VA (US); Yufeng Ma, Sunnyvale, CA (US); Yu Wang, San Jose, CA (US); Rao Shen, Sunnyvale, CA (US); and Kostas Tsioutsiouliklis, Saratoga, CA (US)
Assigned to Yahoo Assets LLC, New York, NY (US)
Filed by Verizon Media Inc., New York, NY (US)
Filed on Mar. 5, 2021, as Appl. No. 17/192,947.
Prior Publication US 2022/0284242 A1, Sep. 8, 2022
Int. Cl. G06N 20/00 (2019.01); G06F 11/34 (2006.01); G06F 18/21 (2023.01); G06F 18/2113 (2023.01)
CPC G06F 18/2163 (2023.01) [G06F 11/3438 (2013.01); G06F 18/2113 (2023.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A method, comprising:
executing, on a processor of a computing device, instructions that cause the computing device to perform operations, the operations comprising:
segmenting users associated with a set of training data into information seeking behavior groups that each correspond to a degree of information seeking behavior, wherein the information seeking behavior groups comprise (i) a first information seeking behavior group of first users associated with a first amount of information seeking behavior indicative of browsing more than a threshold amount of content items before interacting with a first content item and (ii) a second information seeking behavior group of second users associated with a second amount of information seeking behavior indicative of browsing less than the threshold amount of content items before interacting with a second content item;
estimating position biases for the information seeking behavior groups based upon information seeking behaviors of users within the information seeking behavior groups, wherein the estimating comprises identifying a first position bias for the first information seeking behavior group of the first users associated with the first amount of information seeking behavior and identifying a second position bias for the second information seeking behavior group of the second users associated with the second amount of information seeking behavior;
debiasing the set of training data using the position biases, comprising the first position bias for the first information seeking behavior group of the first users associated with the first amount of information seeking behavior and the second position bias for the second information seeking behavior group of the second users associated with the second amount of information seeking behavior, to generate a debiased set of training data; and
training a model using the debiased set of training data to generate a trained model, wherein the trained model is configured to output relevancy scores, between content items and the users, that account for the position biases estimated for the information seeking behavior groups.
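
The four operations recited in claim 1 (segmenting, estimating, debiasing, training) can be illustrated with a minimal sketch. It is not the patented implementation. Everything below is an assumption made for illustration: the function names, the log format `(user, item, position, click)`, the two group labels, the use of per-group click-through rate relative to the best position as a crude position bias estimate, inverse propensity weighting as the debiasing step, and a per-item weighted average standing in for a trained relevance model. Users who browse exactly the threshold amount are placed in the "low" group here, a case the claim does not address.

```python
from collections import defaultdict

MIN_PROPENSITY = 0.05  # assumed floor so inverse weights stay bounded


def segment_users(browse_depth, threshold):
    """Assign each user to a 'high' or 'low' information seeking group.

    browse_depth maps user -> number of items browsed before interacting.
    """
    return {user: ("high" if depth > threshold else "low")
            for user, depth in browse_depth.items()}


def estimate_position_bias(logs, groups):
    """Estimate one position bias table per group.

    Assumption: bias at a position is that group's click-through rate at the
    position divided by the group's best click-through rate.
    """
    shown = defaultdict(lambda: defaultdict(int))
    clicked = defaultdict(lambda: defaultdict(int))
    for user, _item, position, click in logs:
        group = groups[user]
        shown[group][position] += 1
        clicked[group][position] += click
    bias = {}
    for group, counts in shown.items():
        ctr = {pos: clicked[group][pos] / n for pos, n in counts.items()}
        top = max(ctr.values()) or 1.0  # avoid dividing by zero if no clicks
        bias[group] = {pos: max(rate / top, MIN_PROPENSITY)
                       for pos, rate in ctr.items()}
    return bias


def debias(logs, groups, bias):
    """Reweight each logged example by the inverse of its group's bias."""
    return [(user, item, click / bias[groups[user]][position])
            for user, item, position, click in logs]


def train_model(weighted_examples):
    """Stand-in 'model': average debiased weight per item (unnormalized)."""
    total = defaultdict(float)
    count = defaultdict(int)
    for _user, item, weight in weighted_examples:
        total[item] += weight
        count[item] += 1
    return {item: total[item] / count[item] for item in total}
```

Because the bias table is looked up by the user's group, a click at a low-ranked position by a user who rarely browses that far receives a larger weight than the same click by a user who routinely browses deep, which is the effect of estimating position bias per group rather than globally.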