US 11,694,093 B2
Generation of training data to train a classifier to identify distinct physical user devices in a cross-device context
Christian Perez, Cambridge, MA (US); Eunyee Koh, San Jose, CA (US); Ashley Rosie Weiling Chen, Providence, RI (US); and Ankita Pannu, San Jose, CA (US)
Assigned to Adobe Inc., San Jose, CA (US)
Filed by Adobe Inc., San Jose, CA (US)
Filed on Mar. 14, 2018, as Appl. No. 15/920,934.
Prior Publication US 2019/0287025 A1, Sep. 19, 2019
Int. Cl. G06N 5/04 (2023.01); G06N 3/008 (2023.01)
CPC G06N 5/04 (2013.01) [G06N 3/008 (2013.01)]
19 Claims
 
1. A computer-implemented method to generate training data for training a classifier for identification of physical user devices, the method comprising:
receiving a plurality of browser cookie device records, each of which represents one of a first plurality of user authentications, wherein the browser cookie device records are generated using browser cookie data;
receiving a plurality of persistent identifier device records, each of which represents one of a second plurality of user authentications, wherein the persistent identifier device records are generated using an identifier that is more persistent than the browser cookie data;
identifying a persistent identifier device record pair that includes a first persistent identifier device record having first device information and a second persistent identifier device record having second device information, wherein it is assumed that the first and second device information both identify a single particular device;
identifying a first browser cookie device record pair that includes a first browser cookie device record having the first device information that was included in the first persistent identifier device record and a second browser cookie device record having the second device information that was included in the second persistent identifier device record;
identifying a second browser cookie device record pair that includes two browser cookie device records that, collectively, do not include both the first and second device information;
labeling the first browser cookie device record pair as corresponding to equivalent physical devices;
labeling the second browser cookie device record pair as corresponding to distinct physical devices; and
generating a training dataset that includes the labeled first browser cookie device record pair and the labeled second browser cookie device record pair.
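
The labeling steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The `DeviceRecord` class, its `user_id` and `device_info` fields, the `build_training_pairs` function, and the label strings are all hypothetical names chosen for illustration. The sketch assumes that the persistent identifier device record pairs believed to describe a single device have already been identified and are passed in as input.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class DeviceRecord:
    # Both fields are hypothetical; the claim only requires "device information".
    user_id: str
    device_info: str


def build_training_pairs(cookie_records, persistent_pairs):
    """Label browser cookie record pairs using persistent identifier pairs.

    persistent_pairs: iterable of (DeviceRecord, DeviceRecord) tuples that are
    assumed (as in the claim) to identify a single particular device.
    Returns a list of (record_a, record_b, label) tuples.
    """
    # Unordered pairs of device information assumed to describe one device.
    same_device = {
        frozenset((a.device_info, b.device_info)) for a, b in persistent_pairs
    }

    dataset = []
    for r1, r2 in combinations(cookie_records, 2):
        key = frozenset((r1.device_info, r2.device_info))
        # A cookie pair carrying both pieces of device information from a
        # persistent pair is labeled equivalent. Any other pair is labeled
        # distinct, which is a simplification made by this sketch.
        label = "equivalent" if key in same_device else "distinct"
        dataset.append((r1, r2, label))
    return dataset


if __name__ == "__main__":
    persistent = [(DeviceRecord("u1", "A"), DeviceRecord("u1", "B"))]
    cookies = [
        DeviceRecord("u1", "A"),
        DeviceRecord("u1", "B"),
        DeviceRecord("u1", "C"),
    ]
    for a, b, label in build_training_pairs(cookies, persistent):
        print(a.device_info, b.device_info, label)
```

In this sketch the resulting list plays the role of the training dataset: each entry is a browser cookie device record pair together with its label, which could then be fed to a classifier of the reader's choosing.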