CPC G06N 5/04 (2013.01) [G06N 3/008 (2013.01)] | 19 Claims |
1. A computer-implemented method to generate training data for training a classifier for identification of physical user devices, the method comprising:
receiving a plurality of browser cookie device records, each of which represents one of a first plurality of user authentications, wherein the browser cookie device records are generated using browser cookie data;
receiving a plurality of persistent identifier device records, each of which represents one of a second plurality of user authentications, wherein the persistent identifier device records are generated using an identifier that is more persistent than the browser cookie data;
identifying a persistent identifier device record pair that includes a first persistent identifier device record having first device information and a second persistent identifier device record having second device information, wherein it is assumed that the first and second device information both identify a single particular device;
identifying a first browser cookie device record pair that includes a first browser cookie device record having the first device information that was included in the first persistent identifier device record and a second browser cookie device record having the second device information that was included in the second persistent identifier device record;
identifying a second browser cookie device record pair that includes two browser cookie device records that, collectively, do not include both the first and second device information;
labeling the first browser cookie device record pair as corresponding to equivalent physical devices;
labeling the second browser cookie device record pair as corresponding to distinct physical devices; and
generating a training dataset that includes the labeled first browser cookie device record pair and the labeled second browser cookie device record pair.
|
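The labeling steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The class names, the field names (`record_id`, `device_info`, `persistent_id`), and the label strings are hypothetical. The sketch also makes two assumptions the claim does not state: that two persistent identifier device records are taken to identify a single device when they share the same persistent identifier, and that a browser cookie pair is labeled distinct only when its device information matches no linked pair at all. Pairs whose two records carry identical device information are left unlabeled.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class CookieRecord:
    """Hypothetical browser cookie device record."""
    record_id: str
    device_info: str


@dataclass(frozen=True)
class PersistentRecord:
    """Hypothetical persistent identifier device record."""
    persistent_id: str
    device_info: str


def linked_device_info(persistent_records):
    """Collect unordered pairs of device information assumed to describe one device.

    Assumption (not from the claim): records sharing a persistent_id
    are treated as the same physical device.
    """
    info_by_id = {}
    for rec in persistent_records:
        info_by_id.setdefault(rec.persistent_id, set()).add(rec.device_info)

    linked = set()
    for infos in info_by_id.values():
        for info_a, info_b in combinations(sorted(infos), 2):
            linked.add(frozenset((info_a, info_b)))
    return linked


def build_training_pairs(cookie_records, persistent_records):
    """Label browser cookie record pairs as 'equivalent' or 'distinct'."""
    linked = linked_device_info(persistent_records)
    dataset = []
    for rec_a, rec_b in combinations(cookie_records, 2):
        if rec_a.device_info == rec_b.device_info:
            # Identical device information: left unlabeled in this sketch.
            continue
        key = frozenset((rec_a.device_info, rec_b.device_info))
        # Equivalent when the pair carries both pieces of linked device
        # information; distinct otherwise (sketch-level simplification).
        label = "equivalent" if key in linked else "distinct"
        dataset.append((rec_a.record_id, rec_b.record_id, label))
    return dataset


# Illustrative data only: persistent identifier "p1" links "infoA" and "infoB".
cookies = [
    CookieRecord("c1", "infoA"),
    CookieRecord("c2", "infoB"),
    CookieRecord("c3", "infoC"),
]
persistent = [
    PersistentRecord("p1", "infoA"),
    PersistentRecord("p1", "infoB"),
    PersistentRecord("p2", "infoC"),
]
dataset = build_training_pairs(cookies, persistent)
```

In this example the pair (c1, c2) is labeled equivalent because persistent identifier p1 links its two pieces of device information, while (c1, c3) and (c2, c3) are labeled distinct. The resulting list of labeled pairs plays the role of the training dataset in the final step of the claim.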