US 11,971,872 B2
Generating user attribute verification scores to facilitate improved data validation from scaled data providers
Kathryn Ward Barnitt, Emeryville, CA (US); Nawid Sayed, Dossenheim (DE); Aditya Chaturvedi, Dublin (IE); Theodore Jacob Kornish, San Francisco, CA (US); Yacov Salomon, Danville, CA (US); and Scott Matthew McKinley, Mill Valley, CA (US)
Assigned to Truthset, Inc., San Francisco, CA (US)
Filed by Truthset, Inc., San Francisco, CA (US)
Filed on Sep. 15, 2021, as Appl. No. 17/475,856.
Claims priority of provisional application 63/188,382, filed on May 13, 2021.
Prior Publication US 2022/0374412 A1, Nov. 24, 2022
Int. Cl. G06F 16/23 (2019.01); G06F 3/04847 (2022.01); G06F 16/215 (2019.01); G06F 3/048 (2013.01)
CPC G06F 16/2365 (2019.01) [G06F 3/04847 (2013.01); G06F 16/215 (2019.01)]
20 Claims
 
1. A computer-implemented method comprising:
receiving user trait data from a plurality of data providers, wherein the plurality of data providers explicitly or implicitly collect the user trait data comprising user identifiers and corresponding user attributes;
receiving additional user trait data from one or more validation datasets;
determining a target user attribute associated with the user identifiers in the user trait data for the plurality of data providers;
determining, by a processor, for each data provider of the plurality of data providers, a user attribute accuracy rate associated with each data provider based on comparing the target user attribute for a plurality of user identifiers in the user trait data for each data provider and the target user attribute for the plurality of user identifiers in the one or more validation datasets to determine a frequency of matches for the target user attribute between each data provider and the one or more validation datasets;
generating, by the processor, for each user identifier having the target user attribute from the plurality of data providers, a target user attribute verification score based on:
determining a value of the target user attribute within the user trait data from each data provider of the plurality of data providers;
sampling user attribute accuracy rates from each provider of the plurality of data providers for the value of the target user attribute to generate a user attribute verification score distribution; and
determining an average user attribute verification score from the user attribute verification score distribution for the plurality of data providers for the target user attribute; and
generating, for display on an interactive graphical user interface on a client device, a user attribute verification score that comprises the target user attribute verification score for the target user attribute.
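
The accuracy-rate and verification-score steps recited in claim 1 can be illustrated with a minimal Python sketch. It is one possible reading of the claim language, not the patented implementation. The function names, the dictionary layout, and the number of draws are hypothetical. In particular, drawing each provider's accuracy rate from a Beta distribution built on its match counts is an assumption made here to give the "sampling" step a concrete form; the claim does not name a distribution.

```python
import random
from collections import defaultdict
from statistics import mean


def accuracy_counts(provider_data, validation):
    """Count matches against the validation data, per provider and per value.

    provider_data: {provider: {user_id: value}} for one target attribute.
    validation:    {user_id: value} for the same attribute.
    Returns {provider: {value: (matches, total)}}.
    """
    counts = {}
    for provider, records in provider_data.items():
        per_value = defaultdict(lambda: [0, 0])
        for user_id, value in records.items():
            if user_id not in validation:
                continue  # only users present in the validation set count
            per_value[value][1] += 1
            if validation[user_id] == value:
                per_value[value][0] += 1
        counts[provider] = {v: (m, t) for v, (m, t) in per_value.items()}
    return counts


def accuracy_rate(counts, provider, value):
    """Frequency of matches for a provider and value, or None if unseen."""
    matches, total = counts.get(provider, {}).get(value, (0, 0))
    return matches / total if total else None


def verification_score(user_id, value, provider_data, counts, draws=1000, rng=None):
    """Average of sampled accuracy rates over providers asserting `value`.

    ASSUMPTION: each provider's rate is sampled from
    Beta(matches + 1, misses + 1); the pooled draws stand in for the
    "verification score distribution" and their mean is the score.
    """
    rng = rng or random.Random(0)
    samples = []
    for provider, records in provider_data.items():
        if records.get(user_id) != value:
            continue  # this provider does not assert the value for this user
        matches, total = counts.get(provider, {}).get(value, (0, 0))
        for _ in range(draws):
            samples.append(rng.betavariate(matches + 1, total - matches + 1))
    return mean(samples) if samples else None


# Hypothetical data: two providers reporting a "gender" attribute.
provider_data = {
    "a": {"u1": "F", "u2": "F", "u3": "M", "u9": "F"},
    "b": {"u1": "F", "u2": "M", "u9": "F"},
}
validation = {"u1": "F", "u2": "F", "u3": "F"}

counts = accuracy_counts(provider_data, validation)
score = verification_score("u9", "F", provider_data, counts)
```

Here user "u9" is absent from the validation data, so the score for the value "F" rests entirely on how often providers "a" and "b" were right when they reported "F" for validated users. Keeping the match counts, rather than only the rate, lets the sampled distribution widen for providers with little validated overlap.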