US 11,755,545 B2
Methods and apparatus to estimate audience measurement metrics based on users represented in bloom filter arrays
Michael Sheppard, Holland, MI (US); Jake Ryan Dailey, San Francisco, CA (US); Dongbo Cui, New York, NY (US); Jonathan Sullivan, Hurricane, UT (US); Diane Morovati Lopez, West Hills, CA (US); Christie Nicole Summers, Baltimore, MD (US); and Molly Poppie, Arlington Heights, IL (US)
Assigned to The Nielsen Company (US), LLC, New York, NY (US)
Filed by The Nielsen Company (US), LLC, New York, NY (US)
Filed on Jul. 31, 2020, as Appl. No. 16/945,055.
Prior Publication US 2022/0036390 A1, Feb. 3, 2022
Int. Cl. G06F 16/20 (2019.01); G06F 16/435 (2019.01)
CPC G06F 16/20 (2019.01) [G06F 16/435 (2019.01)]
36 Claims
 
1. An apparatus comprising:
a communications interface to:
receive a first Bloom filter array from a first computer of a first database proprietor, the first Bloom filter array representative of first users who accessed media, the first users registered with the first database proprietor, the first Bloom filter array including a first array of first elements, values of respective ones of the first elements being either a 0 or a 1 based on whether quantities of the first users allocated to the respective ones of the first elements are even or odd; and
receive a second Bloom filter array from the first computer of the first database proprietor, the second Bloom filter array representative of the first users who accessed media, the second Bloom filter array including a second array of second elements, the first users allocated to ones of the first elements of the first array based on a first hash function and allocated to ones of the second elements of the second array based on a second hash function different than the first hash function; and
a Bloom filter array analyzer to:
determine a first count of the first elements with a value of 1;
determine a second count of the second elements with a value of 1; and
estimate a first cardinality for the first Bloom filter array based on an average of the first and second counts, the first cardinality indicative of a total number of the first users who accessed the media.
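
The estimation recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim states only that the cardinality is based on an average of the first and second counts, so the inversion formula used below is an assumption. It follows from the standard result that, when n users are each allocated to one of m elements with probability 1/m, an element holds an odd quantity of users with probability (1 - (1 - 2/m)^n) / 2. The function names, the salted SHA-256 hashing, and the requirement that both arrays have the same length are likewise hypothetical choices made for illustration.

```python
import hashlib
import math


def element_index(user_id: str, salt: str, m: int) -> int:
    """Allocate a user to one of m elements (hypothetical salted SHA-256 hash)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % m


def build_parity_array(user_ids, salt: str, m: int) -> list:
    """Build an array whose elements are 1 when an odd quantity of users
    is allocated to them and 0 when the quantity is even."""
    array = [0] * m
    for user_id in user_ids:
        # Flipping the bit tracks the parity of the allocation count.
        array[element_index(user_id, salt, m)] ^= 1
    return array


def estimate_cardinality(first_array, second_array) -> float:
    """Estimate the number of users from the average count of 1s.

    Assumption: inverts E[count] = (m / 2) * (1 - (1 - 2/m)**n) for n.
    """
    m = len(first_array)
    if m < 3 or len(second_array) != m:
        raise ValueError("arrays must have the same length, at least 3")
    first_count = sum(first_array)    # first elements with a value of 1
    second_count = sum(second_array)  # second elements with a value of 1
    mean_count = (first_count + second_count) / 2
    if mean_count >= m / 2:
        raise ValueError("arrays too saturated to invert the expectation")
    return math.log(1 - 2 * mean_count / m) / math.log(1 - 2 / m)


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(200)]
    first = build_parity_array(users, "hash-1", 4096)   # first hash function
    second = build_parity_array(users, "hash-2", 4096)  # second hash function
    print(round(estimate_cardinality(first, second)))
```

With two 4,096-element arrays built from the same 200 user identifiers, the estimate should land close to 200. Under this assumed formula the estimate degrades as the average count approaches half the array length, where parity carries almost no information about the total.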