US 11,727,416 B2
Methods and apparatus to estimate large scale audience deduplication
Michael R. Sheppard, Holland, MI (US); Damien Forthomme, Seattle, WA (US); and Molly Poppie, Arlington Heights, IL (US)
Assigned to THE NIELSEN COMPANY (US), LLC, New York, NY (US)
Filed by The Nielsen Company (US), LLC, New York, NY (US)
Filed on Nov. 27, 2019, as Appl. No. 16/698,898.
Prior Publication US 2021/0158377 A1, May 27, 2021
Int. Cl. G06Q 10/06 (2023.01); G06Q 10/10 (2023.01); G06Q 30/02 (2023.01); G06Q 30/0201 (2023.01); G06F 16/22 (2019.01); G06F 16/2455 (2019.01); G06F 16/215 (2019.01); H04N 21/81 (2011.01); H04N 21/442 (2011.01)
CPC G06Q 30/0201 (2013.01) [G06F 16/215 (2019.01); G06F 16/2246 (2019.01); G06F 16/24558 (2019.01); H04N 21/44213 (2013.01); H04N 21/812 (2013.01)]
28 Claims
 
1. An apparatus comprising:
interface circuitry to collect, via a network, impression data indicative of accesses to a plurality of media items for a total audience size, the impression data including audience sizes for the plurality of media items, the total audience size including a computer-generated inaccuracy causing a misrepresentation of a deduplicated audience size of the plurality of media items;
association controller circuitry to generate a tree structure association for the total audience size that accessed the plurality of media items, the tree structure association including a first node representative of a first media item accessed by first audience members of the total audience size and a second node representative of a second media item accessed by second audience members of the total audience size;
matrix generator circuitry to improve functionality of a computer by generating a matrix without using processing power to solve a partial derivative equation for the first node or the second node, the matrix generator circuitry to generate the matrix by:
selecting a sum of probabilities value corresponding to the tree structure association, the sum of probabilities value representative of a probability of the first audience members accessing the first media item; and
storing the sum of probabilities value in an element of the matrix; and
commercial solver circuitry to increase an accuracy of the computer to reduce the computer-generated inaccuracy of the total audience size by estimating the deduplicated audience size of the total audience size using the matrix.
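
The data flow recited in claim 1 (tree structure association, matrix of probability values, deduplicated estimate) can be illustrated with a minimal Python sketch. It is not the patented method: the claim does not disclose the solver, so the sketch substitutes a plain independence assumption between media items. The names `Node`, `build_matrix`, `estimate_deduplicated`, and the `universe` parameter are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node: one media item (leaf) or a grouping (parent)."""
    name: str
    audience: float = 0.0
    children: List["Node"] = field(default_factory=list)


def build_matrix(root: Node, universe: float) -> List[List[float]]:
    """Store one probability value per child node in a 1 x n matrix.

    Each element is audience / universe, i.e. the probability that a
    member of the universe accessed that media item (an assumption of
    this sketch, not a definition taken from the claim).
    """
    return [[child.audience / universe for child in root.children]]


def estimate_deduplicated(matrix: List[List[float]], universe: float) -> float:
    """Estimate the deduplicated audience under an independence assumption.

    Independence is a stand-in technique: the probability of accessing
    none of the items is the product of (1 - p_i) over all items.
    """
    miss_all = 1.0
    for p in matrix[0]:
        miss_all *= 1.0 - p
    return universe * (1.0 - miss_all)


# Hypothetical example data: two media items in a universe of 1,000 people.
root = Node("total", children=[Node("item_1", 300.0), Node("item_2", 200.0)])
matrix = build_matrix(root, universe=1000.0)
deduplicated = estimate_deduplicated(matrix, universe=1000.0)
```

With these made-up inputs the estimate falls between the largest single audience (300) and the sum of the audiences (500), which is the range any deduplicated audience size must occupy.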