US 12,235,905 B2
Content display and clustering system
Fei Xiao, San Jose, CA (US); Ronica Jethwa, Mountain View, CA (US); Zidong Wang, San Jose, CA (US); Jing Lu, San Jose, CA (US); Jing Ye, San Jose, CA (US); Nam Vo, San Jose, CA (US); Jose Sanchez, San Jose, CA (US); Abhishek Bambha, Burlingame, CA (US); and Khaldun Aidarabsah, San Jose, CA (US)
Assigned to Roku, Inc., San Jose, CA (US)
Filed by Roku, Inc., San Jose, CA (US)
Filed on Feb. 7, 2024, as Appl. No. 18/435,171.
Application 18/435,171 is a continuation of application No. 17/943,526, filed on Sep. 13, 2022, granted, now 11,941,067.
Prior Publication US 2024/0214630 A1, Jun. 27, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/906 (2019.01); G06F 16/75 (2019.01); H04N 21/433 (2011.01); H04N 21/45 (2011.01)

CPC G06F 16/906 (2019.01) [G06F 16/75 (2019.01); H04N 21/4332 (2013.01); H04N 21/4532 (2013.01)]

20 Claims

1. A computer-implemented method for clustering a plurality of content items of a data set, comprising:

receiving, by at least one computer processor, a request to display the plurality of content items;

identifying the data set comprising the plurality of content items for clustering across a plurality of iterations for each of one or more levels, each level of clustering comprising a different similarity threshold;

performing for each of the one or more levels:

computing a similarity score for each of a plurality of pairs of content items;

identifying a subset of pairs from the plurality of pairs, wherein the similarity score, for each pair from the subset of pairs, exceeds a similarity threshold for a respective level;

clustering the subset of pairs, for each pair from the subset of pairs that exceed the similarity threshold for the respective level, into a clustered subset based on the similarity score; and

repeating the computing the similarity score, the identifying the subset, and the clustering the subset for each of the plurality of iterations for the respective level, for each subsequent iteration at the respective level;

identifying a final clustered subset comprising the clustered subset after each of the plurality of iterations for each of the one or more levels after the performing has been completed; and

outputting the final clustered subset for display, responsive to the request to display the plurality of content items.
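
The steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. The claim does not name a similarity measure, a merge strategy, or a data model, so everything below is an assumption made for illustration: content items are described by tag sets, similarity is the Jaccard index, pairs above the threshold are merged with a union-find, and the names `jaccard`, `cluster_once`, and `cluster_content` are hypothetical. Each level uses its own threshold, and the score, select, and merge steps repeat for a fixed number of iterations per level.

```python
from itertools import combinations


def jaccard(a, b):
    """Similarity of two tag sets (assumed measure; the claim names none)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_once(clusters, threshold):
    """One iteration: score pairs, keep those above the threshold, merge them."""
    # Compute a similarity score for each pair of current clusters.
    scores = {
        (i, j): jaccard(clusters[i]["tags"], clusters[j]["tags"])
        for i, j in combinations(range(len(clusters)), 2)
    }
    # Identify the subset of pairs whose score exceeds this level's threshold.
    selected = [pair for pair, score in scores.items() if score > threshold]

    # Merge the selected pairs with a small union-find (assumed strategy).
    parent = list(range(len(clusters)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, j in selected:
        parent[find(j)] = find(i)

    # Rebuild the cluster list; a merged cluster carries the union of tags.
    merged = {}
    for idx, cluster in enumerate(clusters):
        entry = merged.setdefault(find(idx), {"items": [], "tags": set()})
        entry["items"].extend(cluster["items"])
        entry["tags"] |= cluster["tags"]
    return list(merged.values())


def cluster_content(items, thresholds, iterations=2):
    """Run `iterations` passes at each level, one threshold per level."""
    clusters = [{"items": [name], "tags": set(tags)} for name, tags in items.items()]
    for threshold in thresholds:          # one level per threshold
        for _ in range(iterations):       # repeat score/select/merge
            clusters = cluster_once(clusters, threshold)
    # The final clustered subset, ready to hand to a display layer.
    return [sorted(c["items"]) for c in clusters]


# Hypothetical catalog used only to exercise the sketch.
catalog = {
    "Space Docs": {"space", "documentary"},
    "Mars Mission": {"space", "documentary", "science"},
    "Cooking 101": {"cooking", "howto"},
    "Baking Basics": {"cooking", "howto", "baking"},
    "Star Drama": {"space", "drama"},
}
rows = cluster_content(catalog, thresholds=[0.6, 0.2])
```

In this sketch a stricter first level groups only near-duplicates, and a looser second level lets the resulting clusters absorb more loosely related items. Whether thresholds should tighten or loosen from level to level is a design choice the claim leaves open; it only requires that each level use a different threshold.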