US 12,437,010 B2
Automatic techniques for constructing an evolving interest taxonomy from user-generated content
Jason Brewer, Mountain View, CA (US); Shuo Han, Milpitas, CA (US); Chang Kuang Huang, Cupertino, CA (US); James Li, Mountain View, CA (US); Yiwei Ma, Santa Monica, CA (US); Manish Malik, Cupertino, CA (US); Yinan Na, Mountain View, CA (US); Dan Xie, Mountain View, CA (US); Jinchao Ye, New York, NY (US); Lili Zhang, Redwood City, CA (US); Mingtao Zhang, Los Angeles, CA (US); Yining Zhang, Seattle, WA (US); Hangqi Zhao, Bothell, WA (US); Ding Zhou, Los Altos Hills, CA (US); and Yang Zhou, San Francisco, CA (US)
Assigned to Snap Inc., Santa Monica, CA (US)
Filed by Snap Inc., Santa Monica, CA (US)
Filed on Feb. 8, 2024, as Appl. No. 18/436,945.
Prior Publication US 2025/0258878 A1, Aug. 14, 2025
Int. Cl. G06F 16/953 (2019.01); G06F 16/9532 (2019.01); G06F 16/9535 (2019.01)

CPC G06F 16/9532 (2019.01) [G06F 16/9535 (2019.01)]

18 Claims

1. A computer-implemented method comprising:

obtaining a plurality of content items from a plurality of content delivery sources of an online platform, each content delivery source having respective content items in a different type of content format;

applying distinct preprocessing steps to the respective content items from each content delivery source to extract text from the plurality of content items, wherein the distinct preprocessing steps are specific to the different content format of each content delivery source and include separate preprocessing pipelines for processing different types of content formats, wherein the preprocessing steps comprise source-specific preprocessing tailored to each type of content format and generalized source-agnostic preprocessing;

identifying, from the text extracted from the plurality of content items, a plurality of keywords and key phrases;

constructing an interest graph from the plurality of keywords and key phrases, the interest graph having as nodes, wherein each node is a vector representation of a keyword or key phrase, and having edges connecting the nodes, wherein each of the edges represents a measure of similarity between two connected nodes, wherein the constructing of the interest graph comprises providing the plurality of keywords and key phrases as input to an embedding model that outputs vector representations in a common embedding space based on co-engagement patterns determined from user behavior, wherein the co-engagement patterns are identified by detecting when a user interacts with two of the plurality of content items including one of the keywords or key phrases, such that keywords and key phrases that are similar to one another based on the co-engagement patterns have vector representations that are closer in distance to one another;

preparing a training dataset for a pairwise machine learning model by pairing each image and video content item with one or more relevant keywords or key phrases from the constructed interest graph; and

training a pairwise machine learning model on the prepared training dataset, wherein the training involves minimizing a contrastive loss function that brings closer together embeddings of corresponding image and text pairs from the training dataset, while pushing apart embeddings of non-corresponding image and text pairs.
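
As an illustration only, two steps recited in claim 1 can be sketched in a few lines of Python: counting co-engagement between keywords, and computing a contrastive loss over corresponding image and text embeddings. Nothing in this sketch comes from the patent itself. The function names (`co_engagement_counts`, `contrastive_loss`), the data layout, the use of cosine similarity, the `temperature` parameter, and the symmetric InfoNCE-style formulation are all assumptions chosen for the example; the claim recites only "a contrastive loss function" and does not specify any of these.

```python
import math
from collections import Counter
from itertools import combinations


def co_engagement_counts(user_sessions, item_keywords):
    """Count keyword pairs that are co-engaged by the same user.

    user_sessions: dict of user id -> list of content item ids (hypothetical layout).
    item_keywords: dict of content item id -> set of keywords in that item.
    """
    counts = Counter()
    for items in user_sessions.values():
        # Every unordered pair of distinct items a single user interacted with.
        for a, b in combinations(sorted(set(items)), 2):
            for ka in item_keywords.get(a, ()):
                for kb in item_keywords.get(b, ()):
                    if ka != kb:
                        counts[tuple(sorted((ka, kb)))] += 1
    return counts


def cosine(u, v):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def contrastive_loss(image_embs, text_embs, temperature=0.1):
    """Symmetric InfoNCE-style loss (an assumed formulation).

    image_embs[i] and text_embs[i] form a corresponding pair; every other
    combination in the batch is treated as a non-corresponding pair.
    """
    n = len(image_embs)
    logits = [[cosine(img, txt) / temperature for txt in text_embs]
              for img in image_embs]

    def cross_entropy(rows):
        # Mean of -log softmax(row)[i], with the usual max-shift for stability.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / n

    columns = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy(logits) + cross_entropy(columns))


# Toy data, invented for the example.
sessions = {"u1": ["a", "b"], "u2": ["a", "b", "c"]}
keywords = {"a": {"hiking"}, "b": {"camping"}, "c": {"hiking"}}
pair_counts = co_engagement_counts(sessions, keywords)

images = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = contrastive_loss(images, [[1.0, 0.0], [0.0, 1.0]])
misaligned_loss = contrastive_loss(images, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the loss is lower when each image embedding sits next to its own text embedding than when the pairs are swapped, which is the behavior the final step of the claim describes. The co-engagement counts could in turn serve as a training signal for an embedding model; the sketch stops short of that step.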