US 12,067,779 B1
Contrastive learning of scene representation guided by video similarities
Shixing Chen, Kirkland, WA (US); Xiang Hao, Kenmore, WA (US); Xiaohan Nie, Lynnwood, WA (US); and Muhammad Raffay Hamid, Seattle, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Feb. 9, 2022, as Appl. No. 17/668,014.
Int. Cl. G06V 20/40 (2022.01); G06V 10/774 (2022.01)
CPC G06V 20/48 (2022.01) [G06V 10/774 (2022.01); G06V 20/46 (2022.01)]
20 Claims
1. A computing system comprising:
one or more processors; and
one or more memories having stored therein instructions that, upon execution by the one or more processors, cause the computing system to perform computing operations comprising:
determining, based on one or more similarity information types, a plurality of similar movie pairs, each similar movie pair of the plurality of similar movie pairs including a first respective movie and a second respective movie, wherein the first respective movie and the second respective movie are similar to one another based on the one or more similarity information types, and wherein the one or more similarity information types comprise at least one of movie genre information, movie synopsis information, or movie recommendation information;
determining, for each similar movie pair of the plurality of similar movie pairs, one or more similar scene pairs, each of the one or more similar scene pairs including a respective first scene from the first respective movie and a second respective scene from the second respective movie;
training, using a contrastive learning model that contrasts a plurality of similar scene pairs with a plurality of random scenes, a scene representation encoder, wherein the plurality of similar scene pairs includes the one or more scene pairs for each similar movie pair of the plurality of similar movie pairs; and
determining, using the scene representation encoder, one or more scene features of one or more other scenes of one or more other movies.
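
The operations recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The claim does not say how similarity is scored, how scene pairs are chosen within a movie pair, or which contrastive loss is used, so the sketch assumes Jaccard overlap of genre sets for movie similarity, nearest-neighbor matching by cosine similarity for scene pairs, and an InfoNCE-style loss for the contrast against random scenes. All function names, the threshold, and the temperature are hypothetical. The feature vectors stand in for encoder outputs, and no gradient update of an encoder is shown.

```python
import math
from itertools import combinations


def jaccard(a, b):
    """Overlap between two genre sets (an assumed similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def similar_movie_pairs(genres, threshold=0.5):
    """Pair movies whose genre overlap meets a hypothetical threshold."""
    return [(m1, m2) for m1, m2 in combinations(sorted(genres), 2)
            if jaccard(genres[m1], genres[m2]) >= threshold]


def similar_scene_pairs(scenes_a, scenes_b):
    """For each scene of the first movie, pick the closest scene of the second."""
    pairs = []
    for i, feat in enumerate(scenes_a):
        j = max(range(len(scenes_b)), key=lambda k: cosine(feat, scenes_b[k]))
        pairs.append((i, j))
    return pairs


def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the query is closer to its paired scene
    than to the random scenes."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Toy data: genre sets per movie and two-dimensional scene features per movie.
genres = {"A": {"action", "thriller"},
          "B": {"action", "thriller", "crime"},
          "C": {"romance"}}
scenes = {"A": [[1.0, 0.0], [0.0, 1.0]],
          "B": [[0.1, 0.9], [0.9, 0.1]],
          "C": [[-1.0, -1.0]]}

movie_pairs = similar_movie_pairs(genres)
scene_pairs = similar_scene_pairs(scenes["A"], scenes["B"])
losses = [info_nce(scenes["A"][i], scenes["B"][j], scenes["C"])
          for i, j in scene_pairs]
```

In a real training loop, the loss would be computed on the encoder's outputs for raw scenes and minimized by gradient descent. The trained encoder would then be applied to scenes of other movies, as in the final step of the claim.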