US 12,321,385 B1
Automated identification of labels for videos with a labeling service within a service provider network
Alex Williams, Johnson City, TN (US); Weifeng Chen, Redmond, WA (US); Patrick Guy Haffner, Atlantic Highlands, NJ (US); Matthew Alan Lease, Austin, TX (US); Li Erran Li, Palo Alto, CA (US); and Kumar Hemachandra Chellapilla, Mountain View, CA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 31, 2023, as Appl. No. 18/129,655.
Int. Cl. G06F 16/75 (2019.01); G06F 16/71 (2019.01); G06F 16/738 (2019.01); G06F 16/78 (2019.01); G06V 20/40 (2022.01)
CPC G06F 16/75 (2019.01) [G06F 16/71 (2019.01); G06F 16/738 (2019.01); G06F 16/7867 (2019.01); G06V 20/41 (2022.01)]
15 Claims
1. A method comprising:
receiving, from a client device at a labeling service of a service provider network, (i) a dataset of unlabeled videos and (ii) category tags related to content of the unlabeled videos for labeling the unlabeled videos;
analyzing, by a video modeling engine of the labeling service, the dataset to provide a set of ranked unlabeled videos from the dataset, wherein the set of ranked unlabeled videos are ranked according to confidence scores with respect to at least one tag;
based on the confidence scores, labeling at least some of the set of ranked unlabeled videos with the at least one tag to provide a dataset of labeled videos, wherein (i) a first ranked unlabeled video having a first confidence score at or above a first threshold is automatically labeled with the at least one tag, (ii) a second ranked unlabeled video having a second confidence score at or above a second threshold that is less than the first threshold is presented to an annotator for manual labeling with the at least one tag, and (iii) a third ranked unlabeled video having a third confidence score below the second threshold is presented to the annotator for manual labeling with a tag from the category tags;
verifying at least some labeled videos of the dataset of labeled videos with respect to the at least one tag to provide a verified dataset of labeled videos; and
storing the verified dataset of labeled videos in a database;
selecting, by the video modeling engine, a group-view mode interface configured to present multiple videos simultaneously to the annotator on a display;
presenting on the display, to the annotator by the video modeling engine using the group-view mode interface, a group of ranked unlabeled videos having confidence scores at or above the second threshold and below the first threshold; and
labeling, by the annotator interacting with the group-view mode interface, one or more of the group of ranked unlabeled videos with the at least one tag.
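The two-threshold routing recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `ScoredVideo`, `route_by_confidence`, and `group_view_batches`, the route keys, and the fixed batch size are all hypothetical, introduced only for illustration; the claim does not say how confidence scores are computed or how many videos a group view shows at once.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScoredVideo:
    """Hypothetical record: a video and its model confidence for one tag."""
    video_id: str
    confidence: float


def route_by_confidence(
    videos: List[ScoredVideo],
    first_threshold: float,
    second_threshold: float,
) -> Dict[str, List[ScoredVideo]]:
    """Rank videos by confidence and split them into the three claimed cases.

    Route names are assumptions made for this sketch:
      auto_label         -> score >= first_threshold (labeled automatically)
      manual_confirm_tag -> second_threshold <= score < first_threshold
                            (annotator labels with the proposed tag)
      manual_choose_tag  -> score < second_threshold
                            (annotator picks a tag from the category tags)
    """
    if second_threshold >= first_threshold:
        raise ValueError("second_threshold must be less than first_threshold")

    routes: Dict[str, List[ScoredVideo]] = {
        "auto_label": [],
        "manual_confirm_tag": [],
        "manual_choose_tag": [],
    }
    # Highest-confidence videos first, matching the ranked set in the claim.
    for video in sorted(videos, key=lambda v: v.confidence, reverse=True):
        if video.confidence >= first_threshold:
            routes["auto_label"].append(video)
        elif video.confidence >= second_threshold:
            routes["manual_confirm_tag"].append(video)
        else:
            routes["manual_choose_tag"].append(video)
    return routes


def group_view_batches(
    videos: List[ScoredVideo], batch_size: int
) -> List[List[ScoredVideo]]:
    """Chunk the middle band into groups shown together in a group view.

    The batch size is an assumption; the claim only requires that multiple
    videos are presented simultaneously.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [videos[i:i + batch_size] for i in range(0, len(videos), batch_size)]
```

In this sketch, only the `manual_confirm_tag` band would be passed to `group_view_batches`, mirroring the claim's limitation that the group-view interface presents videos scoring at or above the second threshold and below the first.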