CPC H04N 21/8549 (2013.01) [G11B 27/19 (2013.01); H04N 21/233 (2013.01); H04N 21/23418 (2013.01); H04N 21/812 (2013.01); H04N 21/8456 (2013.01)] | 20 Claims |
1. A system for computer vision analysis of a media item, comprising:
a computer processor;
a scene break detection service executing on the computer processor and comprising functionality to:
receive a request for scene break detection on the media item;
perform audio break detection on an audio component of the media item to obtain a set of audio break timestamps corresponding to aurally similar segments of the audio component;
identify a set of video break timestamps, each corresponding to at least one frame of a video component of the media item;
identify a set of candidate scene break timestamps corresponding to instances of the set of audio break timestamps and the set of video break timestamps within a predefined proximity;
execute a computer vision scoring model for each candidate scene break timestamp of the set of candidate scene break timestamps by:
identifying a first subset of the set of contiguous shots preceding the candidate scene break timestamp and a second subset of the set of contiguous shots succeeding the candidate scene break timestamp;
calculating a score for the candidate scene break timestamp representing a visual distance between the first subset of contiguous shots and the second subset of contiguous shots; and
select, based at least on the score of each of the set of candidate scene break timestamps, a final set of scene break timestamps for performing a media action.
|
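The candidate-matching, scoring, and selection steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and several details are assumptions made only for illustration: the candidate timestamp is taken to be the video break timestamp, each shot is a dictionary with hypothetical `start`, `end`, and `feat` (visual feature vector) fields, the "visual distance" is the Euclidean distance between the mean feature vectors of the `k` shots on either side of the candidate, and the final selection is a simple score threshold. The function names and the `proximity`, `k`, and `threshold` parameters are likewise hypothetical.

```python
from bisect import bisect_left
from math import sqrt


def candidate_breaks(audio_ts, video_ts, proximity=0.5):
    """Return video break timestamps lying within `proximity` seconds
    of at least one audio break timestamp (assumed matching rule)."""
    audio_sorted = sorted(audio_ts)
    candidates = []
    for v in sorted(video_ts):
        i = bisect_left(audio_sorted, v)
        # Only the audio breaks immediately before and after v can be nearest.
        neighbors = audio_sorted[max(i - 1, 0): i + 1]
        if any(abs(v - a) <= proximity for a in neighbors):
            candidates.append(v)
    return candidates


def mean_vector(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def score_candidate(ts, shots, k=2):
    """Euclidean distance between the mean features of up to k shots
    ending at or before `ts` and up to k shots starting at or after it.
    `shots` is assumed to be sorted by start time."""
    before = [s for s in shots if s["end"] <= ts][-k:]
    after = [s for s in shots if s["start"] >= ts][:k]
    if not before or not after:
        return 0.0  # no shots on one side, so no distance to measure
    mb = mean_vector([s["feat"] for s in before])
    ma = mean_vector([s["feat"] for s in after])
    return sqrt(sum((x - y) ** 2 for x, y in zip(mb, ma)))


def select_scene_breaks(candidates, shots, threshold=1.0, k=2):
    """Keep candidates whose score meets the (assumed) threshold."""
    return [ts for ts in candidates
            if score_candidate(ts, shots, k) >= threshold]


if __name__ == "__main__":
    # Toy data: two visually similar shots, then two shots that differ from them.
    shots = [
        {"start": 0.0, "end": 4.0, "feat": [1.0, 0.0]},
        {"start": 4.0, "end": 9.0, "feat": [0.9, 0.1]},
        {"start": 9.0, "end": 15.0, "feat": [0.0, 1.0]},
        {"start": 15.0, "end": 20.0, "feat": [0.1, 0.9]},
    ]
    audio = [4.2, 9.1]
    video = [4.0, 9.0, 15.0]
    cands = candidate_breaks(audio, video)
    print(cands)
    print(select_scene_breaks(cands, shots))
```

In this toy input, the video break at 15.0 has no audio break within the proximity window and is therefore never scored, while the break at 4.0 is scored but falls below the threshold because the shots around it are visually similar. A production system would presumably derive the feature vectors from a learned vision model rather than hand-written numbers.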