CPC G06F 16/435 (2019.01) [G06F 16/41 (2019.01); G06F 16/438 (2019.01); G06F 16/48 (2019.01); G06V 10/46 (2022.01); G06V 10/56 (2022.01); G06V 20/49 (2022.01); G06V 30/191 (2022.01); G10L 15/26 (2013.01); G10L 15/30 (2013.01)] | 22 Claims |
1. A computer-implemented method, comprising:
receiving audiovisual content comprising a plurality of video frames and an audio recording;
determining, for a first frame of the plurality of video frames, a first score based at least in part on a set of visual characteristics of the first frame;
determining whether a difference between the first score and a second score for a second frame of the plurality of video frames is above a threshold;
based at least on a determination that the difference, between the first score and the second score, is above the threshold:
segmenting the plurality of video frames into a plurality of scenes, a first scene of the plurality of scenes comprising a particular subset of the plurality of video frames, the particular subset comprising a plurality of video frames;
selecting a third frame of the first scene as a representative frame for the particular subset of video frames to extract one or more textual characters, wherein the one or more textual characters are present in two or more video frames of the particular subset of video frames in the first scene;
identifying, in the third frame, the one or more textual characters associated with the first scene for storing into a searchable database;
storing the identified one or more textual characters into the searchable database in association with the third frame and in association with a remainder of the video frames of the subset of the first scene; and
indexing at least one of the stored one or more textual characters to create an index.
|
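The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything named in it is an assumption made for illustration: the `Frame` type, the mean-intensity `visual_score`, the middle-frame choice in `pick_representative`, the `stub_ocr` stand-in for a real text recognizer, and the in-memory dictionaries used in place of a searchable database. The claim itself does not specify which visual characteristics feed the score, how the representative frame is chosen, or how the index is structured.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Frame:
    """Toy frame (hypothetical): grayscale pixel values plus on-screen text."""
    pixels: list[int]
    overlay_text: str = ""  # stands in for text a real OCR engine would find


def visual_score(frame: Frame) -> float:
    """Assumed visual characteristic: mean pixel intensity."""
    return sum(frame.pixels) / len(frame.pixels)


def segment_scenes(frames: list[Frame], threshold: float) -> list[list[int]]:
    """Start a new scene when consecutive frame scores differ by more than threshold."""
    if not frames:
        return []
    scenes = [[0]]
    for i in range(1, len(frames)):
        diff = abs(visual_score(frames[i]) - visual_score(frames[i - 1]))
        if diff > threshold:
            scenes.append([i])
        else:
            scenes[-1].append(i)
    return scenes


def pick_representative(scene: list[int]) -> int:
    """Assumed selection rule: take the middle frame of the scene."""
    return scene[len(scene) // 2]


def stub_ocr(frame: Frame) -> str:
    """Placeholder for text recognition; simply returns the toy overlay text."""
    return frame.overlay_text


def build_index(frames: list[Frame], threshold: float):
    """Return (database, index) for the given frames.

    database maps every frame number to the text recognized in its scene's
    representative frame; index maps each lowercase word to frame numbers.
    """
    database: dict[int, dict] = {}
    index: defaultdict[str, set[int]] = defaultdict(set)
    for scene in segment_scenes(frames, threshold):
        rep = pick_representative(scene)
        text = stub_ocr(frames[rep])  # recognition runs once per scene
        for i in scene:
            # Store the text for the representative frame and the remaining frames.
            database[i] = {"text": text, "representative": rep}
        for word in text.lower().split():
            index[word].update(scene)
    return database, dict(index)


# Toy input: three dark frames showing "Intro", then two bright frames.
frames = [
    Frame([10, 10], "Intro"),
    Frame([12, 12], "Intro"),
    Frame([11, 11], "Intro"),
    Frame([200, 200], "Quarterly Results"),
    Frame([202, 202], "Quarterly Results"),
]
database, index = build_index(frames, threshold=50.0)
print(sorted(index["quarterly"]))
```

In this sketch, text recognition runs on one frame per scene and the result is attached to every frame of that scene, which mirrors the claim's storing step: a search hit on a word resolves to all frames of the scene rather than only the frame that was actually read.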