US 11,947,586 B2
Video processing optimization and content searching
Wenchao Sun, Saratoga, CA (US); Dru Kingston Borden, Boulder, CO (US); Tsz-Yam Lau, Hayward, CA (US); and Shi-Rong Chang, San Jose, CA (US)
Assigned to Oracle International Corporation, Redwood Shores, CA (US)
Filed by Oracle International Corporation, Redwood Shores, CA (US)
Filed on Nov. 22, 2021, as Appl. No. 17/532,686.
Claims priority of provisional application 63/216,454, filed on Jun. 29, 2021.
Prior Publication US 2022/0414138 A1, Dec. 29, 2022
Int. Cl. G06F 16/00 (2019.01); G06F 16/41 (2019.01); G06F 16/435 (2019.01); G06F 16/438 (2019.01); G06F 16/48 (2019.01); G06V 10/46 (2022.01); G06V 10/56 (2022.01); G06V 20/40 (2022.01); G06V 30/19 (2022.01); G10L 15/26 (2006.01); G10L 15/30 (2013.01)

CPC G06F 16/435 (2019.01) [G06F 16/41 (2019.01); G06F 16/438 (2019.01); G06F 16/48 (2019.01); G06V 10/46 (2022.01); G06V 10/56 (2022.01); G06V 20/49 (2022.01); G06V 30/191 (2022.01); G10L 15/26 (2013.01); G10L 15/30 (2013.01)]

22 Claims

1. A computer-implemented method, comprising:

receiving audiovisual content comprising a plurality of video frames and an audio recording;

determining, for a first frame of the plurality of video frames, a first score based at least in part on a set of visual characteristics of the frame;

determining whether a difference between the first score and a second score for a second frame of the plurality of video frames is above a threshold;

based at least on a determination that the difference, between the first score and the second score, is above a threshold:

segmenting the plurality of video frames into a plurality of scenes, a first scene of the plurality of scenes comprising a particular subset of the plurality of video frames, the particular subset comprising a plurality of video frames;

selecting a third frame of the first scene as a representative frame for the particular subset of video frames to extract one or more textual characters, wherein the one or more textual characters are present in two or more video frames of the particular subset of video frames in the first scene;

identifying, in the third frame, the one or more textual characters associated with the first scene for storing into a searchable database;

storing the identified one or more textual characters into the searchable database in association with the third frame and in association with a remainder of the video frames of the subset of the first scene; and

indexing at least one of the stored one or more textual characters to create an index.
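
For illustration only, the steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the score is assumed to be mean pixel intensity, the representative frame is assumed to be the middle frame of a scene, and the names `frame_score`, `segment_scenes`, `representative`, `build_index`, and `stub_ocr` are hypothetical. The OCR step is a stand-in that reads a prepared `text` field; a real system would call a character recognition engine at that point.

```python
from collections import defaultdict

def frame_score(frame):
    """Hypothetical visual score: mean intensity of a grayscale frame."""
    pixels = [p for row in frame["pixels"] for p in row]
    return sum(pixels) / len(pixels)

def segment_scenes(frames, threshold):
    """Start a new scene when consecutive frame scores differ by more than threshold."""
    if not frames:
        return []
    scenes = [[0]]
    prev = frame_score(frames[0])
    for i in range(1, len(frames)):
        cur = frame_score(frames[i])
        if abs(cur - prev) > threshold:
            scenes.append([i])      # score jump: begin a new scene
        else:
            scenes[-1].append(i)    # same scene continues
        prev = cur
    return scenes

def representative(scene):
    """Assumption: use the middle frame of the scene as its representative."""
    return scene[len(scene) // 2]

def build_index(frames, threshold, ocr):
    """Run OCR once per scene, store the text for every frame, and index tokens."""
    database = {}                   # frame index -> stored record
    index = defaultdict(set)        # lowercase token -> scene ids containing it
    for scene_id, scene in enumerate(segment_scenes(frames, threshold)):
        rep = representative(scene)
        text = ocr(frames[rep])     # only the representative frame is processed
        for idx in scene:           # associate text with the rest of the scene too
            database[idx] = {"scene": scene_id, "representative": rep, "text": text}
        for token in text.lower().split():
            index[token].add(scene_id)
    return database, index

def stub_ocr(frame):
    """Stand-in for a real OCR engine: returns text prepared with the toy frame."""
    return frame["text"]

def make_frame(level, text):
    """Build a toy 2x2 grayscale frame with uniform intensity."""
    return {"pixels": [[level, level], [level, level]], "text": text}

# Toy input: three dark frames showing one slide, then two bright frames.
frames = [make_frame(10, "Intro")] * 3 + [make_frame(200, "Results table")] * 2
database, index = build_index(frames, threshold=50, ocr=stub_ocr)
```

Running OCR on one frame per scene, rather than on every frame, is the point of the segmentation step: the recognized text is stored once and associated with the remaining frames of the same scene, so a text query can resolve to any frame in that scene.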