US 12,450,928 B2
Resolution-based extraction of textual content from video of a communication session
Renjie Tao, Sunnyvale, CA (US)
Assigned to Zoom Communications, Inc., San Jose, CA (US)
Filed by Zoom Communications, Inc., San Jose, CA (US)
Filed on Jun. 4, 2022, as Appl. No. 17/832,640.
Prior Publication US 2023/0394858 A1, Dec. 7, 2023
Int. Cl. G06V 30/148 (2022.01); G06V 20/40 (2022.01); G06V 30/146 (2022.01)
CPC G06V 30/153 (2022.01) [G06V 20/46 (2022.01); G06V 20/49 (2022.01); G06V 30/147 (2022.01)]
20 Claims
 
1. A method, comprising:
receiving video content of a communication session comprising a plurality of participants;
extracting high-resolution versions and low-resolution versions of frames from the video content;
classifying the low-resolution frames of the video content;
identifying one or more low-resolution distinguishing frames comprising text;
for each low-resolution distinguishing frame comprising text:
detecting a title within the frame,
cropping a title area with the title within the frame, and
extracting, via optical character recognition (OCR), the title from the cropped title area of the high-resolution version of the frame;
extracting, via OCR, textual content from the high-resolution versions and the low-resolution distinguishing frames comprising text; and
transmitting, to one or more client devices, the extracted textual content and the extracted titles.
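The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`downsample`, `scale_box`, `crop`, `extract_text`, `transmit`, `FrameText`, and the `toy_*` stand-ins) is hypothetical, and frames are modeled as lists of strings so the example runs without an imaging or OCR library. Three assumptions go beyond the claim text: a frame counts as "distinguishing" when its low-resolution version differs from that of the preceding frame, the title is detected on the low-resolution version and its box is scaled up before cropping the high-resolution version, and body text is read from the high-resolution version of each distinguishing frame.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

Frame = List[str]                # toy frame: one string per pixel row
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


@dataclass
class FrameText:
    index: int  # position of the frame in the input sequence
    title: str  # OCR result for the cropped title area
    text: str   # OCR result for the whole high-resolution frame


def downsample(frame: Frame, factor: int) -> Frame:
    """Low-resolution copy: keep every `factor`-th row and column."""
    return [row[::factor] for row in frame[::factor]]


def scale_box(box: Box, factor: int) -> Box:
    """Map a box found on the low-resolution frame to high-resolution coordinates."""
    top, left, bottom, right = box
    return (top * factor, left * factor, bottom * factor, right * factor)


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the rows and columns covered by `box` out of `frame`."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]


def extract_text(
    frames: Sequence[Frame],
    factor: int,
    classify: Callable[[Frame], str],
    detect_title: Callable[[Frame], Box],
    ocr: Callable[[Frame], str],
) -> List[FrameText]:
    """Classify cheap low-res frames, then run OCR only on selected high-res ones."""
    results: List[FrameText] = []
    previous_low: Optional[Frame] = None
    for index, high in enumerate(frames):
        low = downsample(high, factor)
        # Assumption: "distinguishing" means the low-res frame changed.
        changed = low != previous_low
        previous_low = low
        if not changed or classify(low) != "text":
            continue
        title_area = crop(high, scale_box(detect_title(low), factor))
        results.append(FrameText(index, ocr(title_area), ocr(high)))
    return results


def transmit(results: List[FrameText],
             clients: Iterable[Callable[[List[FrameText]], None]]) -> None:
    """Hand the extracted titles and text to each client callback."""
    for send in clients:
        send(results)


# Toy stand-ins for the classifier, title detector, and OCR engine.
def toy_classify(low: Frame) -> str:
    return "text" if any(row.strip() for row in low) else "other"


def toy_detect_title(low: Frame) -> Box:
    return (0, 0, 1, len(low[0]))  # treat the first low-res row as the title


def toy_ocr(image: Frame) -> str:
    return " ".join(row.strip() for row in image if row.strip())


blank = [" " * 8] * 4
slide = ["AGENDA  ", "        ", "item one", "item two"]
inbox: List[List[FrameText]] = []
found = extract_text([blank, slide, slide], 2, toy_classify, toy_detect_title, toy_ocr)
transmit(found, [inbox.append])
```

In this sketch, classification and title detection only ever see the downsampled frames, while the OCR callback only sees high-resolution pixels. A real system would substitute a trained classifier, a layout detector, and an OCR engine for the three `toy_*` callbacks.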