US 12,450,928 B2
Resolution-based extraction of textual content from video of a communication session
Renjie Tao, Sunnyvale, CA (US)
Assigned to Zoom Communications, Inc., San Jose, CA (US)
Filed by Zoom Communications, Inc., San Jose, CA (US)
Filed on Jun. 4, 2022, as Appl. No. 17/832,640.
Prior Publication US 2023/0394858 A1, Dec. 7, 2023
Int. Cl. G06V 30/148 (2022.01); G06V 20/40 (2022.01); G06V 30/146 (2022.01)

CPC G06V 30/153 (2022.01) [G06V 20/46 (2022.01); G06V 20/49 (2022.01); G06V 30/147 (2022.01)]

20 Claims

1. A method, comprising:

receiving video content of a communication session comprising a plurality of participants;

extracting high-resolution versions and low-resolution versions of frames from the video content;

classifying the low-resolution frames of the video content;

identifying one or more low-resolution distinguishing frames comprising text;

for each low-resolution distinguishing frame comprising text:

detecting a title within the frame,

cropping a title area with the title within the frame, and

extracting, via optical character recognition (OCR), the title from the cropped title area of the high-resolution version of the frame;

extracting, via OCR, textual content from the high-resolution versions and the low-resolution distinguishing frames comprising text; and

transmitting, to one or more client devices, the extracted textual content and the extracted titles.
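
The steps of claim 1 can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the patented implementation: frames are modeled as nested lists of integers, and `classify`, `detect_title`, `ocr`, and `send` are hypothetical callables supplied by the caller, standing in for a frame classifier, a title detector, an OCR engine, and the transmission to client devices. Treating a "distinguishing" frame as one whose low-resolution version differs from the preceding frame is also an assumption, as is running the body-text OCR on the high-resolution version only.

```python
from typing import Callable, Dict, List, Optional, Tuple

Image = List[List[int]]          # hypothetical stand-in for a decoded video frame
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in low-res pixels


def downscale(image: Image, factor: int) -> Image:
    """Produce the low-resolution version by keeping every `factor`-th pixel."""
    return [row[::factor] for row in image[::factor]]


def crop_high_res(high: Image, low_box: Box, factor: int) -> Image:
    """Map a box found on the low-res frame onto the high-res frame and crop it."""
    left, top, right, bottom = (v * factor for v in low_box)
    return [row[left:right] for row in high[top:bottom]]


def extract_from_session(
    high_frames: List[Image],
    factor: int,
    classify: Callable[[Image], str],             # hypothetical: returns "text" or another label
    detect_title: Callable[[Image], Optional[Box]],  # hypothetical: title box or None
    ocr: Callable[[Image], str],                  # hypothetical OCR engine
    send: Callable[[List[Dict[str, Optional[str]]]], None],  # hypothetical transport
) -> List[Dict[str, Optional[str]]]:
    results: List[Dict[str, Optional[str]]] = []
    previous_low: Optional[Image] = None
    for high in high_frames:
        low = downscale(high, factor)
        # Assumption: a frame is "distinguishing" if it differs from the previous one.
        is_new = low != previous_low
        previous_low = low
        # Classification runs on the cheap low-resolution frame only.
        if not is_new or classify(low) != "text":
            continue
        # The title is located on the low-res frame, then read from the high-res crop.
        box = detect_title(low)
        title = ocr(crop_high_res(high, box, factor)) if box is not None else None
        results.append({"title": title, "text": ocr(high)})
    send(results)  # stands in for transmitting to one or more client devices
    return results
```

The design point the sketch tries to capture is the split of work by resolution: classification and title detection operate on the low-resolution frames, while OCR reads from the high-resolution versions, where characters are more legible.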