US 12,342,102 B2
Systems and methods for managing captions
Jae Woo Chang, Cupertino, CA (US); Elizabeth C. Cranfill, San Francisco, CA (US); Pani Page, Las Vegas, NV (US); Christoper J. Romney, San Jose, CA (US); and Marcel Van Os, Santa Cruz, CA (US)
Assigned to Apple Inc., Cupertino, CA (US)
Filed by Apple Inc., Cupertino, CA (US)
Filed on Nov. 16, 2022, as Appl. No. 17/988,571.
Claims priority of provisional application 63/343,075, filed on May 17, 2022.
Claims priority of provisional application 63/281,373, filed on Nov. 19, 2021.
Prior Publication US 2023/0164296 A1, May 25, 2023
Int. Cl. H04N 7/15 (2006.01); G06F 3/0485 (2022.01); G06F 3/0486 (2013.01); G06F 3/0488 (2022.01); G06V 20/62 (2022.01); G10L 15/26 (2006.01); H04N 7/088 (2006.01); H04N 7/14 (2006.01)

CPC H04N 7/152 (2013.01) [G06F 3/0485 (2013.01); G06F 3/0486 (2013.01); G06F 3/0488 (2013.01); G06V 20/635 (2022.01); G10L 15/26 (2013.01); H04N 7/0885 (2013.01); H04N 7/147 (2013.01)]

54 Claims

1. A computer system configured to communicate with a display generation component and one or more input devices, comprising:

one or more processors; and

memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for:

displaying, via the display generation component, a live communication user interface, the live communication user interface corresponding to a live communication session, the live communication user interface including:

one or more representations of one or more participants of the live communication session, wherein a first representation of the one or more representations of the one or more participants is displayed at a location and at a size in the live communication user interface; and

a first caption in a first area of the live communication user interface, the first caption corresponding to a first portion of audio data of the live communication session, wherein the first caption is displayed without displaying a second caption corresponding to a second portion of audio data of the live communication session that is different from the first portion of audio data of the live communication session;

while displaying the live communication user interface with the first caption in the first area of the live communication user interface, detecting, from a local user of the computer system via the one or more input devices, an input that corresponds to a request to display expanded caption information; and

in response to detecting the input that corresponds to a request to display expanded caption information:

displaying, via the display generation component, the second caption corresponding to the second portion of audio data of the live communication session, wherein the second caption is displayed at a second area of the live communication user interface; and

modifying, via the display generation component, the location of the first representation and/or the size of the first representation in the live communication user interface.
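
For illustration only, and not as part of the claim, the behavior recited in claim 1 can be sketched as a small state model. Every name below (Tile, LiveCommunicationUI, request_expanded_captions, the scale factor) is hypothetical and does not come from the patent. The sketch also assumes that the "first caption" is the most recent one and that the layout change is a simple shrink of each participant representation; the claim itself only requires that the location and/or size of the first representation be modified.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tile:
    """Hypothetical representation of one participant in the session."""
    name: str
    x: int
    y: int
    width: int
    height: int


@dataclass
class LiveCommunicationUI:
    """Hypothetical model of the live communication user interface."""
    tiles: List[Tile]
    captions: List[str]  # one caption per portion of audio data, oldest first
    expanded: bool = False

    def visible_captions(self) -> List[str]:
        # Collapsed: only one caption is shown (assumed here to be the newest).
        # Expanded: captions for the other portions of audio are shown too.
        if not self.captions:
            return []
        return list(self.captions) if self.expanded else self.captions[-1:]

    def request_expanded_captions(self, scale: float = 0.5) -> None:
        # Models the response to the local user's input: reveal the additional
        # captions and modify the size of each participant representation.
        # The shrink-by-scale rule is an assumption made for this sketch.
        if self.expanded:
            return
        self.expanded = True
        for tile in self.tiles:
            tile.width = int(tile.width * scale)
            tile.height = int(tile.height * scale)
```

In this sketch, a user interface holding two captions shows one caption before the request and both after it, and the participant tile is resized in the same step, which mirrors the two actions the claim ties to the single input.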