| CPC H04N 21/4312 (2013.01) [G06F 3/013 (2013.01); G06F 3/017 (2013.01); G06F 3/1431 (2013.01); G06F 3/167 (2013.01); G10L 15/1822 (2013.01)] | 43 Claims |

|
1. A non-transitory computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device having a display, cause the electronic device to:
while displaying, on the display, a video event:
receive, by a digital assistant operating on the electronic device, a first natural language speech input corresponding to a first participant of the video event;
detect a user gesture input;
in accordance with receiving the first natural language speech input, identify, by the digital assistant, based on context information associated with the video event, a first location of the first participant, including:
in accordance with a determination that the first natural language speech input refers to the first participant in the present tense, identifying the first location of the first participant as a location corresponding to the user gesture input when a portion of the first natural language speech input is received; and
in accordance with identifying the first location of the first participant, augment, by the digital assistant, the display of the video event with a first graphical overlay displayed at a first display location corresponding to the first location of the first participant.
|
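The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: every name here (`SpeechInput`, `GestureSample`, `refers_in_present_tense`, `locate_participant`, `overlay_for`) is hypothetical, the keyword-based tense check stands in for whatever natural language processing a real digital assistant would use, and picking the earliest gesture sample within the utterance is only one assumed reading of "when a portion of the first natural language speech input is received".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # display coordinates (x, y)


@dataclass
class GestureSample:
    timestamp: float  # seconds since the video event started displaying
    location: Point   # display location the gesture (e.g. gaze or pointing) indicates


@dataclass
class SpeechInput:
    text: str
    start: float  # time at which the utterance began
    end: float    # time at which the utterance ended


# Assumption: a toy cue list; a real system would use proper language analysis.
PRESENT_TENSE_CUES = {"is", "are", "who's"}


def refers_in_present_tense(speech: SpeechInput) -> bool:
    """Return True if the utterance contains a present-tense cue word."""
    words = [w.strip("?.!,") for w in speech.text.lower().split()]
    return any(w in PRESENT_TENSE_CUES for w in words)


def locate_participant(speech: SpeechInput,
                       gestures: List[GestureSample]) -> Optional[Point]:
    """Identify the participant's location from the gesture input.

    Only the present-tense branch of the claim is sketched; other cases
    return None.
    """
    if not refers_in_present_tense(speech):
        return None
    # Keep gesture samples detected while some portion of the speech was received.
    during = [g for g in gestures if speech.start <= g.timestamp <= speech.end]
    if not during:
        return None
    # Assumption: use the earliest sample inside the utterance window.
    return min(during, key=lambda g: g.timestamp).location


def overlay_for(location: Point) -> dict:
    """Describe a graphical overlay placed at the identified display location."""
    return {"kind": "overlay", "display_location": location}


speech = SpeechInput(text="Who is that?", start=1.0, end=2.0)
gestures = [
    GestureSample(timestamp=0.5, location=(10.0, 10.0)),
    GestureSample(timestamp=1.5, location=(200.0, 120.0)),
    GestureSample(timestamp=3.0, location=(5.0, 5.0)),
]
location = locate_participant(speech, gestures)
overlay = overlay_for(location) if location is not None else None
```

In this sketch, only the gesture sample at 1.5 s falls inside the utterance window, so the overlay is anchored at that sample's location. An utterance without a present-tense cue yields no location and therefore no overlay.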