US 12,266,354 B2
Speech interpretation based on environmental context
Brad Kenneth Herman, Culver City, CA (US); Shiraz Akmal, Playa Vista, CA (US); Aaron Mackay Burns, Sunnyvale, CA (US); and David A. Carson, San Francisco, CA (US)
Assigned to Apple Inc., Cupertino, CA (US)
Filed by Apple Inc., Cupertino, CA (US)
Filed on Oct. 13, 2021, as Appl. No. 17/500,518.
Claims priority of provisional application 63/222,333, filed on Jul. 15, 2021.
Prior Publication US 2023/0035941 A1, Feb. 2, 2023
Int. Cl. G10L 15/25 (2013.01); G06F 3/01 (2006.01); G10L 15/18 (2013.01)

CPC G10L 15/1815 (2013.01) [G06F 3/013 (2013.01); G10L 15/25 (2013.01)]

51 Claims

1. An electronic device, comprising:

one or more processors; and

memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions, which when executed, cause the electronic device to:

detect a user gaze direction, wherein the user gaze direction is associated with a first user of the electronic device;

receive, from a first user of the electronic device, a first speech input including first content;

in accordance with a determination that the user gaze direction associated with the first user is not directed at a displayed digital assistant object:

obtain contextual information associated with the electronic device, wherein the contextual information includes a second speech input from a second user, wherein the second speech input includes second content;

adjust a confidence value based on the first content and the second content;

determine, based on the contextual information and the confidence value, whether the first speech input is directed to the digital assistant of the electronic device; and

in accordance with a determination that the first speech input is directed to the digital assistant of the electronic device:

process, by the digital assistant, the first speech input.
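
The decision flow recited in claim 1 can be illustrated with a short, non-authoritative sketch. The claim does not specify how the confidence value is adjusted, what threshold applies, or what happens when the gaze is directed at the assistant object, so everything below is an assumption made for illustration: the names SpeechInput, adjust_confidence, and should_process are hypothetical, the word-overlap heuristic (overlap with the second user's speech lowers confidence, on the theory that the first user is then likely answering the second user) is one possible choice, and the 0.6 starting confidence and 0.5 threshold are arbitrary.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class SpeechInput:
    """A speech input reduced to its speaker label and transcribed content."""
    speaker: str
    content: str


def _tokens(text: str) -> Set[str]:
    """Lowercase word set with surrounding punctuation stripped."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return {w for w in words if w}


def adjust_confidence(base: float, first_content: str, second_content: str) -> float:
    """Adjust a confidence value based on the first and second content.

    Assumption (not from the claim): the more words the two inputs share
    (Jaccard overlap), the more likely the first user is talking to the
    second user, so confidence is scaled down by the overlap.
    """
    a, b = _tokens(first_content), _tokens(second_content)
    if not a or not b:
        return base  # nothing to compare, leave the value unchanged
    overlap = len(a & b) / len(a | b)
    return base * (1.0 - overlap)


def should_process(
    first: SpeechInput,
    gaze_on_assistant: bool,
    second: Optional[SpeechInput],
    base_confidence: float = 0.6,  # hypothetical starting value
    threshold: float = 0.5,        # hypothetical decision threshold
) -> bool:
    """Decide whether the first speech input is directed to the assistant."""
    if gaze_on_assistant:
        # Assumption: claim 1 only recites the "gaze not directed" branch;
        # here a direct gaze is simply treated as addressing the assistant.
        return True
    confidence = base_confidence
    if second is not None:
        # Contextual information includes the second user's speech input.
        confidence = adjust_confidence(confidence, first.content, second.content)
    return confidence >= threshold


if __name__ == "__main__":
    other = SpeechInput("second user", "did you finish the report")
    query = SpeechInput("first user", "what is the weather tomorrow")
    reply = SpeechInput("first user", "I finished the report yesterday")
    print(should_process(query, False, other))
    print(should_process(reply, False, other))
```

In this sketch an utterance that shares little vocabulary with the second user's speech keeps most of its confidence and is processed, while an utterance that echoes the second user's words falls below the threshold and is ignored. A real system would presumably combine many more contextual signals than word overlap.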