US 12,217,736 B2
Simultaneous acoustic event detection across multiple assistant devices
Matthew Sharifi, Kilchberg (CH); and Victor Carbune, Zurich (CH)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by GOOGLE LLC, Mountain View, CA (US)
Filed on Sep. 13, 2023, as Appl. No. 18/367,859.
Application 18/367,859 is a continuation of application No. 17/085,926, filed on Oct. 30, 2020, granted, now 11,798,530.
Prior Publication US 2023/0419951 A1, Dec. 28, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 15/00 (2013.01); G01S 3/80 (2006.01); G10L 15/01 (2013.01); G10L 15/08 (2006.01); G10L 15/32 (2013.01); H04R 29/00 (2006.01)

CPC G10L 15/01 (2013.01) [G01S 3/8006 (2013.01); G10L 15/08 (2013.01); G10L 15/32 (2013.01); H04R 29/006 (2013.01); G10L 2015/088 (2013.01)]

18 Claims

1. A method implemented by one or more processors, the method comprising:

detecting, via one or more microphones of an assistant device located in an ecosystem that includes a plurality of assistant devices, audio data that captures an acoustic event, wherein the acoustic event comprises a hotword detection event;

processing, using an event detection model that is stored locally at the assistant device, the audio data that captures the acoustic event to generate a measure associated with the acoustic event, wherein the event detection model that is stored locally at the assistant device comprises a hotword detection model that is trained to detect whether a particular word or phrase is captured in the audio data;

in response to detecting the audio data via the one or more microphones of the assistant device:

anticipating detection of additional audio data via one or more additional microphones of an additional assistant device based on a plurality of historical acoustic events being detected at both the assistant device and the additional assistant device, the additional assistant device being in addition to the assistant device, and the additional assistant device being co-located in the ecosystem with the assistant device;

detecting, via the one or more additional microphones of the additional assistant device located in the ecosystem, the additional audio data that also captures the acoustic event;

processing, using an additional event detection model that is stored locally at the additional assistant device, the additional audio data that captures the acoustic event to generate an additional measure associated with the acoustic event, wherein the additional event detection model that is stored locally at the additional assistant device comprises an additional hotword detection model that is trained to detect whether the particular word or phrase is captured in the additional audio data;

determining, based on the measure satisfying a threshold indicating that the particular word or phrase is captured in the audio data and based on the additional measure satisfying the threshold indicating that the particular word or phrase is captured in the additional audio data, that the acoustic event detected by at least both the assistant device and the additional assistant device corresponds to an occurrence of an actual acoustic event; and

in response to determining that the acoustic event corresponds to an occurrence of the actual acoustic event, causing one or more components of an automated assistant to be activated at one or more of: the assistant device, the additional assistant device, or a further additional assistant device.
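
For illustration only, the decision logic recited in claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation. The names `CoDetectionHistory`, `anticipated_peers`, `is_actual_event`, and `verify_hotword`, the default threshold of 0.8, the `min_count` of 3, and the reading of "satisfying a threshold" as "greater than or equal to" are all hypothetical choices. The hotword detection models are not modeled; their outputs are represented as plain floating-point measures.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CoDetectionHistory:
    """Hypothetical store of historical events detected by pairs of devices."""
    pair_counts: Counter = field(default_factory=Counter)

    def record(self, device_ids: List[str]) -> None:
        # Count every unordered pair of devices that detected the same event.
        ids = sorted(set(device_ids))
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                self.pair_counts[frozenset((a, b))] += 1

    def anticipated_peers(self, device_id: str, min_count: int = 3) -> List[str]:
        # Devices that co-detected at least min_count past events with
        # device_id are expected to detect the current event as well.
        peers = []
        for pair, count in self.pair_counts.items():
            if device_id in pair and count >= min_count:
                (other,) = pair - {device_id}
                peers.append(other)
        return sorted(peers)


def is_actual_event(measures: Dict[str, float], threshold: float) -> bool:
    # Assumption: an event counts as actual only when at least two devices
    # report a measure and every reported measure meets the threshold.
    return len(measures) >= 2 and all(m >= threshold for m in measures.values())


def verify_hotword(device_id: str, measure: float, history: CoDetectionHistory,
                   peer_measures: Dict[str, float], threshold: float = 0.8,
                   min_count: int = 3) -> bool:
    # Collect the local measure plus measures from anticipated peers only.
    measures = {device_id: measure}
    for peer in history.anticipated_peers(device_id, min_count):
        if peer in peer_measures:
            measures[peer] = peer_measures[peer]
    return is_actual_event(measures, threshold)


# Usage: two hypothetical devices that co-detected three past events.
history = CoDetectionHistory()
for _ in range(3):
    history.record(["kitchen", "living_room"])

activate = verify_hotword("kitchen", 0.9, history, {"living_room": 0.85})
reject = verify_hotword("kitchen", 0.9, history, {"living_room": 0.5})
```

In this sketch `activate` is true because both measures meet the assumed threshold, and `reject` is false because the additional device's measure falls below it. A device with no co-detection history contributes nothing to the decision, which mirrors the claim's requirement that the additional device be anticipated on the basis of historical acoustic events.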