US 12,406,675 B2
Multi-user warm words
Matthew Sharifi, Kilchberg (CH); and Victor Carbune, Zürich (CH)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Nov. 17, 2022, as Appl. No. 18/056,697.
Prior Publication US 2024/0169995 A1, May 23, 2024
Int. Cl. G10L 17/00 (2013.01); G06F 3/16 (2006.01); G10L 15/00 (2013.01); G10L 15/08 (2006.01); G10L 15/16 (2006.01); G10L 15/22 (2006.01); G10L 17/22 (2013.01)
CPC G10L 17/22 (2013.01) [G06F 3/167 (2013.01); G10L 15/22 (2013.01); G10L 15/00 (2013.01); G10L 2015/088 (2013.01); G10L 15/16 (2013.01); G10L 2015/223 (2013.01); G10L 17/00 (2013.01)] 32 Claims
 
1. A computer-implemented method when executed by data processing hardware causes the data processing hardware to perform operations comprising:
detecting a presence of multiple users within an environment of an assistant-enabled device (AED), the AED executing a digital assistant;
for each user of the multiple users, obtaining a respective active set of warm words that each specify a respective action for the digital assistant to perform;
based on the respective active set of warm words for each user of the multiple users, executing a warm word arbitration routine to enable a final set of warm words for detection by the AED by:
for each respective warm word in the respective active set of warm words for each user of the multiple users, determining memory and processing resources required to execute a respective warm word model on the AED that is associated with the respective warm word;
identifying any shared warm words corresponding to warm words present in the respective active set of warm words for each of at least two of the multiple users; and
determining a number of warm words to enable in the final set of warm words for detection by the AED based on:
assigning, for inclusion in the final set of warm words, a higher priority to warm words identified as shared warm words; and
the memory and processing resources determined for each of the warm words in the respective active set of warm words for each user of the multiple users, wherein the number of warm words in the final set of warm words enabled for detection by the AED comprise warm words selected from the respective active set of warm words for at least one user of the multiple users detected within the environment of the AED; and
while the final set of warm words are enabled for detection by the AED:
receiving audio data corresponding to an utterance captured by the AED;
detecting, in the audio data, a warm word from the final set of warm words using the respective warm word model executing on the AED that is associated with the warm word without performing speech recognition on the audio data; and
instructing the digital assistant to perform the respective action specified by the detected warm word.
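
The warm word arbitration routine recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify a selection algorithm, so everything below is an assumption made for illustration: the greedy selection strategy, the names `ModelCost` and `arbitrate`, the cost fields, the budget parameters, and the tie-breaking order are all hypothetical. The sketch only shows one way that shared warm words could receive higher priority while the number of enabled warm words is bounded by per-model memory and processing costs.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCost:
    """Hypothetical resource cost of running one warm word model on the device."""
    memory_kb: int   # assumed memory footprint
    cpu_pct: float   # assumed share of on-device processing


def arbitrate(active_sets, costs, memory_budget_kb, cpu_budget_pct):
    """Pick a final list of warm words that fits the device budgets.

    active_sets: dict mapping user id -> iterable of that user's active warm words
    costs:       dict mapping warm word -> ModelCost
    """
    # Count how many detected users have each warm word in their active set.
    user_counts = Counter(w for words in active_sets.values() for w in set(words))

    # Shared warm words (present for at least two users) are ranked first.
    # Assumed tie-breakers: more users, then smaller memory cost, then name.
    ranked = sorted(
        user_counts,
        key=lambda w: (user_counts[w] < 2, -user_counts[w], costs[w].memory_kb, w),
    )

    # Greedily enable warm words while both budgets still hold.
    final, used_mem, used_cpu = [], 0, 0.0
    for word in ranked:
        cost = costs[word]
        if (used_mem + cost.memory_kb <= memory_budget_kb
                and used_cpu + cost.cpu_pct <= cpu_budget_pct):
            final.append(word)
            used_mem += cost.memory_kb
            used_cpu += cost.cpu_pct
    return final


# Illustrative data only; the users, warm words, and costs are made up.
active_sets = {
    "user_a": ["stop", "next", "volume up"],
    "user_b": ["stop", "pause"],
}
costs = {
    "stop": ModelCost(100, 5.0),
    "next": ModelCost(120, 5.0),
    "volume up": ModelCost(200, 10.0),
    "pause": ModelCost(100, 5.0),
}
final = arbitrate(active_sets, costs, memory_budget_kb=350, cpu_budget_pct=20.0)
print(final)  # ['stop', 'pause', 'next']
```

In this example "stop" is shared by both users, so it is ranked first; "volume up" is dropped because its assumed memory cost would exceed the remaining budget. A real device could weigh the budgets differently or use a non-greedy selection.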