CPC G10L 15/22 (2013.01) [G06F 3/165 (2013.01); G06F 3/167 (2013.01); G10L 15/1815 (2013.01); G10L 15/30 (2013.01); G10L 2015/088 (2013.01); G10L 2015/223 (2013.01)]
20 Claims
1. A playback device comprising:
a network interface;
at least one microphone configured to detect sound;
at least one speaker;
at least one processor; and
a housing carrying the network interface, the at least one microphone, the at least one speaker, the at least one processor, and data storage including instructions that are executable by the at least one processor such that the playback device is configured to:
capture, via the at least one microphone, at least one input data stream;
detect a wake word in a first portion of the at least one input data stream;
based on detection of the wake word, trigger a wake-word event based on a first voice input captured via the at least one microphone, wherein the first voice input comprises the wake word and an utterance, and wherein the wake word does not correspond to a command;
stream, via the network interface, sound data representing at least a portion of the first voice input to one or more remote servers of a voice assistant service for remote processing via a voice assistant of the one or more remote servers;
after the first voice input is processed, detect a first command keyword in a second portion of the at least one input data stream, wherein the first command keyword is preceded in the at least one input data stream by a period of inactivity that excludes the wake word;
based on detection of the first command keyword, trigger a first command keyword event to locally process a second voice input represented in the second portion of the at least one input data stream, wherein the second voice input comprises a first command keyword and at least one keyword from a set of keywords supported by a local voice assistant, wherein the first command keyword is one of a plurality of command keywords supported by the local voice assistant of the playback device, and wherein the second voice input excludes the wake word;
determine, via the local voice assistant, (i) a particular command corresponding to the first command keyword and (ii) one or more parameters corresponding to the at least one keyword, the one or more parameters modifying the particular command; and
cause at least one local network device to carry out the particular command according to the one or more parameters.
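The two-path handling recited in claim 1 (a wake word routes the voice input to a remote voice assistant service, while a command keyword without a wake word is handled locally) can be illustrated with a minimal sketch. This is not the patented implementation: it operates on already-transcribed tokens rather than audio, and every name in it (`WAKE_WORDS`, `COMMAND_KEYWORDS`, `PARAMETER_KEYWORDS`, `Event`, `route`) is a hypothetical chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical vocabularies; the claim does not enumerate any of these.
WAKE_WORDS = {"hey_assistant"}
COMMAND_KEYWORDS = {"play": "PLAY", "pause": "PAUSE", "skip": "SKIP"}
PARAMETER_KEYWORDS = {"kitchen": ("zone", "kitchen"), "jazz": ("genre", "jazz")}


@dataclass
class Event:
    kind: str                                   # "wake_word" or "command_keyword"
    command: Optional[str] = None               # set only on the local path
    parameters: Dict[str, str] = field(default_factory=dict)
    streamed: List[str] = field(default_factory=list)  # data sent to the remote service


def route(tokens: List[str]) -> Optional[Event]:
    """Decide how one voice input (a list of tokens) would be handled."""
    if not tokens:
        return None
    first = tokens[0].lower()

    # Wake-word path: the wake word itself is not a command, so the
    # utterance that follows is handed off for remote processing.
    if first in WAKE_WORDS:
        return Event(kind="wake_word", streamed=tokens[1:])

    # Command-keyword path: no wake word present, so the input is
    # processed locally. The keyword selects the command and the
    # remaining supported keywords become parameters that modify it.
    if first in COMMAND_KEYWORDS:
        parameters = {}
        for token in tokens[1:]:
            entry = PARAMETER_KEYWORDS.get(token.lower())
            if entry is not None:
                name, value = entry
                parameters[name] = value
        return Event(kind="command_keyword",
                     command=COMMAND_KEYWORDS[first],
                     parameters=parameters)

    # Neither a wake word nor a supported command keyword: no event.
    return None
```

Under these assumptions, `route(["hey_assistant", "what's", "the", "weather"])` yields a wake-word event whose utterance would be streamed remotely, while `route(["play", "jazz", "kitchen"])` yields a local `PLAY` command with `genre` and `zone` parameters. The sketch omits the inactivity period and the ordering constraint (the command keyword being detected after the first voice input is processed), both of which the claim requires.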