| CPC G10L 15/22 (2013.01) [G06F 3/165 (2013.01); G06F 3/167 (2013.01); H04N 21/42203 (2013.01); G10L 2015/223 (2013.01)] | 20 Claims |

|
1. A first network microphone device comprising:
a network interface;
at least one first microphone;
at least one processor; and
at least one non-transitory computer-readable medium comprising instructions that are executable by the at least one processor such that the first network microphone device is configured to:
receive media content comprising audio;
provide a sound data stream representing the audio to a first wake-word engine, wherein the first wake-word engine is operable to generate a first wake-word response when the first wake-word engine detects a particular wake word in a first microphone sound data stream representing first sound detected by the at least one first microphone;
stream, via the network interface, one or more first audio signals representing a first portion of the audio to one or more playback devices for playback;
detect, via the first wake-word engine, that a second portion of the audio includes sound data matching the particular wake word;
before the second portion of the audio is played back by the one or more playback devices, cause, via the network interface, a second network microphone device to temporarily disable a wake-word response of a second wake-word engine, wherein the second wake-word engine is operable to (a) generate a second wake-word response when the second wake-word engine detects the particular wake word in a second microphone sound data stream representing second sound detected by at least one second microphone and (b) send sound data representing the second sound detected by the at least one second microphone to a voice assistant when the second wake-word response is generated; and
stream, via the network interface, one or more second audio signals representing the second portion of the audio to the one or more playback devices for playback.
|
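The sequence recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, only a toy model under stated assumptions: audio portions are modeled as text strings, the wake-word engines are modeled as substring matches, and the names `WAKE_WORD`, `SecondNMD`, `PlaybackDevice`, and `stream_media` are hypothetical. Re-enabling the second device right after the flagged portion has played is one assumed way to make the disablement "temporary"; the claim does not specify how.

```python
from dataclasses import dataclass, field
from typing import List

WAKE_WORD = "hey_assistant"  # hypothetical wake word, for illustration only


@dataclass
class SecondNMD:
    """Toy stand-in for the second network microphone device."""
    wake_response_enabled: bool = True
    log: List[str] = field(default_factory=list)

    def disable_wake_response(self) -> None:
        self.wake_response_enabled = False
        self.log.append("disabled")

    def enable_wake_response(self) -> None:
        self.wake_response_enabled = True
        self.log.append("enabled")

    def hear(self, sound: str) -> bool:
        """Return True if this sound would be sent on to a voice assistant."""
        return self.wake_response_enabled and WAKE_WORD in sound


@dataclass
class PlaybackDevice:
    """Toy stand-in for a playback device; records what it played."""
    played: List[str] = field(default_factory=list)

    def play(self, portion: str) -> None:
        self.played.append(portion)


def stream_media(portions: List[str], playback: PlaybackDevice,
                 second_nmd: SecondNMD) -> List[str]:
    """Stream each portion, suppressing the second NMD around wake words.

    Returns the portions that triggered the second NMD (ideally none).
    """
    triggered = []
    for portion in portions:
        # First wake-word engine inspects the media stream before playback.
        has_wake_word = WAKE_WORD in portion
        if has_wake_word:
            # Sent before the portion is played back.
            second_nmd.disable_wake_response()
        playback.play(portion)
        # The second NMD's microphone picks up the played audio.
        if second_nmd.hear(portion):
            triggered.append(portion)
        if has_wake_word:
            # Assumption: the disablement ends once the portion has played.
            second_nmd.enable_wake_response()
    return triggered
```

In this sketch, a media portion containing the wake word is still played in full, but the second device does not forward it to a voice assistant, while a later spoken wake word is handled normally.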