CPC G10L 15/22 (2013.01) [G06F 3/165 (2013.01); G10L 15/08 (2013.01); G10L 2015/088 (2013.01); G10L 2015/223 (2013.01)] | 20 Claims |
1. A system comprising:
one or more microphones;
one or more processors;
at least one non-transitory computer-readable medium; and
program instructions stored on the at least one non-transitory computer-readable medium that, when executed by the one or more processors, cause the system to perform operations comprising:
capturing audio input via the one or more microphones;
monitoring, via a first wake-word detector, the captured audio input for a first wake word associated with a first voice assistant service (VAS);
monitoring, via a second wake-word detector, the captured audio input for a second wake word associated with a second VAS different from the first VAS;
detecting, via the first wake-word detector, the first wake word in the captured audio input;
after detecting the first wake word, obtaining, based on the captured audio input, first content to be played back via a playback device;
playing back, via the playback device, the first content;
while playing back the first content, detecting, via the second wake-word detector, the second wake word in the captured audio input;
after detecting the second wake word:
suspending operation of the first wake-word detector;
obtaining, based on the captured audio input, second content to be played back via the playback device; and
suppressing playback of the first content;
while the playback of the first content is suppressed, playing back, via the playback device, the second content; and
after playing back the second content:
resuming playback of the first content; and
resuming monitoring the captured audio input for the first wake word.
|
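The sequence of operations recited in claim 1 can be read as a small state machine. The following is a minimal illustrative sketch only, not the claimed implementation: the names `WakeWordDetector`, `Controller`, and `on_audio`, the placeholder wake words "alpha" and "beta", and the use of text strings in place of captured audio are all assumptions introduced for illustration. Playback of the second content is modeled as completing synchronously.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WakeWordDetector:
    """Hypothetical stand-in for a wake-word detector; matches text, not audio."""
    wake_word: str
    active: bool = True

    def detect(self, audio: str) -> bool:
        # A suspended detector never reports a detection.
        return self.active and self.wake_word in audio.lower()


@dataclass
class Controller:
    """Hypothetical controller that orders the steps as recited in claim 1."""
    first: WakeWordDetector
    second: WakeWordDetector
    first_content: Optional[str] = None
    log: List[str] = field(default_factory=list)

    def on_audio(self, audio: str) -> None:
        if self.first_content is not None and self.second.detect(audio):
            # Second wake word detected while the first content is playing.
            self.first.active = False
            self.log.append("suspend first detector")
            second_content = f"content for: {audio}"  # assumed to come from the second VAS
            self.log.append("suppress first content")
            self.log.append(f"play second: {second_content}")
            # After the second content has played back (modeled as immediate):
            self.log.append("resume first content")
            self.first.active = True
            self.log.append("resume first detector")
        elif self.first.detect(audio):
            # First wake word detected: obtain and play the first content.
            self.first_content = f"content for: {audio}"  # assumed to come from the first VAS
            self.log.append(f"play first: {self.first_content}")
```

As a usage example, feeding the controller "alpha play jazz" followed by "beta what time is it" produces a log that follows the claimed order: the first content plays, the first detector is suspended, the first content is suppressed while the second content plays, and then both the first content and the first detector resume.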