US 12,279,096 B2
Systems and methods for associating playback devices with voice assistant services
Sein Woo, Somerville, MA (US); and John G. Tolomei, Renton, WA (US)
Assigned to Sonos, Inc., Santa Barbara, CA (US)
Filed by Sonos, Inc., Santa Barbara, CA (US)
Filed on May 8, 2023, as Appl. No. 18/313,859.
Application 18/313,859 is a continuation of application No. 17/446,690, filed on Sep. 1, 2021, granted, now 11,696,074.
Application 17/446,690 is a continuation of application No. 16/876,493, filed on May 18, 2020, granted, now 11,197,096, issued on Dec. 7, 2021.
Application 16/876,493 is a continuation of application No. 16/022,662, filed on Jun. 28, 2018, granted, now 10,681,460, issued on Jun. 9, 2020.
Prior Publication US 2023/0353942 A1, Nov. 2, 2023
Int. Cl. H04R 3/12 (2006.01); G06F 3/16 (2006.01); G10L 15/22 (2006.01); G10L 15/28 (2013.01); H04R 27/00 (2006.01)
CPC H04R 3/12 (2013.01) [G06F 3/165 (2013.01); G10L 15/22 (2013.01); G10L 15/28 (2013.01); H04R 27/00 (2013.01); G10L 2015/223 (2013.01); H04R 2227/003 (2013.01); H04R 2227/005 (2013.01)]
20 Claims
 
1. A media playback system comprising:
one or more processors;
a first network microphone device (NMD);
a second NMD; and
one or more tangible, non-transitory, computer-readable media storing instructions executable by the one or more processors to cause the media playback system to perform operations comprising:
associating the first NMD with a first voice assistant service (VAS) such that the first NMD includes a first wake-word engine configured to detect a first wake word associated with the first VAS;
associating the second NMD with a second VAS such that the second NMD includes a second wake-word engine configured to detect a second wake word associated with the second VAS;
detecting a first voice input via the first NMD;
based on the first voice input, playing back first media content via the first NMD;
while playing back the first media content via the first NMD, detecting a second voice input via the second NMD;
based on the second voice input, playing back second media content via both the first NMD and the second NMD in synchrony with one another, and
transmitting a control state variable associated with at least the first NMD and the first VAS.
 
8. A method to be performed by a media playback system comprising at least a first network microphone device (NMD) and a second NMD, the method comprising:
associating the first NMD with a first voice assistant service (VAS) such that the first NMD includes a first wake-word engine configured to detect a first wake word associated with the first VAS;
associating the second NMD with a second VAS such that the second NMD includes a second wake-word engine configured to detect a second wake word associated with the second VAS;
detecting a first voice input via the first NMD;
based on the first voice input, playing back first media content via the first NMD;
while playing back the first media content via the first NMD, detecting a second voice input via the second NMD;
based on the second voice input, playing back second media content via both the first NMD and the second NMD in synchrony with one another, and
transmitting a control state variable associated with at least the first NMD and the second NMD to one or more remote computing devices associated with the first VAS.
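
The sequence of steps recited in claim 8 can be illustrated with a minimal sketch. It is not the patented implementation: the class names `NMD` and `MediaPlaybackSystem`, the service labels "VAS-A" and "VAS-B", the wake words "alpha" and "bravo", the prefix-matching wake-word check, and the contents of the control state dictionary are all hypothetical and chosen only for illustration. Synchronous playback is reduced to assigning the same content to both devices, and "transmitting" is reduced to appending to a list.

```python
from dataclasses import dataclass, field


@dataclass
class NMD:
    """Hypothetical stand-in for a network microphone device."""
    name: str
    vas: str = ""          # associated voice assistant service (assumed label)
    wake_word: str = ""    # wake word the device's engine listens for
    now_playing: str = ""  # media content currently playing, if any

    def associate(self, vas: str, wake_word: str) -> None:
        # Associating with a VAS gives the device a wake-word engine for it.
        self.vas = vas
        self.wake_word = wake_word

    def detects(self, utterance: str) -> bool:
        # Toy wake-word engine: the utterance must start with the wake word.
        return bool(self.wake_word) and utterance.lower().startswith(
            self.wake_word.lower()
        )


@dataclass
class MediaPlaybackSystem:
    first: NMD
    second: NMD
    sent: list = field(default_factory=list)  # record of "transmitted" state

    def handle_first_input(self, utterance: str, content: str) -> bool:
        # First voice input: play the first content on the first NMD only.
        if not self.first.detects(utterance):
            return False
        self.first.now_playing = content
        return True

    def handle_second_input(self, utterance: str, content: str) -> bool:
        # Second voice input, detected while the first NMD is still playing:
        # play the second content on both NMDs together.
        if not self.first.now_playing or not self.second.detects(utterance):
            return False
        for nmd in (self.first, self.second):
            nmd.now_playing = content
        self.transmit_control_state()
        return True

    def transmit_control_state(self) -> dict:
        # Assumed shape of the control state variable: it covers both NMDs
        # and is addressed to the service associated with the first NMD.
        state = {
            "group": [self.first.name, self.second.name],
            "content": self.first.now_playing,
        }
        self.sent.append((self.first.vas, state))
        return state


system = MediaPlaybackSystem(NMD("kitchen"), NMD("den"))
system.first.associate("VAS-A", "alpha")
system.second.associate("VAS-B", "bravo")
system.handle_first_input("alpha play jazz", "jazz")
system.handle_second_input("bravo play news", "news")
```

In this sketch the second request is spoken to a device tied to a different service, yet the resulting state is reported to the first service, which mirrors the final step of the claim.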
 
15. One or more tangible, non-transitory computer-readable media having instructions stored thereon that are executable by one or more processors to cause a media playback system to perform functions, the media playback system comprising at least a first network microphone device (NMD) and a second NMD, the functions comprising:
associating the first NMD with a first voice assistant service (VAS) such that the first NMD includes a first wake-word engine configured to detect a first wake word associated with the first VAS;
associating the second NMD with a second VAS such that the second NMD includes a second wake-word engine configured to detect a second wake word associated with the second VAS;
detecting a first voice input via the first NMD;
based on the first voice input, playing back a first media content via the first NMD;
while playing back the first media content via the first NMD, detecting a second voice input via the second NMD;
based on the second voice input, playing back a second media content via both the first NMD and the second NMD in synchrony with one another, and
transmitting a control state variable associated with at least the first NMD and the second NMD to one or more remote computing devices associated with the first VAS.