US 11,790,937 B2
Voice detection optimization using sound metadata
Connor Kristopher Smith, New Hudson, MI (US); Kurt Thomas Soto, Ventura, CA (US); and Charles Conor Sleith, Waltham, MA (US)
Assigned to Sonos, Inc., Santa Barbara, CA (US)
Filed by Sonos, Inc., Santa Barbara, CA (US)
Filed on May 18, 2021, as Appl. No. 17/303,001.
Application 17/303,001 is a continuation of application No. 16/138,111, filed on Sep. 21, 2018, granted, now Pat. No. 11,024,331.
Prior Publication US 2021/0272586 A1, Sep. 2, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 25/84 (2013.01); G10L 21/0208 (2013.01); G10L 25/03 (2013.01); H04R 3/00 (2006.01)
CPC G10L 25/84 (2013.01) [G10L 21/0208 (2013.01); G10L 25/03 (2013.01); H04R 3/00 (2013.01); G10L 2021/02082 (2013.01)]
20 Claims
 
1. A network microphone device (NMD) comprising:
one or more processors;
one or more microphones; and
data storage having instructions thereon that, when executed by the one or more processors, cause the NMD to perform operations comprising:
detecting sound via the one or more microphones;
capturing sound data based on the detected sound, wherein the sound data includes a voice input;
analyzing the sound data to detect a wake word;
capturing metadata associated with the sound data, wherein the voice input is not derivable from the metadata;
transmitting the sound data to one or more remote computing devices to determine an intent based on the voice input;
transmitting the metadata to at least one of the one or more remote computing devices to determine at least one characteristic of the detected sound based on the metadata;
after transmitting the metadata, receiving a response from the at least one of the one or more remote computing devices, wherein the response includes an instruction, based on the determined characteristic, to modify at least one performance parameter of the NMD;
modifying the at least one performance parameter based on the instruction, wherein the modifying comprises at least one of:
adjusting a fixed gain of the NMD;
adjusting a wake-word-detection sensitivity parameter of the NMD;
adjusting a noise-reduction parameter of the NMD;
adjusting an acoustic echo cancellation parameter of the NMD;
adjusting a spatial processing algorithm of the NMD;
adjusting a localization algorithm of the NMD; or
disregarding input from a defective microphone; and
performing a command based on the determined intent.
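
The metadata round trip recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The parameter names, the instruction format, the choice of per-microphone RMS level as metadata, and the thresholds in remote_analyze are all assumptions made for illustration. The remote computing device is replaced by a local stand-in function, and only two of the listed modifications (adjusting a fixed gain, disregarding a defective microphone) are shown.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class NmdParameters:
    """Hypothetical performance parameters; names are illustrative only."""
    fixed_gain_db: float = 0.0
    ignored_microphones: Set[int] = field(default_factory=set)


def capture_metadata(frames: Dict[int, List[float]]) -> Dict[int, float]:
    """Reduce each microphone's samples to a single RMS level.

    One aggregate number per microphone is assumed here as an example of
    metadata from which the spoken content cannot be reconstructed.
    Assumes every microphone has at least one sample.
    """
    return {
        mic: math.sqrt(sum(s * s for s in samples) / len(samples))
        for mic, samples in frames.items()
    }


def remote_analyze(metadata: Dict[int, float]) -> List[dict]:
    """Local stand-in for the remote computing device (assumed heuristics)."""
    instructions: List[dict] = []
    loudest = max(metadata.values())
    for mic, level in sorted(metadata.items()):
        # Assumed rule: a microphone far quieter than the loudest one
        # is treated as defective.
        if loudest > 0 and level < 0.01 * loudest:
            instructions.append({"op": "ignore_microphone", "mic": mic})
    # Assumed rule: a quiet capture overall warrants more fixed gain.
    if loudest < 0.1:
        instructions.append({"op": "adjust_fixed_gain", "delta_db": 6.0})
    return instructions


def apply_instructions(params: NmdParameters,
                       instructions: List[dict]) -> NmdParameters:
    """Modify performance parameters according to the received instructions."""
    for ins in instructions:
        if ins["op"] == "ignore_microphone":
            params.ignored_microphones.add(ins["mic"])
        elif ins["op"] == "adjust_fixed_gain":
            params.fixed_gain_db += ins["delta_db"]
    return params


# Example: microphone 1 reports silence while its neighbors hear sound.
frames = {
    0: [0.5, -0.5, 0.5, -0.5],
    1: [0.0, 0.0, 0.0, 0.0],
    2: [0.4, -0.4, 0.4, -0.4],
}
metadata = capture_metadata(frames)
params = apply_instructions(NmdParameters(), remote_analyze(metadata))
print(sorted(params.ignored_microphones), params.fixed_gain_db)
```

In this sketch the sound data itself never enters the metadata path. Only the per-microphone levels are passed to the analysis step, which mirrors the claim's separation between transmitting sound data to determine an intent and transmitting metadata to determine a characteristic of the detected sound.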