| CPC G10L 15/1815 (2013.01) [G10L 15/22 (2013.01); G10L 21/0232 (2013.01); G10L 25/21 (2013.01); G10L 2015/088 (2013.01); G10L 2015/223 (2013.01); G10L 2021/02082 (2013.01)] | 20 Claims |

|
15. A tangible, non-transitory computer-readable medium storing instructions that, when executed by one or more processors of a playback device, cause the playback device to perform functions comprising:
capturing, via at least one microphone of the playback device, a voice input;
detecting at least one keyword within the voice input, wherein the at least one keyword is at least one of a plurality of command keywords supported by the playback device;
determining, via a local natural language unit (NLU) of the playback device, an intent based on the at least one keyword, wherein the NLU includes a pre-determined library of keywords comprising the at least one keyword;
evaluating the determined intent based at least in part on: (i) determining that a reduced-volume period within the voice input exceeds a predetermined time period; and (ii) determining that a rate of change of acoustic energy within the voice input exceeds a predetermined threshold level; and
based on the evaluation, performing a command in accordance with the determined intent.
|
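The evaluation step of claim 15 can be illustrated with a minimal sketch. It assumes the voice input has already been reduced to a sequence of per-frame acoustic energy values. The claim does not specify how determinations (i) and (ii) are combined, so the sketch assumes both must hold. The function names, the frame length, and all threshold values are hypothetical and are not taken from the claim.

```python
from typing import Sequence


def longest_quiet_period(energies: Sequence[float], frame_s: float,
                         quiet_level: float) -> float:
    """Return the duration (seconds) of the longest run of frames whose
    energy is below quiet_level (a hypothetical 'reduced-volume' cutoff)."""
    longest = run = 0
    for e in energies:
        run = run + 1 if e < quiet_level else 0
        longest = max(longest, run)
    return longest * frame_s


def max_energy_rate(energies: Sequence[float], frame_s: float) -> float:
    """Return the largest absolute frame-to-frame change in energy,
    expressed per second."""
    pairs = zip(energies, energies[1:])
    return max((abs(b - a) / frame_s for a, b in pairs), default=0.0)


def evaluate_intent(energies: Sequence[float], frame_s: float = 0.02,
                    quiet_level: float = 0.1, min_quiet_s: float = 0.1,
                    min_rate: float = 5.0) -> bool:
    """Assumed combination: accept the determined intent only when
    (i) the reduced-volume period exceeds min_quiet_s and
    (ii) the rate of change of energy exceeds min_rate."""
    quiet_ok = longest_quiet_period(energies, frame_s, quiet_level) > min_quiet_s
    rate_ok = max_energy_rate(energies, frame_s) > min_rate
    return quiet_ok and rate_ok


# A quiet stretch followed by an abrupt onset of speech.
pause_then_speech = [0.05] * 10 + [0.9] * 5
# Steady background chatter with no pause and no sharp onset.
steady_chatter = [0.5] * 15
```

Under these assumed thresholds, `evaluate_intent(pause_then_speech)` returns `True` and `evaluate_intent(steady_chatter)` returns `False`. A playback device following this sketch would perform the command only in the first case.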