CPC G10L 15/22 (2013.01) [G06F 3/167 (2013.01); G06F 40/30 (2020.01); G10L 15/08 (2013.01); G10L 15/26 (2013.01); G10L 25/78 (2013.01); G10L 25/87 (2013.01); G10L 2015/088 (2013.01); G10L 2015/223 (2013.01); G10L 2025/783 (2013.01)] | 91 Claims |
1. An electronic device, comprising:
one or more processors;
a microphone; and
memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for:
receiving, via the microphone, a first audio stream including one or more utterances;
determining whether the first audio stream includes a lexical trigger;
in accordance with a determination that the first audio stream includes the lexical trigger, generating one or more candidate text representations of the one or more utterances;
determining whether at least one candidate text representation of the one or more candidate text representations is to be disregarded by the virtual assistant based on sensory data obtained from one or more sensors of the electronic device and a usage pattern of the virtual assistant associated with a time;
in accordance with a determination that at least one candidate text representation is to be disregarded by the virtual assistant, generating one or more candidate intents based on candidate text representations of the one or more candidate text representations other than the to be disregarded at least one candidate text representation;
determining whether the one or more candidate intents include at least one actionable intent;
in accordance with a determination that the one or more candidate intents include at least one actionable intent, executing the at least one actionable intent; and
outputting a result of the execution of the at least one actionable intent.
|
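The sequence of steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the audio stream is modeled as already-transcribed utterances, and every name in it (`TRIGGER`, `USUAL_HOURS`, `Utterance`, `gaze_on_device`, `INTENT_TABLE`, `process`) is a hypothetical stand-in. The disregard rule in particular is an assumed example of combining sensory data with a time-based usage pattern; the claim does not specify one.

```python
from dataclasses import dataclass
from typing import Callable, Optional

TRIGGER = "hey assistant"    # hypothetical lexical trigger
USUAL_HOURS = range(7, 23)   # hypothetical usage pattern: hours of typical use (7..22)


@dataclass
class Utterance:
    text: str             # stand-in for one candidate text representation
    gaze_on_device: bool  # hypothetical signal derived from sensor data
    hour: int             # hour of day at which the utterance was captured


@dataclass
class Intent:
    name: str
    action: Optional[Callable[[], str]] = None  # None marks a non-actionable intent


# Hypothetical phrase-to-intent table standing in for natural-language understanding.
INTENT_TABLE = {
    "set a timer": Intent("set_timer", lambda: "Timer set."),
    "play music": Intent("play_music", lambda: "Playing music."),
}


def has_trigger(stream: list[Utterance]) -> bool:
    """Determine whether the stream includes the lexical trigger."""
    return any(TRIGGER in u.text.lower() for u in stream)


def is_disregarded(u: Utterance) -> bool:
    """Assumed rule: disregard speech captured while the user looks away
    from the device at an hour outside the usual usage pattern."""
    return (not u.gaze_on_device) and (u.hour not in USUAL_HOURS)


def to_intent(u: Utterance) -> Intent:
    """Map a candidate text representation to a candidate intent."""
    lowered = u.text.lower()
    for phrase, intent in INTENT_TABLE.items():
        if phrase in lowered:
            return intent
    return Intent("unknown")


def process(stream: list[Utterance]) -> list[str]:
    """Run the claimed sequence and return the results to output."""
    if not has_trigger(stream):
        return []
    # Candidate intents come only from candidates that are not disregarded.
    kept = [u for u in stream if not is_disregarded(u)]
    intents = [to_intent(u) for u in kept]
    # Execute only the actionable intents and collect their results.
    return [i.action() for i in intents if i.action is not None]
```

For example, given three utterances captured at hour 23, where only the second is spoken while the user looks away, `process` drops the second utterance before intent generation, finds no actionable intent in the third, and returns the result of the first.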