CPC G10L 15/22 (2013.01) [G06F 18/214 (2023.01); G06V 10/82 (2022.01); G06V 20/50 (2022.01); G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/18 (2013.01); G10L 15/24 (2013.01)] | 33 Claims |
12. An electronic device comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
receiving an image;
generating, based on the image, a question corresponding to a first object in the image;
retrieving a plurality of speech recognition results based on a received utterance;
determining whether an attribute of an object referenced by a speech recognition result of the plurality of speech recognition results matches an attribute of an object referenced by the generated question; and
in accordance with a determination that the attribute of the object referenced by the speech recognition result of the plurality of speech recognition results matches the attribute of the object referenced by the generated question, determining that the received utterance is directed to a digital assistant.
|
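For illustration only, the determining steps recited in claim 12 might be sketched as follows. This is a minimal sketch and not the claimed implementation: the names `ReferencedObject`, `GeneratedQuestion`, `SpeechRecognitionResult`, `attributes_match`, and `is_directed_to_assistant` are hypothetical and do not appear in the claim. The sketch also assumes that an attribute "matches" when the same attribute name carries the same value (compared case-insensitively) on both objects; the claim does not specify a matching rule. Image analysis, question generation, and speech recognition are assumed to have already produced the inputs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ReferencedObject:
    """Hypothetical record of an object and its attributes (e.g. color)."""
    label: str
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class GeneratedQuestion:
    """Hypothetical question generated from the image about a first object."""
    text: str
    referenced_object: ReferencedObject


@dataclass
class SpeechRecognitionResult:
    """Hypothetical recognition candidate; may reference no object at all."""
    text: str
    referenced_object: Optional[ReferencedObject] = None


def attributes_match(a: ReferencedObject, b: ReferencedObject) -> bool:
    # Assumed rule: at least one attribute name is present on both objects
    # with the same value, ignoring case. The claim leaves this unspecified.
    for name, value in a.attributes.items():
        other = b.attributes.get(name)
        if other is not None and other.lower() == value.lower():
            return True
    return False


def is_directed_to_assistant(
    question: GeneratedQuestion,
    results: List[SpeechRecognitionResult],
) -> bool:
    # Returns True as soon as any one of the plurality of results references
    # an object whose attribute matches the question's object.
    # Returning False only means this sketch made no positive determination;
    # the claim does not say what happens when nothing matches.
    for result in results:
        obj = result.referenced_object
        if obj is not None and attributes_match(obj, question.referenced_object):
            return True
    return False
```

Under these assumptions, a question about a red mug combined with a recognition candidate that references a red object would yield a positive determination, while candidates that reference no object, or only objects with differing attribute values, would not.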