| CPC G10L 15/22 (2013.01) [G10L 2015/223 (2013.01)] | 20 Claims |

|
13. A system, comprising:
at least one processor; and
at least one memory comprising instructions that, when executed by the at least one processor, cause the system to:
receive, from a first device, first input data representing a first user input;
process the first input data to determine a first intent of the first user input;
determine a first skill component configured to generate a first response to the first user input based on the first intent;
determine first skill configuration data associated with the first skill component, the first skill configuration data indicating at least one display capability for outputting first visual content of the first skill component;
determine, based on the first skill configuration data associated with the first skill component, a second device usable to present the first visual content of the first skill component;
send the first intent to the first skill component;
receive, from the first skill component and in response to sending the first intent:
a first portion of the first response to be output using at least one speaker; and
a second portion of the first response to be output using a display;
cause the first device to present the first portion of the first response; and
cause the second device to present the second portion of the first response.
|
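For illustration only, the sequence recited in claim 13 can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: every name in it (`Device`, `SkillConfig`, `SkillResponse`, `WeatherSkill`, `determine_intent`, `select_display_device`, `handle_input`) is hypothetical. The keyword-based intent matching and the "first device whose display capabilities cover the skill's requirements" rule are stand-ins chosen for brevity, since the claim does not specify how either determination is made.

```python
from dataclasses import dataclass


@dataclass
class SkillConfig:
    # Hypothetical: display capabilities needed to output the skill's visual content.
    display_capabilities: frozenset


@dataclass
class Device:
    device_id: str
    has_speaker: bool = False
    display_capabilities: frozenset = frozenset()


@dataclass
class SkillResponse:
    speech: str  # first portion, to be output using a speaker
    visual: str  # second portion, to be output using a display


class WeatherSkill:
    """Hypothetical skill component that answers a weather intent."""

    config = SkillConfig(display_capabilities=frozenset({"image"}))

    def handle(self, intent):
        return SkillResponse(speech="It is sunny.", visual="<forecast card>")


def determine_intent(input_data):
    # Assumption: a toy keyword match stands in for real language understanding.
    return "GetWeather" if "weather" in input_data.lower() else "Unknown"


def select_display_device(config, devices):
    # Assumption: pick the first device whose display capabilities
    # include everything the skill configuration data asks for.
    for device in devices:
        if device.display_capabilities and config.display_capabilities <= device.display_capabilities:
            return device
    raise LookupError("no device can present the visual content")


def handle_input(first_device, input_data, devices, skills):
    intent = determine_intent(input_data)            # determine the first intent
    skill = skills[intent]                           # determine the skill component
    config = skill.config                            # skill configuration data
    second_device = select_display_device(config, devices)
    response = skill.handle(intent)                  # send intent, receive both portions
    # Route each portion: speech to the originating device, visuals to the second device.
    return [
        (first_device.device_id, response.speech),
        (second_device.device_id, response.visual),
    ]


speaker = Device("kitchen-speaker", has_speaker=True)
tv = Device("living-room-tv", display_capabilities=frozenset({"image", "video"}))
routed = handle_input(speaker, "What is the weather?", [speaker, tv],
                      {"GetWeather": WeatherSkill()})
```

In this sketch the second device is selected before the intent is sent to the skill, mirroring the order in which the claim recites those steps. An input that matches no registered skill raises `KeyError`, and a missing display-capable device raises `LookupError`; the claim does not address either case.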