CPC G10L 15/22 (2013.01) [G10L 13/02 (2013.01); G10L 15/1815 (2013.01); G10L 15/30 (2013.01); G10L 15/32 (2013.01); G10L 2015/223 (2013.01)] | 20 Claims |
1. A computer-implemented method comprising:
receiving, from a voice-controlled device, first input audio data representing a command and a first indication of a first assistant requested for handling the command;
determining, in response to receiving the first indication, that a first command processing subsystem (CPS) corresponds to the first assistant;
performing speech processing on the first input audio data to determine first natural language understanding (NLU) result data including a first skill and a first intent;
determining a second CPS associated with the first skill;
sending, to a first component configured to generate data representing operations for responding to the command, a first identifier of the first CPS, the first NLU result data, and a second identifier of the second CPS;
sending, to the first component, policy data representing information related to interactions between the first CPS and second CPS;
determining, by the first component, output plan data corresponding to potential execution corresponding to the command, the output plan data comprising:
first plan data representing a handoff between the first CPS and the second CPS, the first plan data corresponding to:
a first message indicating the handoff,
a first operation to be executed by the second CPS in response to the command, and
a first score, and
second plan data representing termination of processing with regard to the command, the second plan data corresponding to:
a second message indicating the termination of processing, and
a second score;
sending the output plan data to a second component for augmenting the output plan data;
processing the first plan data using the second component to determine first text data corresponding to the first message;
processing the second plan data using the second component to determine second text data corresponding to the second message;
determining first augmented plan data comprising the first plan data and the first text data;
determining second augmented plan data comprising the second plan data and the second text data; and
based at least in part on the first score and the second score, causing further processing with regard to the first augmented plan data.
|
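The flow recited in claim 1 (plan generation, message augmentation, score-based selection) can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (`Plan`, `generate_plans`, `augment`, `select`, `respond`, the `handoff_allowed` policy key, the message templates) and the numeric scores are hypothetical and chosen only for illustration. The claim does not specify how scores are computed or how message text is produced.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Plan:
    """One candidate in the output plan data (hypothetical structure)."""
    kind: str                        # "handoff" or "terminate"
    message_key: str                 # identifies the message to render
    score: float                     # used to rank the candidate plans
    operation: Optional[str] = None  # operation for the second CPS, if any
    text: Optional[str] = None       # filled in by the augmenting component


# Assumed message templates; the claim only requires that text data
# corresponding to each message be determined.
TEMPLATES: Dict[str, str] = {
    "handoff_notice": "{first} is handing this request to {second}.",
    "cannot_help": "{first} cannot help with that request.",
}


def generate_plans(first_cps: str, second_cps: str,
                   nlu: Dict[str, str], policy: dict) -> List[Plan]:
    """'First component': build a handoff plan and a termination plan.

    The scoring rule is an assumption: the handoff scores high only when
    the policy data permits it between the two CPSs.
    """
    allowed = second_cps in policy.get(first_cps, {}).get("handoff_allowed", [])
    handoff_score = 0.9 if allowed else 0.1
    return [
        Plan("handoff", "handoff_notice", handoff_score,
             operation=f"{nlu['skill']}.{nlu['intent']}"),
        Plan("terminate", "cannot_help", 1.0 - handoff_score),
    ]


def augment(plan: Plan, first_cps: str, second_cps: str) -> Plan:
    """'Second component': attach text data for the plan's message."""
    text = TEMPLATES[plan.message_key].format(first=first_cps, second=second_cps)
    return replace(plan, text=text)


def select(plans: List[Plan]) -> Plan:
    """Pick the plan for further processing based on the scores."""
    return max(plans, key=lambda p: p.score)


def respond(first_cps: str, second_cps: str,
            nlu: Dict[str, str], policy: dict) -> Plan:
    """Run the three steps in order and return the chosen augmented plan."""
    plans = generate_plans(first_cps, second_cps, nlu, policy)
    augmented = [augment(p, first_cps, second_cps) for p in plans]
    return select(augmented)
```

In this sketch, calling `respond("assistant_a", "assistant_b", {"skill": "music", "intent": "PlayIntent"}, {"assistant_a": {"handoff_allowed": ["assistant_b"]}})` selects the handoff plan, while an empty policy selects the termination plan. Both plans are augmented before selection, mirroring the order of the claimed steps.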