| CPC G10L 15/32 (2013.01) [G10L 15/18 (2013.01); G10L 15/22 (2013.01); G10L 15/30 (2013.01)] | 17 Claims |

|
1. A system comprising:
at least one processor; and
memory storing instructions that, when executed, cause the at least one processor to be operable to:
receive audio data that captures a spoken utterance of a user, the audio data being generated by one or more microphones of a client device of the user, and the spoken utterance being directed to an automated assistant executed at least in part at the client device;
determine, based on processing the audio data, a plurality of first-party interpretations of the spoken utterance, each of the plurality of first-party interpretations being associated with a corresponding first-party predicted value indicative of a magnitude of confidence that each of the first-party interpretations is predicted to satisfy the spoken utterance;
identify a given third-party agent capable of satisfying the spoken utterance;
transmit, to the given third-party agent and over one or more networks, and based on processing the audio data, one or more structured requests that, when received, cause the given third-party agent to determine a plurality of third-party interpretations of the spoken utterance, each of the plurality of third-party interpretations being associated with a corresponding third-party predicted value indicative of a magnitude of confidence that each of the third-party interpretations is predicted to satisfy the spoken utterance;
receive, from the given third-party agent and over one or more of the networks, the plurality of third-party interpretations of the spoken utterance;
select, based on the corresponding first-party predicted values and the corresponding third-party predicted values, a given interpretation of the spoken utterance from among the plurality of first-party interpretations and the plurality of third-party interpretations;
cause the given third-party agent to satisfy the spoken utterance based on the given interpretation of the spoken utterance;
determine whether the given interpretation is one of the plurality of first-party interpretations or one of the plurality of third-party interpretations; and
in response to determining that the given interpretation is one of the plurality of first-party interpretations:
cause the automated assistant to provide, for presentation to the user of the client device, an indication that the given interpretation is one of the plurality of first-party interpretations; and
in response to determining that the given interpretation is one of the plurality of third-party interpretations:
cause the automated assistant to provide, for presentation to the user of the client device, an indication that the given interpretation is one of the plurality of third-party interpretations.
|
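The selection and indication steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the claim only requires that the selection be "based on" the predicted values, and the sketch assumes the simplest such rule, picking the candidate with the highest value. The `Interpretation` class, the `select_interpretation` and `source_indication` functions, the intent strings, and the numeric values are all hypothetical names and data introduced for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Interpretation:
    """One candidate reading of the spoken utterance (hypothetical structure)."""
    intent: str             # illustrative label for what the user is asking for
    predicted_value: float  # confidence that this reading satisfies the utterance
    source: str             # "first-party" or "third-party"


def select_interpretation(
    first_party: Sequence[Interpretation],
    third_party: Sequence[Interpretation],
) -> Interpretation:
    """Pick one interpretation from the pooled candidates.

    Assumption: the highest predicted value wins. Because max() returns the
    first maximal item, a tie is resolved in favor of the first-party list.
    """
    candidates = list(first_party) + list(third_party)
    if not candidates:
        raise ValueError("no interpretations to select from")
    return max(candidates, key=lambda c: c.predicted_value)


def source_indication(chosen: Interpretation) -> str:
    """Build the user-facing note stating where the chosen reading came from."""
    return f"Using the {chosen.source} interpretation: {chosen.intent}"


# Illustrative data only; these values do not come from the patent.
first_party = [
    Interpretation("play_song:Crazy", 0.71, "first-party"),
    Interpretation("play_album:Crazy", 0.42, "first-party"),
]
third_party = [
    Interpretation("play_artist:Crazy", 0.88, "third-party"),
]

chosen = select_interpretation(first_party, third_party)
print(source_indication(chosen))
```

With this data the third-party candidate has the highest predicted value, so the indication names the third-party interpretation. A real system could weight or calibrate the two sets of values before comparing them; the claim language does not specify how.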