US 11,869,496 B2
Information processing device and information processing method, and information processing system
Masahiro Hara, Tokyo (JP); and Shinpei Kameoka, Tokyo (JP)
Assigned to SONY CORPORATION, Tokyo (JP)
Appl. No. 17/055,140
Filed by Sony Corporation, Tokyo (JP)
PCT Filed Apr. 11, 2019, PCT No. PCT/JP2019/015875 § 371(c)(1), (2) Date Nov. 13, 2020, PCT Pub. No. WO2019/225201, PCT Pub. Date Nov. 28, 2019.
Claims priority of application No. 2018-100418 (JP), filed on May 25, 2018.
Prior Publication US 2021/0217414 A1, Jul. 15, 2021
Int. Cl. G10L 15/22 (2006.01); G10L 13/00 (2006.01); G10L 15/18 (2013.01)

CPC G10L 15/22 (2013.01) [G10L 13/00 (2013.01); G10L 15/1815 (2013.01)]

3 Claims

1. An information processing device comprising:

control circuitry configured to:

receive, from a remote device, a voice input from a user;

perform a voice recognition and a semantic analysis of the voice input from the user to create a semantically analyzed voice input;

output:

first information related to the voice input to a first external agent device, and

second information related to the voice input to a second external agent device,

wherein the first information is formed by voice synthesizing the semantically analyzed voice input, and

wherein the second information is the semantically analyzed voice input;

receive:

a first reply to the voice input from the first external agent device,

a second reply to the voice input from the second external agent device, and

a third reply to the voice input from a third external agent device that has independently received the voice input from the user,

wherein:

the first reply is a synthesized voice output from the first external agent device,

the second reply is a semantically analyzed reply that is output from the second external agent device, and

the third reply is a synthesized voice output from the third external agent device;

aggregate the first, second and third replies into an aggregated reply to the voice input; and

output the aggregated reply as a synthesized voice output to the remote device for relay to the user.
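
The data flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (Audio, Meaning, recognize, analyze, synthesize, handle_voice_input) is a hypothetical stand-in, audio is modeled as plain text, and the aggregation step (normalizing each reply to a meaning and joining them) is an assumption, since the claim does not say how the replies are combined.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Audio:
    """Hypothetical stand-in for a voice waveform; the words are kept as text."""
    spoken: str


@dataclass(frozen=True)
class Meaning:
    """Hypothetical stand-in for a semantically analyzed utterance."""
    intent: str


def recognize(audio: Audio) -> str:
    # Placeholder voice recognition: a real system would decode a waveform.
    return audio.spoken


def analyze(text: str) -> Meaning:
    # Placeholder semantic analysis: normalize whitespace and case only.
    return Meaning(intent=text.strip().lower())


def synthesize(meaning: Meaning) -> Audio:
    # Placeholder voice synthesis: wrap the meaning back into "audio".
    return Audio(spoken=meaning.intent)


def handle_voice_input(
    voice_input: Audio,
    first_agent: Callable[[Audio], Audio],
    second_agent: Callable[[Meaning], Meaning],
    third_reply: Audio,
) -> Audio:
    """Follow the order of operations recited in claim 1."""
    # Voice recognition and semantic analysis of the user's voice input.
    meaning = analyze(recognize(voice_input))

    # First information: synthesized voice, sent to the first agent,
    # which answers with synthesized voice.
    first_reply = first_agent(synthesize(meaning))

    # Second information: the semantically analyzed input itself, sent to
    # the second agent, which answers with a semantically analyzed reply.
    second_reply = second_agent(meaning)

    # The third reply arrives as voice from an agent that heard the user
    # directly, so nothing is sent to that agent here.

    # Assumed aggregation: bring every reply to a Meaning, then join them.
    parts = [
        analyze(recognize(first_reply)),
        second_reply,
        analyze(recognize(third_reply)),
    ]
    aggregated = Meaning(intent="; ".join(p.intent for p in parts))

    # The aggregated reply goes back to the remote device as synthesized voice.
    return synthesize(aggregated)
```

In this sketch the first and third agents exchange voice while the second exchanges analyzed meaning, which mirrors the three reply types the claim distinguishes. A caller would supply the two agent callables and the independently received third reply.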