US 12,002,460 B2
Information processing device, information processing system, and information processing method, and program
Chiaki Miyazaki, Tokyo (JP); Juri Yaeda, Tokyo (JP); and Saki Yokoyama, Tokyo (JP)
Assigned to SONY GROUP CORPORATION, Tokyo (JP)
Appl. No. 17/309,555
Filed by SONY GROUP CORPORATION, Tokyo (JP)
PCT Filed Oct. 10, 2019, PCT No. PCT/JP2019/039978
§ 371(c)(1), (2) Date Jun. 4, 2021,
PCT Pub. No. WO2020/121638, PCT Pub. Date Jun. 18, 2020.
Claims priority of application No. 2018-233645 (JP), filed on Dec. 13, 2018.
Prior Publication US 2022/0020369 A1, Jan. 20, 2022
Int. Cl. G10L 15/22 (2006.01); G06F 40/289 (2020.01); G10L 15/26 (2006.01); G10L 15/30 (2013.01)
CPC G10L 15/22 (2013.01) [G06F 40/289 (2020.01); G10L 15/26 (2013.01); G10L 15/30 (2013.01)]
12 Claims
 
1. An information processing device comprising:
a central processing unit (CPU) configured to:
determine an utterance type of a user utterance;
generate a system response according to a type determination result associated with the determined utterance type of the user utterance,
wherein the type determination result associated with the determined utterance type of the user utterance includes:
a (Type A) user utterance that requests all reutterances of a system utterance immediately before the user utterance,
a (Type B) user utterance that requests a reutterance of a part of the system utterance immediately before the user utterance, and
a (Other type) user utterance of a task request type;
generate a system response to reutter all system utterances immediately before the user utterance in a case where the user utterance is of the type A;
generate a system response to reutter a part of system utterances immediately before the user utterance in a case where the user utterance is of the type B; and
generate a system response as a result of executing a requested task of the user utterance in a case where the user utterance is of the other type,
wherein the requested task is determined based on:
a system utterance immediately before the user utterance, and
a past dialog history including at least one of: a set of past user utterances, a set of past system responses corresponding to the set of user utterances, and a set of past execution results corresponding to the set of user utterances.
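
Illustrative note (not part of the claim): the control flow recited in claim 1 can be sketched as a classify-then-dispatch routine. The sketch below is a minimal illustration under stated assumptions, not the patented implementation. All names (`UtteranceType`, `Turn`, `determine_utterance_type`, `reutter_part`, `execute_task`, `generate_system_response`) are hypothetical, the keyword rules are toy stand-ins because the claim does not say how the utterance type is determined, and the task executor is a placeholder that only reports the context it receives.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class UtteranceType(Enum):
    TYPE_A = "reutter_all"    # repeat the whole preceding system utterance
    TYPE_B = "reutter_part"   # repeat part of the preceding system utterance
    OTHER = "task_request"    # ordinary task request


@dataclass
class Turn:
    """One past exchange: user utterance, system response, execution result."""
    user_utterance: str
    system_response: str
    execution_result: Optional[str] = None


def determine_utterance_type(user_utterance: str) -> UtteranceType:
    # Hypothetical keyword rules; the claim does not specify the classifier.
    text = user_utterance.lower().strip(" ?.!")
    if text in {"say that again", "pardon", "repeat that"}:
        return UtteranceType.TYPE_A
    if text.startswith("what did you say after "):
        return UtteranceType.TYPE_B
    return UtteranceType.OTHER


def reutter_part(previous_system_utterance: str, user_utterance: str) -> str:
    # Hypothetical rule: return whatever followed the phrase the user names.
    key = user_utterance.lower().strip(" ?.!").split("after ", 1)[1]
    idx = previous_system_utterance.lower().find(key)
    if idx < 0:
        return previous_system_utterance  # phrase not found: repeat everything
    return previous_system_utterance[idx + len(key):].strip(" ,.")


def execute_task(user_utterance: str, previous_system_utterance: str,
                 history: List[Turn]) -> str:
    # Placeholder executor: it only reports the context it was given, namely
    # the preceding system utterance and the past dialog history.
    return (f"[result for '{user_utterance}' | previous system utterance: "
            f"'{previous_system_utterance}' | past turns: {len(history)}]")


def generate_system_response(user_utterance: str,
                             previous_system_utterance: str,
                             history: List[Turn]) -> str:
    utterance_type = determine_utterance_type(user_utterance)
    if utterance_type is UtteranceType.TYPE_A:
        return previous_system_utterance
    if utterance_type is UtteranceType.TYPE_B:
        return reutter_part(previous_system_utterance, user_utterance)
    return execute_task(user_utterance, previous_system_utterance, history)


if __name__ == "__main__":
    previous = "Beat the egg, then add salt and stir."
    print(generate_system_response("Say that again.", previous, []))
    print(generate_system_response("What did you say after the egg?", previous, []))
    print(generate_system_response("How much salt?", previous, []))
```

In this sketch the Type A and Type B branches read only the immediately preceding system utterance, while the other-type branch passes both that utterance and the dialog history to the task executor, mirroring the final "wherein" clause of the claim.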