US 12,424,237 B2
Tag estimation device, tag estimation method, and program
Ryo Masumura, Tokyo (JP); and Tomohiro Tanaka, Tokyo (JP)
Assigned to NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
Filed by NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
Filed on May 7, 2024, as Appl. No. 18/657,584.
Application 18/657,584 is a continuation of application No. 17/279,009, granted, now Pat. No. 12,002,486, previously published as PCT/JP2019/036005, filed on Sep. 13, 2019.
Claims priority of application No. 2018-180018 (JP), filed on Sep. 26, 2018.
Prior Publication US 2024/0290344 A1, Aug. 29, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 25/30 (2013.01); G10L 15/02 (2006.01); G10L 15/10 (2006.01); G10L 25/48 (2013.01)

CPC G10L 25/30 (2013.01) [G10L 15/02 (2013.01); G10L 15/10 (2013.01); G10L 25/48 (2013.01)]

9 Claims

1. A tag estimation device comprising:

a hardware processor that:

generates, by a model, based on an input, a first utterance sequence information of an utterance in a sequence of utterances in a dialogue, wherein

the input comprises a combined utterance information of the utterance and a second utterance sequence information,

the combined utterance information of the utterance comprises added pieces of utterance word feature information of the utterance and speaker information of the utterance,

the second utterance sequence information comprises recursively added pieces of utterance word feature information of respective utterances in the sequence of utterances up to an immediately preceding utterance of the utterance in the sequence of utterances and pieces of speaker information of the respective utterances in the sequence of utterances up to the immediately preceding utterance of the utterance in the sequence of utterances,

the first utterance sequence information thereby represents recursively added pieces of utterance information of utterances up to the utterance in the sequence of utterances in the dialogue,

the utterance word feature information is associated with at least a word in the utterance spoken by a speaker, and

the speaker information of the utterance is associated with the speaker; and

determines a tag associated with the utterance, wherein the tag represents a result of analyzing the utterance from a predetermined model parameter and the first utterance sequence information.
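
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and everything in it is an assumption chosen for illustration: the tag set, the speaker inventory, the vector sizes, the hashed bag-of-words used as utterance word feature information, the one-hot speaker information, the concatenation used to combine them, the single tanh recurrence standing in for "the model", and the randomly initialised matrices standing in for the "predetermined model parameter". The sketch only shows how a recursively updated state can carry word and speaker information from all preceding utterances into the tag decision for the current utterance.

```python
import math
import random

# All names and sizes below are hypothetical; the claim does not specify them.
TAGS = ["question", "answer", "other"]
SPEAKERS = ["agent", "customer"]
WORD_DIM = 8     # size of the utterance word feature vector
STATE_DIM = 6    # size of the utterance sequence information vector


def word_features(utterance):
    """Utterance word feature information (assumed: hashed bag-of-words counts)."""
    vec = [0.0] * WORD_DIM
    for word in utterance.lower().split():
        # Deterministic toy hash: sum of character codes modulo WORD_DIM.
        vec[sum(ord(c) for c in word) % WORD_DIM] += 1.0
    return vec


def speaker_features(speaker):
    """Speaker information of the utterance (assumed: one-hot over SPEAKERS)."""
    return [1.0 if s == speaker else 0.0 for s in SPEAKERS]


def make_params(seed=0):
    """Stand-in for predetermined (already trained) parameters; random here."""
    rng = random.Random(seed)
    in_dim = WORD_DIM + len(SPEAKERS) + STATE_DIM
    W = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(STATE_DIM)]
    V = [[rng.uniform(-0.5, 0.5) for _ in range(STATE_DIM)] for _ in range(len(TAGS))]
    return W, V


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def step(W, utterance, speaker, prev_state):
    """Produce the first utterance sequence information from the combined
    utterance information and the second (preceding) sequence information."""
    combined = word_features(utterance) + speaker_features(speaker)
    return [math.tanh(z) for z in matvec(W, combined + prev_state)]


def estimate_tags(dialogue, params):
    """Return one tag per (speaker, utterance) pair, in dialogue order."""
    W, V = params
    state = [0.0] * STATE_DIM          # nothing has been said yet
    tags = []
    for speaker, utterance in dialogue:
        state = step(W, utterance, speaker, state)
        scores = matvec(V, state)      # one score per candidate tag
        tags.append(TAGS[scores.index(max(scores))])
    return tags
```

A call such as `estimate_tags([("agent", "how can I help"), ("customer", "my router is down")], make_params())` returns one tag per utterance. Because the parameters are random rather than trained, the tags themselves carry no meaning; the point is that each decision depends on the current utterance, its speaker, and the accumulated state of every earlier utterance.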