US 12,462,113 B2
Embedding an unknown word in an input sequence
Makoto Morishita, Tokyo (JP); Jun Suzuki, Tokyo (JP); Sho Takase, Tokyo (JP); Hidetaka Kamigaito, Tokyo (JP); and Masaaki Nagata, Tokyo (JP)
Assigned to NTT, Inc.
Appl. No. 17/610,589
Filed by NTT, Inc., Tokyo (JP)
PCT Filed May 21, 2019, PCT No. PCT/JP2019/020174
§ 371(c)(1), (2) Date Nov. 11, 2021.
PCT Pub. No. WO2020/235024, PCT Pub. Date Nov. 26, 2020.
Prior Publication US 2022/0215182 A1, Jul. 7, 2022
Int. Cl. G06F 40/44 (2020.01); G06F 40/30 (2020.01); G06N 3/045 (2023.01)
CPC G06F 40/44 (2020.01) [G06F 40/30 (2020.01); G06N 3/045 (2023.01)] 5 Claims
OG exemplary drawing
 
1. An information learning apparatus comprising:
a memory and a processor configured to:
generate, for each of processing units constituting an input sequence included in training data, a third embedded vector by adding a second embedded vector corresponding to an unknown word to a first embedded vector of each processing unit regardless of whether said each processing unit represents the unknown word or not;
execute, using training data, a process based on a learning target parameter of a sequence-to-sequence model to generate an inference result, with the third embedded vector generated for said each processing unit as an input; and
learn the learning target parameter of the sequence-to-sequence model, wherein the learning target parameter is based on an error between the inference result of the execution and a ground truth output corresponding to the input sequence in the training data, wherein a learnt neural network with the learnt learning target parameter converts an input statement into one or more output words, and the input statement comprises the unknown word.
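The embedding step recited in claim 1 can be sketched as follows. The key idea is that a second embedded vector (the unknown-word embedding) is added to the first embedded vector of every processing unit, whether or not that unit is the unknown word, producing the third embedded vector fed to the sequence-to-sequence model. The vocabulary size, the `UNK_ID` index, and the random initialization below are illustrative assumptions, not details from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB_SIZE = 10   # toy vocabulary size (hypothetical)
EMBED_DIM = 4     # toy embedding width (hypothetical)
UNK_ID = 0        # index assumed reserved for the unknown-word token

# First embedded vectors: one row per vocabulary entry
# (in practice these are learning target parameters of the model).
embedding_table = rng.normal(size=(VOCAB_SIZE, EMBED_DIM))

# Second embedded vector: the embedding corresponding to the unknown word.
unk_vector = embedding_table[UNK_ID]

def third_embedded_vectors(token_ids: np.ndarray) -> np.ndarray:
    """For each processing unit in the input sequence, add the
    unknown-word embedding to that unit's own embedding, regardless
    of whether the unit represents the unknown word or not."""
    first = embedding_table[token_ids]   # first embedded vectors
    return first + unk_vector            # third embedded vectors

# Example input sequence of processing units; index 0 is the unknown word.
tokens = np.array([3, 0, 7])
third = third_embedded_vectors(tokens)
print(third.shape)  # (3, 4): one third embedded vector per processing unit
```

In a full implementation, `third` would be the input to the sequence-to-sequence model, and both the embedding table and the model parameters would be updated from the error between the inference result and the ground-truth output, as the claim recites.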