US 11,954,432 B2
Symbol sequence generation apparatus, text compression apparatus, symbol sequence generation method and program
Hidetaka Kamigaito, Tokyo (JP); Masaaki Nagata, Tokyo (JP); and Tsutomu Hirao, Tokyo (JP)
Assigned to NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
Appl. No. 16/976,932
Filed by NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
PCT Filed Feb. 14, 2019, PCT No. PCT/JP2019/005283
§ 371(c)(1), (2) Date Aug. 31, 2020,
PCT Pub. No. WO2019/167642, PCT Pub. Date Sep. 6, 2019.
Claims priority of application No. 2018-037919 (JP), filed on Mar. 2, 2018.
Prior Publication US 2021/0012069 A1, Jan. 14, 2021
Int. Cl. G06F 40/20 (2020.01); G06F 40/47 (2020.01); G06N 3/04 (2023.01)
CPC G06F 40/20 (2020.01) [G06F 40/47 (2020.01); G06N 3/04 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A computer-implemented method for processing a set of symbol data in a sentence, the method comprising:
receiving a first sequence of symbol data, the first sequence of symbol data representing an input sentence, wherein the first sequence of symbol data includes first symbol data;
generating, based on encoding one or more symbol data of the received first sequence of symbol data, a first hidden state of a neural network;
retrieving a predetermined set of symbol data from a memory, wherein the predetermined set of symbol data includes a dependency structure tree of symbol data, and the dependency structure tree of symbol data includes the first symbol data in the first sequence of symbol data and another symbol data as a parent of the first symbol data in the dependency structure tree;
generating, based on weighting the generated first hidden state, a second hidden state of the neural network, wherein the weighting relates at least to a probability of the first symbol data in the first sequence of symbol data being distinct from said another symbol data as the parent in the dependency structure tree of symbol data, and the weighting is based on training attention data of symbol data using a sequence of training weight data according to a predetermined dependency tree of symbol data for training;
generating a third hidden state of the neural network based at least on a combination of:
the first symbol data in the first sequence of symbol data,
a second symbol data in a second sequence of symbol data, the second symbol data preceding the first symbol data in a sequence of symbol data, and
the generated second hidden state;
generating, based on the second hidden state and the third hidden state, a third symbol data in the second sequence of symbol data, wherein the third symbol data is subsequent to the second symbol data;
generating the second sequence of symbol data, wherein the second sequence of symbol data represents a sequence of labels for removing one or more symbol data in the first sequence of symbol data;
generating an output sequence of symbol data based on the removal of the one or more symbol data from the first sequence of symbol data according to the generated second sequence of symbol data; and
transmitting the output sequence of symbol data to an application configured to display an output sentence according to the output sequence of symbol data.
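The claimed pipeline can be illustrated by a toy sketch: encode the input symbols into first hidden states, weight them with attention informed by a dependency structure tree to obtain second hidden states, decode a KEEP/DELETE label per symbol (the second sequence), and emit the compressed output sequence. All function names, the toy scoring, and the untrained random "embeddings" below are illustrative assumptions, not the patented implementation.

```python
import numpy as np

def encode(tokens, dim=4, seed=0):
    """First hidden states: one vector per input symbol (toy embedding)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((len(tokens), dim))

def dependency_attention(h, parents):
    """Second hidden states: weight the first hidden states by attention
    toward each symbol's parent in the dependency structure tree.
    parents[i] is the head index of token i (-1 marks the root)."""
    out = np.empty_like(h)
    for i, p in enumerate(parents):
        if p < 0:                       # root keeps its own state
            out[i] = h[i]
        else:
            scores = h @ h[p]           # similarity of every state to the parent state
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()    # softmax attention weights
            out[i] = weights @ h        # weighted combination of first hidden states
    return out

def decode_labels(h2, keep_bias=0.0):
    """Second sequence of symbol data: a KEEP/DELETE label per position
    (toy scorer standing in for the third-hidden-state decoder)."""
    scores = h2.sum(axis=1) + keep_bias
    return ["KEEP" if s > 0 else "DELETE" for s in scores]

def compress(tokens, parents):
    h1 = encode(tokens)
    h2 = dependency_attention(h1, parents)
    labels = decode_labels(h2)
    # Output sequence: input symbols whose label is KEEP, in order
    return [t for t, lab in zip(tokens, labels) if lab == "KEEP"], labels

tokens = ["the", "very", "old", "dog", "barked", "loudly"]
parents = [3, 2, 3, 4, -1, 4]           # toy head indices for each token
output, labels = compress(tokens, parents)
print(" ".join(output))
```

Because the "model" here is untrained noise, the particular labels are meaningless; the sketch only shows how the dependency tree enters the weighting and how the label sequence drives deletion-based compression.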