CPC G10L 15/16 (2013.01) [G10L 15/02 (2013.01); G10L 15/142 (2013.01); G10L 2015/025 (2013.01)] | 20 Claims |
1. A computer-implemented method when executed on data processing hardware causes the data processing hardware to perform operations comprising:
receiving, as input to a long short-term memory (LSTM) neural network, a training acoustic feature representation;
generating, as an output from the LSTM neural network, a probability distribution over possible phoneme subdivisions by processing the training acoustic feature representation;
identifying, from the probability distribution over possible phoneme subdivisions, a threshold number of highest-scoring phoneme subdivisions; and
executing a backpropagation through time training process to determine trained values of parameters of the LSTM neural network using the threshold number of highest-scoring phoneme subdivisions identified from the probability distribution over possible phoneme subdivisions.
|
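The identifying step in claim 1 (selecting a threshold number of highest-scoring phoneme subdivisions from the output distribution) can be illustrated with a minimal sketch. The claim does not state how the selected subdivisions enter the training process, so this sketch assumes one possible reading: the cross-entropy gradient at the output layer is kept only for the top-k units, plus the labeled target unit. The names `softmax`, `top_k_indices`, `sparse_output_gradient`, and the example values are hypothetical, and the LSTM itself and the backward pass through time are omitted.

```python
import math


def softmax(logits):
    """Turn raw output scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_indices(probs, k):
    """Indices of the k highest-probability entries, best first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]


def sparse_output_gradient(logits, target, k):
    """Cross-entropy gradient w.r.t. the logits for one frame.

    Assumption (not from the claim): gradient entries are kept only for
    the top-k units and the target unit; all other entries are zeroed.
    """
    probs = softmax(logits)
    keep = set(top_k_indices(probs, k)) | {target}
    grad = [0.0] * len(logits)
    for i in keep:
        grad[i] = probs[i] - (1.0 if i == target else 0.0)
    return grad, sorted(keep)


# Hypothetical scores for five phoneme subdivisions at one time step.
logits = [2.0, 0.5, -1.0, 1.5, 0.0]
grad, kept = sparse_output_gradient(logits, target=3, k=2)
```

In a full training loop, a gradient like `grad` would be computed at each time step and propagated backward through the unrolled LSTM; restricting it to a few output units reduces the work done at a large output layer.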