US 11,721,327 B2
Generating representations of acoustic sequences
Hasim Sak, Santa Clara, CA (US); and Andrew W. Senior, New York, NY (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Jan. 8, 2021, as Appl. No. 17/145,208.
Application 17/145,208 is a continuation of application No. 16/704,799, filed on Dec. 5, 2019, granted, now 10,923,112.
Application 16/704,799 is a continuation of application No. 16/179,801, filed on Nov. 2, 2018, granted, now 10,535,338, issued on Jan. 14, 2020.
Application 16/179,801 is a continuation of application No. 15/664,153, filed on Jul. 31, 2017, granted, now 10,134,393, issued on Nov. 20, 2018.
Application 15/664,153 is a continuation of application No. 14/559,113, filed on Dec. 3, 2014, granted, now 9,721,562, issued on Aug. 1, 2017.
Claims priority of provisional application 61/917,089, filed on Dec. 17, 2013.
Prior Publication US 2021/0134275 A1, May 6, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 15/00 (2013.01); G10L 15/16 (2006.01); G10L 15/02 (2006.01); G10L 15/14 (2006.01)
CPC G10L 15/16 (2013.01) [G10L 15/02 (2013.01); G10L 15/142 (2013.01); G10L 2015/025 (2013.01)] 20 Claims
 
1. A computer-implemented method that, when executed on data processing hardware, causes the data processing hardware to perform operations comprising:
receiving, as input to a long short-term memory (LSTM) neural network, a training acoustic feature representation;
generating, as an output from the LSTM neural network, a probability distribution over possible phoneme subdivisions by processing the training acoustic feature representation;
identifying, from the probability distribution over possible phoneme subdivisions, a threshold number of highest-scoring phoneme subdivisions; and
executing a backpropagation through time training process to determine trained values of parameters of the LSTM neural network using the threshold number of highest-scoring phoneme subdivisions identified from the probability distribution over possible phoneme subdivisions.
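
The generating and identifying steps recited above can be illustrated with a minimal sketch. It assumes the LSTM output layer yields one raw score per phoneme subdivision for a frame and that a softmax turns those scores into the probability distribution; the claim does not name either detail. The function names, the example scores, and the threshold of 2 are hypothetical and are not taken from the patent. The claim does not state how the selected subdivisions enter the backpropagation through time process, so the sketch stops at selection.

```python
import math


def softmax(scores):
    """Convert raw output scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_subdivisions(probs, k):
    """Return (index, probability) pairs for the k highest-scoring entries."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Hypothetical output scores for one acoustic frame,
# one value per possible phoneme subdivision (assumed, not from the patent).
frame_scores = [2.0, 0.5, -1.0, 3.0, 0.0]

probs = softmax(frame_scores)
selected = top_k_subdivisions(probs, k=2)  # threshold number assumed to be 2

for index, p in selected:
    print(f"subdivision {index}: probability {p:.3f}")
```

In this sketch the threshold number is a plain integer argument, and sorting is used instead of a partial selection because the number of subdivisions in the example is small.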