CPC G10L 15/16 (2013.01) [G06N 3/08 (2013.01); G10L 15/02 (2013.01); G10L 15/063 (2013.01); G10L 15/22 (2013.01); G10L 25/30 (2013.01); G10L 2015/025 (2013.01); G10L 15/26 (2013.01)] | 20 Claims |
11. A system comprising:
data processing hardware; and
memory hardware in communication with the data processing hardware and storing instructions that, when executed on the data processing hardware, cause the data processing hardware to perform operations comprising:
obtaining an n-best list of decoded speech recognition hypotheses for a training utterance;
training, using a loss function having a minimum word error rate (MWER) criterion, a recurrent neural network model by determining a word error rate expectation for the training utterance that is restricted to the n-best list of decoded speech recognition hypotheses for the training utterance; and
generating, using the trained recurrent neural network model, a transcription for audio data indicating acoustic characteristics of an utterance.
|
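As an illustration of the training step recited in claim 11, the following is a minimal sketch of a word error rate expectation restricted to an n-best list. It is not taken from the patent: the function names `word_errors` and `expected_wer`, the representation of the n-best list as (hypothesis, log-probability) pairs, and the renormalization of hypothesis probabilities over the n-best list are all assumptions made for the example. The claim itself does not specify how the expectation is computed.

```python
import math


def word_errors(hyp, ref):
    """Word-level edit distance between two token lists (hypothetical helper)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            curr.append(min(
                prev[j] + 1,             # hypothesis word h left unmatched
                curr[j - 1] + 1,         # reference word r left unmatched
                prev[j - 1] + (h != r),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def expected_wer(nbest, reference):
    """Expected word error rate over an n-best list.

    nbest: list of (hypothesis_text, log_probability) pairs.
    Assumption: probabilities are renormalized so they sum to 1 over the
    n-best list, which restricts the expectation to those hypotheses.
    """
    ref_tokens = reference.split()
    n_ref = max(len(ref_tokens), 1)  # guard against an empty reference

    # Softmax over the n-best log-probabilities (shifted for stability).
    max_lp = max(lp for _, lp in nbest)
    weights = [math.exp(lp - max_lp) for _, lp in nbest]
    total = sum(weights)

    return sum(
        (w / total) * word_errors(hyp.split(), ref_tokens) / n_ref
        for (hyp, _), w in zip(nbest, weights)
    )


# Example with made-up hypotheses and scores for one training utterance.
nbest = [
    ("the cat sat", math.log(0.5)),
    ("the cat sad", math.log(0.3)),
    ("a cat sat down", math.log(0.2)),
]
loss = expected_wer(nbest, "the cat sat")
```

In a real training setup this quantity would be computed with differentiable log-probabilities produced by the recurrent neural network model, so that its gradient can update the model parameters. The pure-Python version above only shows the arithmetic of the expectation.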