CPC G10L 15/197 (2013.01) [G10L 15/02 (2013.01); G10L 15/05 (2013.01); G10L 15/22 (2013.01); G10L 15/30 (2013.01)] | 20 Claims |
1. A system comprising:
a processing unit; and
a storage device including program code that when executed by the processing unit causes the system to:
collect a first batch comprising a first number of raw acoustic feature frames of the audio signal, the first number equal to a first batch size;
input the first batch to a speech recognition network;
in response to a word hypothesis output by the speech recognition network, collect a second batch comprising a second number of acoustic feature frames of the audio signal, the second number equal to a second batch size; and
input the second batch to the speech recognition network.
|
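For illustration only, the control flow recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `run_variable_batch`, `make_stub_recognizer`, and `recognize` are hypothetical, the speech recognition network is replaced by a stub that returns either a word hypothesis or `None`, and the batch sizes used are arbitrary. The sketch also assumes that the second batch size stays in effect after the first word hypothesis and that trailing frames too few to fill a batch are discarded; the claim does not specify either behavior.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

Frame = Sequence[float]  # one acoustic feature frame (hypothetical representation)


def run_variable_batch(
    frames: Iterable[Frame],
    recognize: Callable[[List[Frame]], Optional[str]],
    first_batch_size: int,
    second_batch_size: int,
) -> List[Tuple[int, Optional[str]]]:
    """Feed frames to `recognize` in batches, switching batch size after a word hypothesis.

    Returns one (batch_length, hypothesis_or_None) pair per batch that was input.
    """
    batch_size = first_batch_size
    batch: List[Frame] = []
    log: List[Tuple[int, Optional[str]]] = []

    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            # Input the collected batch to the (stand-in) recognition network.
            word = recognize(batch)
            log.append((len(batch), word))
            batch = []
            if word is not None:
                # In response to a word hypothesis, collect later batches
                # using the second batch size.
                batch_size = second_batch_size

    # Assumption: an incomplete final batch is dropped rather than padded.
    return log


def make_stub_recognizer(hypothesis_on_call: int, word: str) -> Callable[[List[Frame]], Optional[str]]:
    """Hypothetical stand-in for the network: emits `word` on the given call, else None."""
    calls = {"n": 0}

    def recognize(batch: List[Frame]) -> Optional[str]:
        calls["n"] += 1
        return word if calls["n"] == hypothesis_on_call else None

    return recognize


if __name__ == "__main__":
    frames = [[float(i)] for i in range(10)]
    stub = make_stub_recognizer(hypothesis_on_call=1, word="hello")
    for size, hyp in run_variable_batch(frames, stub, first_batch_size=4, second_batch_size=2):
        print(size, hyp)
```

In this sketch the first batch holds four frames; once the stub outputs a word hypothesis, each later batch holds two frames. A real system would substitute an actual network for the stub and would choose batch sizes according to its own latency and accuracy requirements.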