CPC G10L 15/16 (2013.01) [G06N 3/045 (2023.01); G06N 3/084 (2013.01); G10L 15/063 (2013.01)] | 20 Claims |
1. A method comprising:
receiving, by one or more processors, an audio file comprising speech input;
processing, by a speech recognition engine, the audio file comprising the speech input to generate an initial character-based representation of the speech input, the initial character-based representation being generated based on an association between an individual point in time of the speech input with a first probability that the speech input corresponds to a first character of a plurality of individual characters and an association between the individual point in time of the speech input with a second probability that the speech input corresponds to a second character of the plurality of individual characters;
processing, by an intent classifier, the initial character-based representation of the speech input to generate an estimated intent of the speech input; and
generating, by the speech recognition engine, a textual representation of the speech input based on the estimated intent of the speech input.
|
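The steps of claim 1 can be sketched as a small pipeline. This is an illustrative sketch only, not the claimed implementation: the CTC-style blank symbol, the greedy decoding, the character trigram intent scorer, the per-intent vocabularies, and every name below (`FRAMES`, `INTENT_EXAMPLES`, `INTENT_VOCAB`, `greedy_characters`, `estimate_intent`, `generate_text`, `transcribe`) are assumptions introduced for illustration. Each frame associates one point in time with probabilities for several candidate characters, as the claim describes.

```python
import difflib
from typing import Dict, List

Frame = Dict[str, float]  # character -> probability at one point in time
BLANK = "_"               # assumed CTC-style "no character" symbol

# Hypothetical recognizer output for a short utterance.
FRAMES: List[Frame] = [
    {"p": 0.9, "b": 0.1},
    {"a": 0.8, "e": 0.2},
    {"y": 0.7, "i": 0.3},
    {" ": 0.9, BLANK: 0.1},
    {"b": 0.6, "p": 0.4},
    {"i": 0.55, "e": 0.45},
    {"l": 0.9, BLANK: 0.1},
    {"l": 0.8, BLANK: 0.2},
]

# Hypothetical example phrases and vocabularies for each intent.
INTENT_EXAMPLES = {
    "play_music": ["play song", "play music"],
    "pay_bill": ["pay bill", "pay my bill"],
}
INTENT_VOCAB = {
    "play_music": ["play", "song", "music"],
    "pay_bill": ["pay", "bill", "my"],
}


def greedy_characters(frames: List[Frame]) -> str:
    """Initial character-based representation: take the most probable
    character per frame, collapse consecutive repeats, drop blanks."""
    out, prev = [], None
    for frame in frames:
        best = max(frame, key=frame.get)
        if best != prev and best != BLANK:
            out.append(best)
        prev = best
    return "".join(out)


def char_ngrams(text: str, n: int = 3) -> set:
    """Set of character n-grams of the text (the whole text if shorter than n)."""
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}


def estimate_intent(chars: str) -> str:
    """Estimated intent: the intent whose example phrases share the most
    character trigrams with the initial representation."""
    grams = char_ngrams(chars)

    def score(intent: str) -> int:
        return max(len(grams & char_ngrams(ex)) for ex in INTENT_EXAMPLES[intent])

    return max(INTENT_EXAMPLES, key=score)


def generate_text(chars: str, intent: str) -> str:
    """Textual representation: snap each word to the closest word in the
    vocabulary of the estimated intent, keeping the word if nothing is close."""
    words = []
    for word in chars.split():
        match = difflib.get_close_matches(word, INTENT_VOCAB[intent], n=1, cutoff=0.6)
        words.append(match[0] if match else word)
    return " ".join(words)


def transcribe(frames: List[Frame]) -> str:
    chars = greedy_characters(frames)    # "pay bil"
    intent = estimate_intent(chars)      # "pay_bill"
    return generate_text(chars, intent)


print(transcribe(FRAMES))  # pay bill
```

In this toy run the two adjacent "l" frames collapse into one, so the initial representation is "pay bil"; the estimated intent then lets the final step recover "pay bill". A real system would likely use trained models for both the recognizer and the intent classifier.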