| CPC G10L 15/16 (2013.01) [G10L 15/02 (2013.01); G10L 15/063 (2013.01); G10L 15/30 (2013.01); G10L 2015/022 (2013.01)] | 20 Claims |

|
1. A computer-implemented method comprising:
generating a set of language data candidates, each language data candidate comprising one or more graphemes, by processing a sequence of phonemes related to user-provided domain-specific input speech data using an artificial intelligence-based data conversion model comprising a neural network model sharing a prediction network from a recurrent neural network transducer;
determining, for a target pair of one or more phonemes and one or more graphemes, a subset of graphemes from the set of language data candidates;
training at least one biasing language model using at least a portion of the subset of graphemes;
generating a first speech recognition output by processing the at least a portion of the subset of graphemes using the at least one biasing language model and an artificial intelligence-based speech recognition model comprising the recurrent neural network transducer, including the prediction network shared by the artificial intelligence-based data conversion model;
generating a second speech recognition output by replacing at least a portion of the subset of graphemes in the first speech recognition output with at least one of the one or more graphemes from the target pair; and
performing one or more automated actions based at least in part on the second speech recognition output;
wherein the method is carried out by at least one computing device.
|
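The replacing step of claim 1, which derives the second speech recognition output from the first, can be illustrated with a minimal sketch. It assumes the outputs are plain text strings and that the subset of graphemes is a list of candidate spellings. The function name `replace_candidates`, the whole-word case-insensitive matching, and the sample spellings are hypothetical choices made for illustration; the claim does not specify them, and the sketch does not cover the neural network models or the biasing language model.

```python
import re
from typing import Iterable


def replace_candidates(first_output: str, candidates: Iterable[str], target: str) -> str:
    """Replace each whole-word occurrence of a candidate spelling with the target spelling.

    Hypothetical helper: the matching strategy is an assumption, not taken from the claim.
    """
    # Try longer candidates first so a multi-word spelling wins over a shorter one it contains.
    ordered = sorted({c for c in candidates if c and c != target}, key=len, reverse=True)
    if not ordered:
        return first_output
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(c) for c in ordered) + r")(?!\w)",
        flags=re.IGNORECASE,
    )
    # A function replacement keeps the target literal (no backslash or group expansion).
    return pattern.sub(lambda _match: target, first_output)


# Illustrative data only: candidate spellings a recognizer might emit for one phoneme sequence.
first = "please call kovid support about the co vid policy"
second = replace_candidates(first, ["kovid", "co vid"], "COVID")
```

Here `second` plays the role of the second speech recognition output on which a downstream automated action would be based.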