| CPC G10L 15/01 (2013.01) [G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/197 (2013.01)] | 10 Claims |

|
1. A speech recognition model generating device for generating an end-to-end (E2E) speech recognition model using calibration correction, comprising:
an acoustic model including a first artificial neural network module using a speech information as input information and using a first text information corresponding to the speech information as output information,
a language model comprising a second artificial neural network module using the first text information as input information and outputting a second text information corresponding to the first text information as output information based on characteristics of the language model, and
a processor configured to generate a coupling probability distribution based on a first probability distribution information of an acoustic model output by the acoustic model and a second probability distribution information of a language model output by the language model, and to generate the E2E speech recognition model based on the coupling probability distribution,
wherein the processor is further configured to generate the E2E speech recognition model based on a corrected acoustic model and a corrected language model after each calibration is performed on the acoustic model and the language model.
|
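
The data flow recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify the calibration method or the coupling rule, so everything below is an assumption made for illustration only: temperature scaling stands in for "calibration", a log-linear (shallow-fusion style) combination stands in for the "coupling probability distribution", and the names `softmax`, `couple`, `lm_weight`, and the toy logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution.

    Assumption: temperature scaling is used as a stand-in for the
    calibration step; the claim does not name a calibration method.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def couple(am_probs, lm_probs, lm_weight=0.5):
    """Combine two distributions over the same tokens.

    Assumption: a log-linear combination, renormalized so the result
    sums to 1. The claim does not define the coupling rule.
    """
    scores = [
        math.log(a) + lm_weight * math.log(l)
        for a, l in zip(am_probs, lm_probs)
    ]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores over a three-token vocabulary.
am_logits = [4.0, 1.0, 0.5]  # acoustic model output (first distribution)
lm_logits = [0.2, 2.5, 0.1]  # language model output (second distribution)

# Calibrate each model separately, then couple the corrected outputs.
am_cal = softmax(am_logits, temperature=2.0)
lm_cal = softmax(lm_logits, temperature=1.5)
coupled = couple(am_cal, lm_cal, lm_weight=0.5)

print(coupled)
```

In this sketch a temperature above 1 flattens an overconfident distribution before the two models are combined, which is one common reason to calibrate each model separately; the temperatures and the weight would normally be tuned on held-out data rather than fixed by hand.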