CPC G10L 15/26 (2013.01) [G10L 15/1822 (2013.01)] | 20 Claims |
1. A computer-implemented method, comprising:
receiving first input data representing speech;
processing the first input data using a spoken language understanding (SLU) component, the SLU component configured:
to perform an audio-to-text processing task, and
to perform an audio-to-semantic meaning data task,
wherein the SLU component is trained using a first training dataset including masked automatic speech recognition (ASR) data comprising first masked ASR data corresponding to a first spoken input, wherein at least one word of the first spoken input is masked in the first ASR data;
determining, based on processing the first input data using the SLU component, first data representing a semantic meaning corresponding to the first input data; and
determining, using the first data, first output data responsive to the first input data.
|
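The training-data limitation in claim 1 (masked ASR data in which at least one word of the spoken input is masked) can be illustrated with a minimal sketch. The claim does not specify a mask token, a masking rate, or a data layout, so `MASK_TOKEN`, `mask_prob`, `mask_transcript`, `build_masked_example`, and the dictionary field names below are all hypothetical and chosen only for illustration. The sketch pairs a masked transcript with both a text target and a semantic target, loosely mirroring the two tasks the SLU component is configured to perform.

```python
import random

# Hypothetical placeholder token; the claim does not name one.
MASK_TOKEN = "[MASK]"


def mask_transcript(words, mask_prob=0.15, rng=None):
    """Return a copy of `words` with at least one position replaced by MASK_TOKEN.

    `mask_prob` is an assumed per-word masking rate, not a value from the claim.
    """
    if not words:
        return []
    rng = rng or random.Random()
    # Pick positions to mask independently with probability `mask_prob`.
    indices = {i for i in range(len(words)) if rng.random() < mask_prob}
    if not indices:
        # Guarantee the "at least one word is masked" condition.
        indices.add(rng.randrange(len(words)))
    return [MASK_TOKEN if i in indices else w for i, w in enumerate(words)]


def build_masked_example(transcript, semantics, rng=None):
    """Build one hypothetical training example from an ASR transcript.

    The example keeps the unmasked words as a text target and the
    semantic annotation as a second target.
    """
    words = transcript.split()
    return {
        "masked_asr": mask_transcript(words, rng=rng),
        "target_text": words,
        "target_semantics": semantics,
    }


example = build_masked_example(
    "play some jazz music",
    {"intent": "PlayMusic", "genre": "jazz"},  # assumed annotation format
    rng=random.Random(0),
)
print(example["masked_asr"])
```

Selecting positions by index, rather than checking the output for the mask token, keeps the at-least-one guarantee valid even if a transcript happens to contain the token string itself.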