| CPC G10L 15/16 (2013.01) [G06F 40/47 (2020.01); G10L 15/04 (2013.01); G10L 15/063 (2013.01); G10L 15/28 (2013.01)] | 20 Claims |

|
1. A computer-implemented method for training a machine translation model, the method comprising:
performing, by one or more processors of a computing system, automatic speech recognition on input source audio to generate a system transcript;
aligning, by the one or more processors, a human transcript of the source audio to the system transcript, including projecting system segmentation onto the human transcript;
performing, by the one or more processors, segment robustness training of the machine translation model according to the aligned human and system transcripts; and
performing, by the one or more processors, system robustness training of the machine translation model, including injecting token errors into training data.
|
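The data-preparation steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and everything in it is an assumption made for illustration: the function names `project_segmentation` and `inject_token_errors`, the use of Python's `difflib.SequenceMatcher` as the token aligner, the even split between deletions, substitutions and insertions, and the toy vocabulary. The claim itself does not specify an alignment algorithm or an error model.

```python
import random
from difflib import SequenceMatcher


def project_segmentation(human_tokens, system_segments):
    """Split human_tokens at the points that correspond to the system
    (ASR) segment boundaries. Hypothetical helper; the aligner is an assumption."""
    if not system_segments:
        return []
    system_tokens = [tok for seg in system_segments for tok in seg]
    opcodes = SequenceMatcher(
        a=system_tokens, b=human_tokens, autojunk=False
    ).get_opcodes()

    def to_human(pos):
        # Map a boundary position in the system token stream to a position
        # in the human token stream, clamped to the end of the aligned block.
        for _tag, i1, i2, j1, j2 in opcodes:
            if i1 <= pos <= i2:
                return min(j1 + (pos - i1), j2)
        return len(human_tokens)

    cuts, pos = [0], 0
    for seg in system_segments[:-1]:
        pos += len(seg)
        cuts.append(to_human(pos))
    cuts.append(len(human_tokens))
    return [human_tokens[a:b] for a, b in zip(cuts, cuts[1:])]


def inject_token_errors(tokens, rate, vocab, rng):
    """Corrupt a token list with random deletions, substitutions and
    insertions. The error model here is assumed, not taken from the claim."""
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < rate / 3:
            continue                                # deletion
        elif r < 2 * rate / 3:
            noisy.append(rng.choice(vocab))         # substitution
        elif r < rate:
            noisy.extend([tok, rng.choice(vocab)])  # insertion after tok
        else:
            noisy.append(tok)                       # token kept unchanged
    return noisy


# Toy data: the system transcript contains one recognition error ("word").
system_segments = [["hello", "word"], ["how", "are", "you"]]
human_tokens = ["hello", "world", "how", "are", "you", "today"]

human_segments = project_segmentation(human_tokens, system_segments)
# human_segments == [["hello", "world"], ["how", "are", "you", "today"]]

rng = random.Random(0)
noisy_segments = [
    inject_token_errors(seg, rate=0.3, vocab=["uh", "the", "a"], rng=rng)
    for seg in human_segments
]
```

In this sketch, the human transcript re-cut along system boundaries stands in for the aligned data used for segment robustness training, and the corrupted segments stand in for training data with injected token errors. How either is fed to the machine translation model is outside the scope of the example.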