US 12,406,659 B2
Inverted projection for robust speech translation
Dirk Ryan Padfield, Seattle, WA (US); and Colin Andrew Cherry, Montreal (CA)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Jul. 7, 2022, as Appl. No. 17/859,146.
Claims priority of provisional application 63/224,902, filed on Jul. 23, 2021.
Prior Publication US 2023/0021824 A1, Jan. 26, 2023
Int. Cl. G06F 17/21 (2006.01); G06F 40/47 (2020.01); G10L 15/04 (2013.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G10L 15/28 (2013.01)
CPC G10L 15/16 (2013.01) [G06F 40/47 (2020.01); G10L 15/04 (2013.01); G10L 15/063 (2013.01); G10L 15/28 (2013.01)]
20 Claims
 
1. A computer-implemented method for training a machine translation model, the method comprising:
performing, by one or more processors of a computing system, automatic speech recognition on input source audio to generate a system transcript;
aligning, by the one or more processors, a human transcript of the source audio to the system transcript, including projecting system segmentation onto the human transcript;
performing, by the one or more processors, segment robustness training of the machine translation model according to the aligned human and system transcripts; and
performing, by the one or more processors, system robustness training of the machine translation model, including injecting token errors into training data.
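For illustration only, the alignment and error-injection steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names project_segmentation and inject_token_errors are hypothetical, the alignment uses Python's standard difflib.SequenceMatcher as a stand-in for whatever aligner an actual embodiment uses, and the error types (substitution, deletion, insertion), the vocabulary, and the error rate are assumptions that the claim does not specify.

```python
import difflib
import random


def project_segmentation(system_segments, human_tokens):
    """Project system (ASR) segment boundaries onto a human transcript.

    system_segments: list of token lists, one per system segment.
    human_tokens: flat token list of the human transcript.
    Returns the human tokens regrouped under the system segmentation.
    Assumption: token-level alignment via difflib stands in for the aligner.
    """
    system_tokens = [tok for seg in system_segments for tok in seg]
    opcodes = difflib.SequenceMatcher(
        a=system_tokens, b=human_tokens, autojunk=False
    ).get_opcodes()

    def to_human(boundary):
        # Map a boundary position in the system token stream to a position
        # in the human token stream, using the first aligned block that
        # contains it. Insert-only blocks (i1 == i2) are skipped.
        for _tag, i1, i2, j1, j2 in opcodes:
            if i1 < i2 and i1 <= boundary <= i2:
                if boundary == i2:
                    return j2
                return min(j1 + (boundary - i1), j2)
        return len(human_tokens)

    projected, start, consumed = [], 0, 0
    for seg in system_segments[:-1]:
        consumed += len(seg)
        end = max(start, to_human(consumed))  # keep boundaries monotonic
        projected.append(human_tokens[start:end])
        start = end
    projected.append(human_tokens[start:])  # last segment takes the rest
    return projected


def inject_token_errors(tokens, vocab, rate, rng):
    """Return a noisy copy of tokens with random token-level errors.

    Assumption: each token is corrupted with probability `rate` by a
    uniformly chosen substitution, deletion, or insertion.
    """
    noisy = []
    for tok in tokens:
        if rng.random() < rate:
            op = rng.choice(("substitute", "delete", "insert"))
            if op == "substitute":
                noisy.append(rng.choice(vocab))
            elif op == "insert":
                noisy.extend([tok, rng.choice(vocab)])
            # "delete": drop the token
        else:
            noisy.append(tok)
    return noisy


# Example: the system transcript misrecognizes "to" as "too" and splits
# the utterance after it; the human transcript inherits that split.
system = [["i", "want", "too"], ["go", "home", "now"]]
human = "i want to go home now".split()
aligned = project_segmentation(system, human)
# aligned == [["i", "want", "to"], ["go", "home", "now"]]

noisy = inject_token_errors(human, ["uh", "the"], 0.2, random.Random(0))
```

In this sketch the regrouped human segments would serve as clean source text carrying system-like segmentation for segment robustness training, while the noisy copy would serve as corrupted training input for system robustness training. How either is paired with target translations is outside what the sketch covers.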