US 11,900,069 B2
Translation model training method, sentence translation method, device, and storage medium
Yong Cheng, Shenzhen (CN); Zhaopeng Tu, Shenzhen (CN); Fandong Meng, Shenzhen (CN); Junjie Zhai, Shenzhen (CN); and Yang Liu, Shenzhen (CN)
Assigned to TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed by TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed on Aug. 7, 2020, as Appl. No. 16/987,565.
Application 16/987,565 is a continuation of application No. PCT/CN2019/080411, filed on Mar. 29, 2019.
Claims priority of application No. 201810445783.2 (CN), filed on May 10, 2018.
Prior Publication US 2020/0364412 A1, Nov. 19, 2020
Int. Cl. G06F 40/44 (2020.01); G06N 3/08 (2023.01); G06F 40/40 (2020.01); G06F 40/30 (2020.01); G06F 9/30 (2018.01); G06F 18/214 (2023.01)
CPC G06F 40/44 (2020.01) [G06F 9/30196 (2013.01); G06F 18/214 (2023.01); G06F 40/30 (2020.01); G06F 40/40 (2020.01); G06N 3/08 (2013.01)] 15 Claims
OG exemplary drawing
 
1. A translation model training method for a computer device, comprising:
obtaining a training sample set, the training sample set including a plurality of training samples, wherein each training sample is a training sample pair having a training input sample in a first language and a training output sample in a second language;
determining a disturbance sample set corresponding to each training sample in the training sample set, the disturbance sample set comprising at least one disturbance sample, and a semantic similarity between the disturbance sample and the corresponding training sample being greater than a first preset value, wherein the disturbance sample set includes: a disturbance input sample set corresponding to each training input sample, and a disturbance output sample which is the same as the training output sample; and
training an initial translation model by using the plurality of training samples and the disturbance sample set corresponding to each training sample to obtain a target translation model, wherein the initial translation model comprises:
an encoder configured to receive the training input sample from the training sample set and a corresponding disturbance input sample from the disturbance sample set, and output a first intermediate expressed result and a second intermediate expressed result, the first intermediate expressed result being an intermediate expressed result of the training input sample, and the second intermediate expressed result being an intermediate expressed result of the corresponding disturbance input sample;
a classifier configured to distinguish the first intermediate expressed result from the second intermediate expressed result; and
a decoder configured to output the training output sample according to the first intermediate expressed result and to output the training output sample according to the second intermediate expressed result.
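
Claim 1 describes an adversarial training setup: the encoder maps a training input sample and its disturbance input sample to two intermediate expressed results, a classifier is trained to tell the two apart, and the encoder and decoder are trained so that both results decode to the same training output sample while becoming hard for the classifier to distinguish. The following is a minimal PyTorch sketch of one such loop; the GRU encoder/decoder, network sizes, mean-pooled representations, loss weighting, and toy batch are illustrative assumptions rather than the patented implementation.

    import torch
    import torch.nn as nn

    VOCAB, EMB, HID = 1000, 64, 128  # assumed toy sizes

    class Encoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB, EMB)
            self.rnn = nn.GRU(EMB, HID, batch_first=True)

        def forward(self, x):                      # x: (batch, src_len) token ids
            out, _ = self.rnn(self.emb(x))
            return out                             # "intermediate expressed result"

    class Decoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB, EMB)
            self.rnn = nn.GRU(EMB, HID, batch_first=True)
            self.out = nn.Linear(HID, VOCAB)

        def forward(self, y_in, h0):               # teacher forcing on shifted target
            out, _ = self.rnn(self.emb(y_in), h0)
            return self.out(out)                   # (batch, tgt_len, VOCAB)

    encoder, decoder = Encoder(), Decoder()
    classifier = nn.Sequential(nn.Linear(HID, HID), nn.ReLU(), nn.Linear(HID, 1))
    ce, bce = nn.CrossEntropyLoss(), nn.BCEWithLogitsLoss()
    opt_model = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
    opt_cls = torch.optim.Adam(classifier.parameters(), lr=1e-3)

    # Toy batch: x is a training input sample, x_disturb its disturbance input
    # sample (e.g. a near-paraphrase), y the shared training output sample.
    x         = torch.randint(0, VOCAB, (8, 10))
    x_disturb = torch.randint(0, VOCAB, (8, 10))
    y         = torch.randint(0, VOCAB, (8, 12))
    y_in, y_gold = y[:, :-1], y[:, 1:]

    for step in range(3):
        # (1) Train the classifier to distinguish the two intermediate results.
        h_clean = encoder(x).mean(dim=1)
        h_disturb = encoder(x_disturb).mean(dim=1)
        d_loss = bce(classifier(h_clean.detach()), torch.ones(8, 1)) \
               + bce(classifier(h_disturb.detach()), torch.zeros(8, 1))
        opt_cls.zero_grad(); d_loss.backward(); opt_cls.step()

        # (2) Train encoder/decoder: decode the same output from both inputs,
        #     and push the disturbed representation to fool the classifier.
        enc_clean, enc_disturb = encoder(x), encoder(x_disturb)
        logits_clean = decoder(y_in, enc_clean.mean(dim=1).unsqueeze(0))
        logits_disturb = decoder(y_in, enc_disturb.mean(dim=1).unsqueeze(0))
        trans_loss = ce(logits_clean.reshape(-1, VOCAB), y_gold.reshape(-1)) \
                   + ce(logits_disturb.reshape(-1, VOCAB), y_gold.reshape(-1))
        adv_loss = bce(classifier(enc_disturb.mean(dim=1)), torch.ones(8, 1))
        loss = trans_loss + adv_loss
        opt_model.zero_grad(); loss.backward(); opt_model.step()

In this sketch the translation losses realize the decoder outputting the same training output sample from both intermediate results, while the classifier/adversarial terms realize the classifier distinguishing the two results and the encoder being pushed to make them indistinguishable.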