US 11,966,711 B2
Translation verification and correction
Guang Ming Zhang, Beijing (CN); Xiaoyang Yang, San Francisco, CA (US); Hong Wei Jia, Beijing (CN); Mo Chi Liu, Beijing (CN); and Yun Wang, Beijing (CN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on May 18, 2021, as Appl. No. 17/323,270.
Prior Publication US 2022/0374614 A1, Nov. 24, 2022
Int. Cl. G06F 40/58 (2020.01); G06F 40/51 (2020.01); G06N 3/08 (2023.01)
CPC G06F 40/58 (2020.01) [G06F 40/51 (2020.01); G06N 3/08 (2013.01)]
20 Claims
 
1. A computer-implemented method comprising:
obtaining, by one or more processors, a plurality of groups of training data, wherein each of the plurality of groups of training data comprises a plurality of words in a source language and translations of the plurality of words in a target language;
generating, by one or more processors, a plurality of data sets from the plurality of groups of training data, wherein the plurality of data sets comprises a first data set in the source language, a second data set in the source and target languages and a third data set in the target language;
training, by one or more processors, a neural network based on the plurality of data sets for determining an association degree among a group of words in the source or target language;
obtaining, by one or more processors, a group of words in the source language and translations of the group of words in the target language;
converting, by one or more processors, the group of words in the source language into a first word vector and inputting the first word vector to the trained neural network to generate a first vector indicating an association degree among the group of words in the source language;
converting, by one or more processors, the translations of the group of words in the target language into a second word vector and inputting the second word vector to the trained neural network to generate a second vector indicating an association degree among the translations of the group of words in the target language;
using the first vector and the second vector, determining, by one or more processors, a distance between the group of words in the source language and the translations of the group of words in the target language, wherein the distance is equal to a square of a difference between the first vector and the second vector;
in response to the distance between the group of words in the source language and the translations of the group of words in the target language exceeding a predetermined threshold, determining, by one or more processors, there is an error in the translations of the group of words; and
in response to determining that an error occurs in the translations of the group of words, generating, by one or more processors, a correct translation from a plurality of candidate translations determined using the trained neural network.
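
The distance and threshold steps recited in claim 1 can be illustrated with a minimal sketch. It makes several assumptions that the claim does not state: the "square of a difference" is read here as the sum of squared component differences (squared Euclidean distance), the two vectors are taken as already produced by the trained neural network, and the correction rule (choose the candidate whose vector lies closest to the source vector) is hypothetical. All function names and numeric values are illustrative only.

```python
from typing import Mapping, Sequence

Vector = Sequence[float]


def squared_distance(first: Vector, second: Vector) -> float:
    """Sum of squared component differences between two vectors.

    Assumption: this is one reading of "a square of a difference
    between the first vector and the second vector".
    """
    if len(first) != len(second):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(first, second))


def has_translation_error(first: Vector, second: Vector, threshold: float) -> bool:
    """Flag an error when the distance exceeds the predetermined threshold."""
    return squared_distance(first, second) > threshold


def pick_correction(source_vec: Vector, candidates: Mapping[str, Vector]) -> str:
    """Hypothetical selection rule: return the candidate translation whose
    vector is closest to the source vector. The claim only says a correct
    translation is generated from candidates; it does not fix this rule."""
    return min(candidates, key=lambda text: squared_distance(source_vec, candidates[text]))


# Illustrative vectors standing in for the network outputs.
source_vec = [0.5, 0.25]        # "first vector" (source-language group)
translation_vec = [0.5, 1.25]   # "second vector" (target-language translations)

if has_translation_error(source_vec, translation_vec, threshold=0.5):
    candidates = {"candidate A": [0.5, 1.25], "candidate B": [0.5, 0.5]}
    print(pick_correction(source_vec, candidates))
```

In this sketch the threshold is a plain parameter; how it is chosen, and how the candidate translations are produced, is left to the trained network described in the claim.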