US 12,443,806 B2
Machine translation using neural network models
Zhifeng Chen, Sunnyvale, CA (US); Macduff Richard Hughes, Los Gatos, CA (US); Yonghui Wu, Fremont, CA (US); Michael Schuster, Saratoga, CA (US); Xu Chen, San Francisco, CA (US); Llion Owen Jones, San Francisco, CA (US); Niki J. Parmar, Sunnyvale, CA (US); George Foster, Ottawa (CA); Orhan Firat, Mountain View, CA (US); Ankur Bapna, Sunnyvale, CA (US); Wolfgang Macherey, Sunnyvale, CA (US); and Melvin Jose Johnson Premkumar, Sunnyvale, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Sep. 28, 2023, as Appl. No. 18/374,071.
Application 18/374,071 is a continuation of application No. 17/459,041, filed on Aug. 27, 2021, granted, now 11,809,834.
Application 17/459,041 is a continuation of application No. 16/521,780, filed on Jul. 25, 2019, granted, now 11,138,392, issued on Oct. 5, 2021.
Claims priority of provisional application 62/703,518, filed on Jul. 26, 2018.
Prior Publication US 2024/0020491 A1, Jan. 18, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 40/58 (2020.01); G06N 3/08 (2023.01)

CPC G06F 40/58 (2020.01) [G06N 3/08 (2013.01)]

18 Claims

1. A computer-implemented method for performing machine translation of text from a first language to a second language, the method comprising:

generating, by one or more processors, a set of encoding vectors from feature vectors representing characteristics of a text segment in the first language, by:

processing one or more first portions of the feature vectors with a first neural network and a first neural network topology; and

processing one or more second portions of the feature vectors with the first neural network and a second neural network topology that is different from the first neural network topology, each encoding vector of the set having a predetermined number of values;

generating, by the one or more processors, context vectors for different subsets of each encoding vector based on a group of parameters;

generating, by the one or more processors, a sequence of output vectors using a second neural network that receives the context vectors, the sequence of output vectors representing distributions over language elements of the second language; and

determining, by the one or more processors, a translation of the text segment into the second language based on the sequence of output vectors.
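For illustration only, the last three steps of claim 1 can be sketched in plain Python. This is one possible reading, not the patented implementation. It assumes that the "different subsets of each encoding vector" are equal contiguous slices, as in multi-head attention. It also assumes unparameterized scaled dot-product attention in place of the claimed "group of parameters", and a single linear projection in place of the second neural network. The encoder step with its two topologies is not modeled, so random vectors stand in for the set of encoding vectors. The names `split_heads`, `context_vector`, `output_distribution`, `VOCAB`, and all sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def split_heads(vector, num_heads):
    """Split one vector into equal, contiguous subsets (one per head)."""
    if len(vector) % num_heads != 0:
        raise ValueError("vector length must be divisible by num_heads")
    size = len(vector) // num_heads
    return [vector[i * size:(i + 1) * size] for i in range(num_heads)]


def context_vector(encodings, query, num_heads):
    """Attend over the encodings separately for each subset of dimensions.

    Assumption: scaled dot-product attention without learned projections.
    """
    enc_heads = [split_heads(e, num_heads) for e in encodings]
    query_heads = split_heads(query, num_heads)
    context = []
    for h in range(num_heads):
        q = query_heads[h]
        scale = math.sqrt(len(q))
        # One score per source position, computed on this head's slice only.
        scores = [sum(a * b for a, b in zip(q, enc[h])) / scale
                  for enc in enc_heads]
        weights = softmax(scores)
        # Weighted sum of this head's slice of every encoding vector.
        for d in range(len(q)):
            context.append(sum(w * enc[h][d]
                               for w, enc in zip(weights, enc_heads)))
    return context


def output_distribution(context, projection):
    """Map a context vector to a distribution over target-language elements."""
    logits = [sum(w * c for w, c in zip(row, context)) for row in projection]
    return softmax(logits)


# Hypothetical toy setup: 3 source positions, 8 values per encoding vector.
rng = random.Random(0)
DIM, HEADS = 8, 4
VOCAB = ["<eos>", "hallo", "welt"]  # hypothetical target-language elements
encodings = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(3)]
query = [rng.uniform(-1, 1) for _ in range(DIM)]  # stand-in decoder state
projection = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in VOCAB]

ctx = context_vector(encodings, query, HEADS)
dist = output_distribution(ctx, projection)
# Greedy choice for a single output step; a full decoder would repeat this.
token = VOCAB[max(range(len(dist)), key=dist.__getitem__)]
print(token)
```

Each head sees only its own slice of every encoding vector, so different heads can weight the source positions differently. A real system would repeat the output step to build the full sequence of output vectors. It would also typically search over that sequence (for example with beam search) rather than take the greedy choice shown here.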