US 12,086,715 B2
Generating neural network outputs using insertion commands
William Chan, Toronto (CA); Mitchell Thomas Stern, Berkeley, CA (US); Nikita Kitaev, Berkeley, CA (US); Kelvin Gu, Mountain View, CA (US); and Jakob D. Uszkoreit, Berlin (DE)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on May 22, 2023, as Appl. No. 18/321,696.
Application 18/321,696 is a continuation of application No. 16/883,772, filed on May 26, 2020, granted, now Pat. No. 11,657,277.
Claims priority of provisional application 62/852,301, filed on May 23, 2019.
Prior Publication US 2024/0028893 A1, Jan. 25, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 40/30 (2020.01); G06F 40/237 (2020.01); G06N 3/04 (2023.01); G06N 3/08 (2023.01); G06N 3/084 (2023.01)
CPC G06N 3/08 (2013.01) [G06F 40/237 (2020.01); G06N 3/04 (2013.01); G06N 3/084 (2013.01)]
21 Claims
 
1. A method performed by one or more computers, the method comprising:
receiving a system input that includes one or more source elements from a source sequence and zero or more target elements from a target sequence, wherein each source element in the source sequence is selected from a vocabulary of source elements and wherein each target element in the target sequence is selected from a vocabulary of target elements;
generating a partial concatenated sequence that includes the one or more source elements from the source sequence and the zero or more target elements from the target sequence, wherein the source and target elements are arranged in the partial concatenated sequence according to a combined order; and
generating a final concatenated sequence that includes a finalized source sequence and a finalized target sequence, wherein the finalized target sequence includes one or more target elements, and wherein the generating comprises, at each of a plurality of generation time steps:
generating, using a sequence modeling neural network conditioned on the partial concatenated sequence, a network output that defines, for each of a plurality of insertion locations, a respective score distribution over a combined vocabulary that includes source elements and target elements, wherein each insertion location is a different new location in the combined order at which there is no element in the partial concatenated sequence;
selecting, using the network output, one or more of the insertion locations and, for each selected insertion location, a first element from the combined vocabulary; and
updating the partial concatenated sequence to include, for each selected insertion location, the first element selected for the selected insertion location inserted at the corresponding new location in the combined order.
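
The generation loop recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names insertion_decode and make_toy_scorer, the "<stop>" element, and the max_steps limit are all hypothetical; the claim itself does not specify a termination rule. A hand-written scoring function stands in for the sequence modeling neural network, its scores are unnormalized rather than a true distribution, and exactly one insertion is selected per step, although the claim also covers selecting several insertion locations in the same step.

```python
from typing import Callable, Dict, List, Tuple

# (insertion slot, element) -> score. Slot i means "insert before index i",
# so a sequence of length n has n + 1 candidate insertion locations.
Scores = Dict[Tuple[int, str], float]


def insertion_decode(
    partial: List[str],
    score_fn: Callable[[List[str]], Scores],
    stop_token: str = "<stop>",  # assumption: termination is not part of claim 1
    max_steps: int = 32,         # assumption: safety bound for the sketch
) -> List[str]:
    """Greedy decoding with one insertion per generation time step."""
    seq = list(partial)  # copy so the caller's partial sequence is not mutated
    for _ in range(max_steps):
        # Stand-in for the network output conditioned on the partial sequence.
        scores = score_fn(seq)
        # Select the best (location, element) pair over the combined vocabulary.
        (slot, element), _ = max(scores.items(), key=lambda kv: kv[1])
        if element == stop_token:
            break
        # Update the partial sequence by inserting at the selected new location.
        seq.insert(slot, element)
    return seq


def make_toy_scorer(
    goal: List[str], vocab: List[str], stop_token: str = "<stop>"
) -> Callable[[List[str]], Scores]:
    """Hypothetical scorer that favors insertions leading toward `goal`."""

    def is_subsequence(candidate: List[str]) -> bool:
        remaining = iter(goal)
        return all(tok in remaining for tok in candidate)

    def score_fn(seq: List[str]) -> Scores:
        scores: Scores = {}
        for slot in range(len(seq) + 1):
            for element in vocab:
                candidate = seq[:slot] + [element] + seq[slot:]
                scores[(slot, element)] = 1.0 if is_subsequence(candidate) else 0.0
            # Stopping outranks any insertion that would leave the goal.
            scores[(slot, stop_token)] = 0.5
        return scores

    return score_fn


# Combined vocabulary of made-up source ("s*") and target ("t*") elements.
combined_vocab = ["s1", "s2", "t1", "t2"]
scorer = make_toy_scorer(["s1", "s2", "t1", "t2"], combined_vocab)

# System input: one source element and zero target elements.
print(insertion_decode(["s1"], scorer))  # ['s1', 's2', 't1', 't2']
```

Because every candidate element comes from the combined vocabulary, the same loop completes both the source portion and the target portion of the concatenated sequence, which is the behavior the claim describes.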