US 12,333,436 B2
Augmenting machine learning language models using search engine results
Angeliki Lazaridou, London (GB); Elena Gribovskaya, London (GB); Nikolai Grigorev, London (GB); and Wojciech Jan Stokowiec, London (GB)
Assigned to DeepMind Technologies Limited, London (GB)
Filed by DeepMind Technologies Limited, London (GB)
Filed on Apr. 30, 2024, as Appl. No. 18/651,384.
Application 18/651,384 is a continuation of application No. 18/104,210, filed on Jan. 31, 2023, granted, now Pat. No. 12,008,473.
Claims priority of application No. 20220100089 (GR), filed on Jan. 31, 2022.
Prior Publication US 2024/0281659 A1, Aug. 22, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/08 (2023.01); G06F 16/953 (2019.01); G06N 3/02 (2006.01); G06N 5/04 (2023.01); G06N 20/00 (2019.01)
CPC G06N 3/08 (2013.01) [G06F 16/953 (2019.01); G06N 3/02 (2013.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01)] 20 Claims
OG exemplary drawing
 
1. A method performed by one or more computers, the method comprising:
obtaining question data representing a question;
generating, from the question data, a search engine query for a search engine;
providing the search engine query to the search engine;
obtaining, in response to providing the search engine query to the search engine, a plurality of documents;
generating, from the plurality of documents, a plurality of conditioning inputs each representing at least a portion of one or more of the plurality of documents;
generating, using a language model neural network that has been trained on a language modeling objective for a language modeling task with training data comprising text, a plurality of network outputs that each represents a respective candidate answer to the question by, for each of a plurality of the conditioning inputs, processing a respective network input comprising a respective input sequence of text tokens generated from (i) the question data and (ii) the conditioning input using the language model neural network to generate a network output representing a candidate answer to the question and comprising an output sequence of text tokens,
wherein the language model neural network auto-regressively generates the output sequence by generating each particular text token in the output sequence conditioned on a current input sequence that includes the respective input sequence and any text tokens that precede the particular text token in the output sequence;
generating, from the plurality of network outputs representing the candidate answers, answer data representing a final answer to the question; and
providing the answer data.
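
The following is a minimal, non-authoritative sketch in Python of the pipeline recited in claim 1 above. The callables search_fn and lm_generate_fn are hypothetical stand-ins for a search engine interface and a trained autoregressive language model; the verbatim query generation, fixed-size chunking, and plurality-vote aggregation are illustrative assumptions, not the patented implementation.

# Illustrative sketch (not the patented implementation) of the claimed pipeline:
# question -> search engine query -> retrieved documents -> conditioning inputs
# -> candidate answers from an autoregressive LM -> final answer.
# search_fn and lm_generate_fn are hypothetical callables standing in for a
# search engine API and a trained language model neural network.

from collections import Counter
from typing import Callable, List


def answer_question(
    question: str,
    search_fn: Callable[[str], List[str]],   # query -> list of document texts
    lm_generate_fn: Callable[[str], str],    # prompt -> generated answer text
    num_docs: int = 5,
    chunk_tokens: int = 128,
) -> str:
    """Returns a final answer by conditioning the LM on retrieved documents."""
    # (1) Generate a search engine query from the question data
    # (simplest assumption: use the question text verbatim).
    query = question

    # (2)-(3) Provide the query to the search engine and obtain documents.
    documents = search_fn(query)[:num_docs]

    # (4) Build conditioning inputs, each representing a portion of a document.
    conditioning_inputs = []
    for doc in documents:
        tokens = doc.split()
        for start in range(0, len(tokens), chunk_tokens):
            conditioning_inputs.append(" ".join(tokens[start:start + chunk_tokens]))

    # (5) For each conditioning input, form an input sequence from the question
    # and the conditioning input, and let the LM generate a candidate answer
    # auto-regressively (each token conditioned on the growing sequence).
    candidates = []
    for context in conditioning_inputs:
        prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
        candidates.append(lm_generate_fn(prompt).strip())

    # (6) Combine the candidate answers into a final answer; here, a simple
    # plurality vote over candidate strings (one of many possible choices).
    if not candidates:
        return ""
    final_answer, _ = Counter(candidates).most_common(1)[0]
    return final_answer

Other aggregation strategies, such as scoring candidates by the language model's own token probabilities, fit the same structure without changing the retrieve-then-condition flow.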