US 12,008,473 B2
Augmenting machine learning language models using search engine results
Angeliki Lazaridou, London (GB); Elena Gribovskaya, London (GB); Nikolai Grigorev, London (GB); and Wojciech Jan Stokowiec, London (GB)
Assigned to DeepMind Technologies Limited, London (GB)
Filed by DeepMind Technologies Limited, London (GB)
Filed on Jan. 31, 2023, as Appl. No. 18/104,210.
Claims priority of application No. 20220100089 (GR), filed on Jan. 31, 2022.
Prior Publication US 2023/0244934 A1, Aug. 3, 2023
Int. Cl. G06N 3/08 (2023.01); G06F 16/953 (2019.01); G06N 20/00 (2019.01)

CPC G06N 3/08 (2013.01) [G06F 16/953 (2019.01); G06N 20/00 (2019.01)]

20 Claims

1. A method performed by one or more computers, the method comprising:

obtaining question data representing a question from a user;

generating, from the question data, a search engine query for an Internet search engine;

providing the search engine query to the Internet search engine;

receiving a set of search results from the Internet search engine, wherein each search result identifies a respective document;

identifying a plurality of documents from the respective documents identified by the search results;

generating, from the plurality of documents, a plurality of conditioning inputs each representing at least a portion of one or more of the identified documents;

generating, using a language model neural network that has been trained on a language modeling objective for a language modeling task with training data comprising text, a plurality of network outputs that each represents a respective candidate answer to the question by, for each of a plurality of the conditioning inputs, processing a respective network input comprising a respective input sequence of text tokens generated from (i) the question data and (ii) the conditioning input using the language model neural network to generate a network output representing a candidate answer to the question and comprising an output sequence of text tokens, wherein the language model neural network auto-regressively generates the output sequence by generating each particular text token in the output sequence conditioned on a current input sequence that includes the respective input sequence and any text tokens that precede the particular text token in the output sequence;

generating, from the plurality of network outputs representing the respective candidate answers, answer data representing a final answer to the question; and

providing the answer data to the user.
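
The steps of claim 1 can be illustrated with a minimal sketch, offered only as one possible reading and not as the claimed implementation. Every name here (make_query, make_conditioning_inputs, generate, answer_question, the "Q:"/"A:" prompt markers, the "<eos>" stop token) is hypothetical. The search engine and the language model are passed in as plain callables, documents are split into fixed-size word chunks, and the final answer is chosen by majority vote over candidates; the claim does not specify any of these choices.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical stand-ins: the claim names no concrete search or model API.
SearchFn = Callable[[str], List[str]]         # query -> texts of result documents
NextTokenFn = Callable[[Sequence[str]], str]  # current input sequence -> next token

EOS = "<eos>"  # assumed end-of-sequence marker


def make_query(question: str) -> str:
    """Generate a search engine query from the question (assumed: use it verbatim)."""
    return question.strip()


def make_conditioning_inputs(documents: List[str], chunk_words: int = 50) -> List[str]:
    """Split each document into chunks; each chunk is one conditioning input."""
    chunks = []
    for doc in documents:
        words = doc.split()
        for i in range(0, len(words), chunk_words):
            chunks.append(" ".join(words[i:i + chunk_words]))
    return chunks


def generate(next_token: NextTokenFn, input_tokens: Sequence[str],
             max_new_tokens: int = 32) -> List[str]:
    """Auto-regressive decoding: each new token is conditioned on the input
    sequence plus all output tokens generated before it."""
    output: List[str] = []
    for _ in range(max_new_tokens):
        token = next_token(list(input_tokens) + output)
        if token == EOS:
            break
        output.append(token)
    return output


def answer_question(question: str, search: SearchFn, next_token: NextTokenFn,
                    max_docs: int = 3) -> str:
    query = make_query(question)
    documents = search(query)[:max_docs]  # identify a plurality of documents
    candidates = []
    for cond in make_conditioning_inputs(documents):
        # Network input built from (i) the question and (ii) the conditioning input.
        input_tokens = cond.split() + ["Q:"] + question.split() + ["A:"]
        candidates.append(" ".join(generate(next_token, input_tokens)))
    if not candidates:
        return ""
    # Assumed aggregation rule: the most frequent candidate becomes the final answer.
    return Counter(candidates).most_common(1)[0][0]


# Toy stand-ins so the sketch runs end to end; not a real search engine or model.
def toy_search(query: str) -> List[str]:
    return ["Paris is the capital of France",
            "France capital city is Paris",
            "Berlin is in Germany"]


def toy_next_token(seq: Sequence[str]) -> str:
    generated = list(seq)[list(seq).index("A:") + 1:]
    if generated:
        return EOS  # stop after one answer token
    return "Paris" if "Paris" in seq else "unknown"


final_answer = answer_question("What is the capital of France?",
                               toy_search, toy_next_token)
```

With the toy stand-ins, two of the three conditioning inputs yield the candidate "Paris" and one yields "unknown", so the vote selects "Paris". A real system would replace toy_search with calls to an Internet search engine and toy_next_token with a trained language model neural network, and could score or rerank candidates rather than vote.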