| CPC G06N 3/08 (2013.01) [G06F 16/953 (2019.01); G06N 3/02 (2013.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01)] | 20 Claims |

|
1. A method performed by one or more computers, the method comprising:
obtaining question data representing a question;
generating, from the question data, a search engine query for a search engine;
providing the search engine query to the search engine;
obtaining, in response to providing the search engine query to the search engine, a plurality of documents;
generating, from the plurality of documents, a plurality of conditioning inputs each representing at least a portion of one or more of the plurality of documents;
generating, using a language model neural network that has been trained on a language modeling objective for a language modeling task with training data comprising text, a plurality of network outputs that each represents a respective candidate answer to the question by, for each of a plurality of the conditioning inputs, processing a respective network input comprising a respective input sequence of text tokens generated from (i) the question data and (ii) the conditioning input using the language model neural network to generate a network output representing a candidate answer to the question and comprising an output sequence of text tokens, wherein the language model neural network auto-regressively generates the output sequence by generating each particular text token in the output sequence conditioned on a current input sequence that includes the respective input sequence and any text tokens that precede the particular text token in the output sequence;
generating, from the plurality of network outputs representing the candidate answers, answer data representing a final answer to the question; and
providing the answer data.
|
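The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. Every name in it (`make_query`, `make_conditioning_inputs`, `generate`, `answer_question`, `toy_search`, `toy_next_token`) is hypothetical, and the search engine and the language model are replaced by toy stand-in functions. The claim does not specify how the query is formed, how documents are split into conditioning inputs, or how the final answer is derived from the candidates. The sketch assumes, respectively, stripping the question mark, truncating each document, and a majority vote.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical stand-in types; a real system would call a search engine and
# a trained language model neural network here.
NextTokenFn = Callable[[Sequence[str]], str]  # current sequence -> next token
SearchFn = Callable[[str], List[str]]         # query -> documents

EOS = "<eos>"  # assumed end-of-sequence marker


def make_query(question: str) -> str:
    # Assumption: the query is the question without its trailing "?".
    return question.strip().rstrip("?").strip()


def make_conditioning_inputs(documents: List[str], max_words: int = 50) -> List[str]:
    # Assumption: each conditioning input is the leading portion of one document.
    return [" ".join(doc.split()[:max_words]) for doc in documents]


def generate(next_token: NextTokenFn, input_tokens: List[str], max_new: int = 20) -> List[str]:
    # Auto-regressive decoding: each new token is conditioned on the input
    # sequence plus all output tokens generated so far.
    output: List[str] = []
    for _ in range(max_new):
        token = next_token(input_tokens + output)
        if token == EOS:
            break
        output.append(token)
    return output


def answer_question(question: str, search: SearchFn, next_token: NextTokenFn) -> str:
    documents = search(make_query(question))
    candidates = []
    for cond in make_conditioning_inputs(documents):
        # Network input built from (i) the question and (ii) the conditioning input.
        input_tokens = cond.split() + ["Q:"] + question.split() + ["A:"]
        candidates.append(" ".join(generate(next_token, input_tokens)))
    if not candidates:
        return ""
    # Assumption: the final answer is the most frequent candidate answer.
    return Counter(candidates).most_common(1)[0][0]


# Toy stand-ins so the sketch runs end to end.
def toy_search(query: str) -> List[str]:
    return [
        "Paris is the capital of France.",
        "Paris hosts the French government.",
        "Lyon is a city in France.",
    ]


def toy_next_token(seq: Sequence[str]) -> str:
    # Toy "model": right after "A:", echo the first token of the passage, then stop.
    return seq[0] if seq[-1] == "A:" else EOS


final = answer_question("What is the capital of France?", toy_search, toy_next_token)
```

With these toy stand-ins, each of the three conditioning inputs yields one candidate answer, and the vote selects the candidate that appears most often. Swapping `toy_search` and `toy_next_token` for real components would leave the control flow unchanged.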