CPC G06N 3/08 (2013.01) [G06F 16/953 (2019.01); G06N 20/00 (2019.01)] | 20 Claims |
1. A method performed by one or more computers, the method comprising:
obtaining question data representing a question from a user;
generating, from the question data, a search engine query for an Internet search engine;
providing the search engine query to the Internet search engine;
receiving a set of search results from the Internet search engine, wherein each search result identifies a respective document;
identifying a plurality of documents from the respective documents identified by the search results;
generating, from the plurality of documents, a plurality of conditioning inputs each representing at least a portion of one or more of the identified documents;
generating, using a language model neural network that has been trained on a language modeling objective for a language modeling task with training data comprising text, a plurality of network outputs that each represents a respective candidate answer to the question by, for each of a plurality of the conditioning inputs, processing a respective network input comprising a respective input sequence of text tokens generated from (i) the question data and (ii) the conditioning input using the language model neural network to generate a network output representing a candidate answer to the question and comprising an output sequence of text tokens, wherein the language model neural network auto-regressively generates the output sequence by generating each particular text token in the output sequence conditioned on a current input sequence that includes the respective input sequence and any text tokens that precede the particular text token in the output sequence;
generating, from the plurality of network outputs representing the respective candidate answers, answer data representing a final answer to the question; and
providing the answer data to the user.
|
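The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. All names here are hypothetical: `answer_question`, `generate_autoregressively`, the `search` and `next_token` callables, the `"text"` field on each search result, the prompt layout, and the `<eos>` token. The claim does not say how the final answer is derived from the candidate answers, so the sketch assumes a simple majority vote. It also assumes that the question itself serves as the search query and that each conditioning input is one whole document.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

EOS = "<eos>"  # hypothetical end-of-sequence token


def generate_autoregressively(
    next_token: Callable[[Sequence[str]], str],
    input_tokens: Sequence[str],
    max_new_tokens: int = 32,
) -> List[str]:
    """Generate output tokens one at a time.

    Each token is conditioned on the current input sequence: the original
    input tokens followed by every output token generated so far.
    """
    output: List[str] = []
    for _ in range(max_new_tokens):
        current = list(input_tokens) + output
        token = next_token(current)
        if token == EOS:
            break
        output.append(token)
    return output


def answer_question(
    question: str,
    search: Callable[[str], List[Dict[str, str]]],
    next_token: Callable[[Sequence[str]], str],
    max_documents: int = 3,
) -> str:
    """Search, condition the model on each document, then aggregate."""
    # Assumption: the question is used directly as the search engine query.
    query = question
    results = search(query)

    # Identify a plurality of documents from the search results.
    documents = [result["text"] for result in results[:max_documents]]

    # Assumption: each conditioning input is one whole document.
    conditioning_inputs = documents

    # One candidate answer per conditioning input.
    candidates: List[str] = []
    for conditioning in conditioning_inputs:
        prompt = f"{conditioning} Question: {question} Answer:"
        input_tokens = prompt.split()  # whitespace tokens, for illustration
        output_tokens = generate_autoregressively(next_token, input_tokens)
        candidates.append(" ".join(output_tokens))

    if not candidates:
        return ""

    # Assumption: the final answer is the most frequent candidate.
    final_answer, _count = Counter(candidates).most_common(1)[0]
    return final_answer
```

In practice, `search` would wrap a call to an Internet search engine and `next_token` would wrap a trained language model neural network. Passing both in as callables keeps the sketch runnable with toy stand-ins.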