CPC H04L 51/02 (2013.01) [G06F 40/295 (2020.01); G06N 5/022 (2013.01)] | 20 Claims |
1. A computer-implemented method comprising:
receiving a natural-language sequence of words comprising multiple entities, wherein the entities comprise persons and places;
identifying a plurality of entities in the natural-language sequence;
generating a masked natural-language sequence by masking a first entity in the natural-language sequence;
generating a first representation of the masked natural-language sequence, wherein the first representation is a machine embedding of the masked natural-language sequence;
retrieving, from a knowledge base, information related to a second entity in the plurality of entities;
generating a second representation of the information; and
training a natural-language model to respond to a query, wherein the training uses the first representation of the masked natural-language sequence and the second representation of the information as inputs and the first entity as a training label.
|
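The steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify an entity recognizer, an embedding method, a knowledge-base format, or a model architecture, so everything below is an assumption made for illustration only: the gazetteer `ENTITY_TYPES`, the dictionary `KNOWLEDGE_BASE`, the hashed bag-of-words `embed` function, and the perceptron-style `train_step` are all hypothetical stand-ins, not the claimed implementation.

```python
import hashlib

DIM = 16          # assumed embedding width for this toy example
MASK = "[MASK]"   # assumed mask token

# Hypothetical gazetteer and knowledge base; a real system would likely use
# a trained entity recognizer and an external knowledge store.
ENTITY_TYPES = {"Paris": "place", "Alice": "person"}
KNOWLEDGE_BASE = {"Paris": "Paris is the capital of France"}


def identify_entities(words):
    """Return the words that the gazetteer lists as persons or places."""
    return [w for w in words if w in ENTITY_TYPES]


def mask_entity(words, entity):
    """Replace every occurrence of the chosen entity with the mask token."""
    return [MASK if w == entity else w for w in words]


def embed(words):
    """Toy hashed bag-of-words vector standing in for a machine embedding."""
    vec = [0.0] * DIM
    for w in words:
        idx = int(hashlib.md5(w.lower().encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    return vec


def build_training_example(sentence):
    """Mask the first entity, look up the second, and pair inputs with a label."""
    words = sentence.split()
    entities = identify_entities(words)
    first, second = entities[0], entities[1]
    masked = mask_entity(words, first)
    first_repr = embed(masked)                          # first representation
    info = KNOWLEDGE_BASE.get(second, "")               # knowledge-base lookup
    second_repr = embed(info.split())                   # second representation
    return {"masked": masked, "inputs": first_repr + second_repr, "label": first}


def train_step(weights, example, lr=0.1):
    """One multiclass perceptron update; returns the prediction made before it."""
    x, label = example["inputs"], example["label"]
    for candidate in ENTITY_TYPES:
        weights.setdefault(candidate, [0.0] * len(x))
    scores = {c: sum(wi * xi for wi, xi in zip(w, x)) for c, w in weights.items()}
    predicted = max(scores, key=scores.get)
    if predicted != label:
        # Move the correct label's weights toward x and the wrong one's away.
        weights[label] = [wi + lr * xi for wi, xi in zip(weights[label], x)]
        weights[predicted] = [wi - lr * xi for wi, xi in zip(weights[predicted], x)]
    return predicted


example = build_training_example("Alice visited Paris last spring")
weights = {}
train_step(weights, example)               # first pass may mispredict and update
prediction = train_step(weights, example)  # second pass uses the updated weights
```

In this sketch the masked entity ("Alice") serves as the training label, while the concatenated vector combines the representation of the masked sequence with the representation of the knowledge-base text about the second entity ("Paris"). A practical system would presumably replace the hashed vectors and the perceptron with learned encoders and a neural language model.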