CPC G06F 40/35 (2020.01) [G06F 40/117 (2020.01); G06F 40/295 (2020.01); G06N 3/08 (2013.01); G06N 20/00 (2019.01); H04L 51/02 (2013.01)] | 20 Claims |
1. A system, comprising:
a processor; and
a memory device that stores program code configured to be executed by the processor, the program code comprising:
an embedding generator configured to:
receive, via a user interface, a first sentence, an identification of a first named entity in the first sentence, and an entity type associated with the first named entity,
mask the first named entity identified via the user interface in the first sentence to generate a masked first sentence,
generate an embedding set that comprises a plurality of sentence embeddings generated from a plurality of tagged sentences for the entity type, the plurality of sentence embeddings including one or more sentence embeddings for at least part of the masked first sentence,
extract a candidate entity value from a second sentence received by a virtual agent,
mask the candidate entity value in the second sentence to generate a masked second sentence, and
generate a plurality of candidate embeddings for at least part of the masked second sentence, the plurality of candidate embeddings comprising:
a first candidate embedding for a first subset of terms of the masked second sentence that follow the masked candidate entity value in a forward order, and
a second candidate embedding for a second subset of terms of the masked second sentence that precede the masked candidate entity value in a reverse order;
an embedding comparer configured to:
compare each of the plurality of sentence embeddings in the embedding set with each of the plurality of candidate embeddings, and
assign a match score to each comparison to generate a match score set; and
an entity value extractor configured to:
identify a match score of the match score set that exceeds a similarity threshold, and
extract an entity value of the entity type from the second sentence associated with the identified match score.
|
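The data flow recited in claim 1 (masking, context embeddings on either side of the mask, pairwise comparison, thresholding) can be illustrated with a minimal Python sketch. It is not the claimed implementation, and everything below is an assumption made for illustration: the names `MASK`, `mask_entity`, `context_subsets`, `embed`, `cosine`, `build_embedding_set` and `extract_entity` are hypothetical, tokenization is plain whitespace splitting, the candidate entity value is supplied by the caller rather than extracted automatically, and a bag-of-words count vector with cosine similarity stands in for whatever learned sentence encoder and match score a real system would use. The toy embedding ignores term order, so the forward and reverse orderings are kept in the token lists only to mirror the claim language.

```python
import math
from collections import Counter

MASK = "[MASK]"  # hypothetical mask token, not taken from the claim


def mask_entity(sentence, entity):
    """Replace the first occurrence of the entity value with the mask token."""
    return sentence.replace(entity, MASK, 1)


def context_subsets(masked_sentence):
    """Split a masked sentence around the mask.

    Returns (terms following the mask in forward order,
             terms preceding the mask in reverse order).
    Assumes whitespace tokenization and a mask that stands alone as a token.
    """
    tokens = masked_sentence.split()
    i = tokens.index(MASK)
    return tokens[i + 1:], tokens[:i][::-1]


def embed(terms):
    """Toy embedding: bag-of-words counts (stand-in for a trained encoder)."""
    return Counter(t.lower() for t in terms)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_embedding_set(tagged_sentences):
    """Embed the masked context of each (sentence, entity) pair for one entity type."""
    embedding_set = []
    for sentence, entity in tagged_sentences:
        for terms in context_subsets(mask_entity(sentence, entity)):
            if terms:  # skip an empty side of the mask
                embedding_set.append(embed(terms))
    return embedding_set


def extract_entity(embedding_set, sentence, candidate, threshold=0.5):
    """Score every (sentence embedding, candidate embedding) pair.

    Returns (candidate or None, match score set). The candidate is returned
    only if at least one match score exceeds the similarity threshold.
    """
    subsets = context_subsets(mask_entity(sentence, candidate))
    candidate_embeddings = [embed(terms) for terms in subsets if terms]
    scores = [cosine(s, c) for s in embedding_set for c in candidate_embeddings]
    value = candidate if max(scores, default=0.0) > threshold else None
    return value, scores


# Usage: one tagged sentence for a hypothetical "city" entity type.
city_embeddings = build_embedding_set([("book a flight to Paris tomorrow", "Paris")])
value, scores = extract_entity(
    city_embeddings, "please book a flight to Berlin tonight", "Berlin"
)
rejected, _ = extract_entity(city_embeddings, "Berlin is cold in winter", "Berlin")
```

In this sketch the first utterance shares the preceding context "book a flight to" with the tagged sentence, so one comparison exceeds the assumed threshold of 0.5 and `value` is `"Berlin"`. The second utterance shares no context terms, so `rejected` is `None`. Embedding the two sides of the mask separately lets a match on either side qualify a candidate on its own.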