CPC G06F 40/284 (2020.01) [G06F 16/9538 (2019.01); G06F 40/295 (2020.01)]
20 Claims
1. A system comprising:
a processor; and
a memory comprising computer-readable instructions, the memory and the computer-readable instructions configured to, with the processor, implement a pre-trained interpreting text-based similarity (ITBS) model, to cause the processor to:
calculate a set of gradients representing a first unlabeled text-based paragraph describing a seed item and a second unlabeled text-based paragraph describing a recommended item predicted to be similar to the seed item, the set of gradients calculated with respect to a cosine similarity function applied on a set of feature vectors, the set of feature vectors comprising a first feature vector representing the first unlabeled text-based paragraph and a second feature vector representing the second unlabeled text-based paragraph, wherein the first unlabeled text-based paragraph and the second unlabeled text-based paragraph comprise an unlabeled paragraph pair;
generate contextualized embeddings based on the set of gradients and a similarity score measuring an affinity between the first unlabeled text-based paragraph and the second unlabeled text-based paragraph, wherein generating the contextualized embeddings includes:
tokenizing the first unlabeled text-based paragraph and the second unlabeled text-based paragraph;
generating a saliency score for each token in the first unlabeled text-based paragraph and for each token in the second unlabeled text-based paragraph, wherein the saliency score is associated with at least one word in an item description;
aggregating the token saliency scores of the first unlabeled text-based paragraph to generate word-scores for the first unlabeled text-based paragraph;
aggregating the token saliency scores of the second unlabeled text-based paragraph to generate word-scores for the second unlabeled text-based paragraph;
matching words from the first unlabeled text-based paragraph and the second unlabeled text-based paragraph based on the similarity score to generate a set of word-pairs, each word-pair in the set of word-pairs comprising a first word selected from the first unlabeled text-based paragraph matched to a second word selected from the second unlabeled text-based paragraph, wherein the first word and the second word have a similar semantic meaning; and
scoring each word-pair using the generated word-scores of the aggregated token saliency scores for both the first unlabeled text-based paragraph and the second unlabeled text-based paragraph to generate a word-pair score, the word-pair score indicating a degree of influence exerted by an individual word-pair on selection of the recommended item from a plurality of candidate items;
select a word-pair from the set of word-pairs based on the word-pair score and a threshold value; and
interpret, based on the selected word-pair, a recommendation generated by a recommendation model.
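
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed ITBS model. Everything below is an assumption made for illustration: the toy three-dimensional token embeddings, mean pooling to form each paragraph's feature vector, a gradient-times-input rule for token saliency, the "##" marker for sub-word continuation tokens, summation as the aggregation rule, nearest-neighbor cosine matching of words, and the product of the two word-scores as the word-pair score. All function names (cosine_grad, token_saliency, to_words, explain) are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine_grad(u, v):
    """Analytic gradient of cos(u, v) with respect to u."""
    nu, nv, c = norm(u), norm(v), cosine(u, v)
    return [vi / (nu * nv) - c * ui / (nu * nu) for ui, vi in zip(u, v)]

def token_saliency(token_embs, other_vec):
    """Assumed rule: |gradient . embedding| / n per token, under mean pooling."""
    grad = cosine_grad(mean(token_embs), other_vec)
    n = len(token_embs)
    return [abs(dot(grad, e)) / n for e in token_embs]

def to_words(tokens, embs, scores):
    """Merge '##' continuation tokens into words: sum scores, average embeddings."""
    words = []
    for tok, emb, score in zip(tokens, embs, scores):
        if tok.startswith("##") and words:
            words[-1]["text"] += tok[2:]
            words[-1]["embs"].append(emb)
            words[-1]["score"] += score
        else:
            words.append({"text": tok, "embs": [emb], "score": score})
    return [(w["text"], mean(w["embs"]), w["score"]) for w in words]

def explain(tokens_a, embs_a, tokens_b, embs_b, threshold):
    """Return (word_a, word_b, pair_score) tuples at or above the threshold."""
    vec_a, vec_b = mean(embs_a), mean(embs_b)
    words_a = to_words(tokens_a, embs_a, token_saliency(embs_a, vec_b))
    words_b = to_words(tokens_b, embs_b, token_saliency(embs_b, vec_a))
    pairs = []
    for text_a, emb_a, score_a in words_a:
        # Assumed matching rule: most cosine-similar word in the other paragraph.
        text_b, _, score_b = max(words_b, key=lambda w: cosine(emb_a, w[1]))
        pairs.append((text_a, text_b, score_a * score_b))
    selected = [p for p in pairs if p[2] >= threshold]
    return sorted(selected, key=lambda p: p[2], reverse=True)

# Toy data (hypothetical): seed item description and recommended item description.
tokens_a = ["space", "ad", "##venture", "film"]
embs_a = [[1.0, 0.1, 0.0], [0.2, 1.0, 0.1], [0.1, 0.9, 0.2], [0.0, 0.2, 1.0]]
tokens_b = ["galaxy", "quest", "movie"]
embs_b = [[0.9, 0.2, 0.1], [0.1, 1.0, 0.0], [0.1, 0.1, 0.9]]

result = explain(tokens_a, embs_a, tokens_b, embs_b, threshold=0.0)
for word_a, word_b, score in result:
    print(f"{word_a} <-> {word_b}: {score:.6f}")
```

With a threshold of 0.0 every matched word-pair is retained. Raising the threshold keeps only the pairs whose combined saliency is high, which corresponds to the selecting step that precedes interpretation. In a real system the toy vectors would be replaced by embeddings from a pre-trained language model, and the gradients would typically be obtained by automatic differentiation rather than the closed-form expression used here.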