| CPC G06F 16/9535 (2019.01) [G06F 16/951 (2019.01); G06F 16/9536 (2019.01)] | 14 Claims |

|
1. A method, comprising:
receiving, by a processing device from a user device, first data associated with a first user profile, the first data comprising a first set of ranked items and a first title having one or more keywords comprising a first topic keyword and a first sentiment keyword;
identifying, by the processing device, second data associated with a second user profile, the second data comprising a second set of ranked items and a second title having one or more keywords comprising a second topic keyword and a second sentiment keyword;
determining that the first title and the second title are within a threshold proximity based on a comparison between the first topic keyword and the second topic keyword, and further based on a comparison between the first sentiment keyword and the second sentiment keyword, wherein the comparison between the first topic keyword and the second topic keyword comprises:
providing the first topic keyword and the second topic keyword as inputs to a machine learning model, and
obtaining one or more outputs of the machine learning model indicating that the first topic keyword and the second topic keyword both correspond to the same topic cluster;
training, by the processing device, a second machine learning model for determining an overall similarity between a plurality of sets of ranked items, wherein the training uses a training dataset comprising similarity data of a plurality of training sets of ranked items, wherein the similarity data comprises a plurality of similarity metrics for respective pairs of training sets of the plurality of training sets of ranked items, and wherein the training further uses a training loss function that indicates an error based on user input identifying which training sets of ranked items are to be identified as the most similar;
determining, by the processing device, that the first set of ranked items is most similar to the second set of ranked items based on an application of the second machine learning model to the first set of ranked items and one or more additional sets of ranked items comprising the second set of ranked items, wherein the application of the second machine learning model comprises:
providing the first set of ranked items and the one or more additional sets of ranked items as inputs to the second machine learning model, and
obtaining one or more outputs of the second machine learning model indicating that the first set of ranked items is most similar to the second set of ranked items; and
responsive to determining that the first set of ranked items is most similar to the second set of ranked items, providing, by the processing device, the first set of ranked items for presentation on the user device.
|
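The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and everything in it is an assumption made for illustration. The hypothetical `TOPIC_CLUSTER` lookup table stands in for the first machine learning model. The hypothetical `SENTIMENT` table stands in for the sentiment keyword comparison. A fixed-weight blend of two similarity metrics (item overlap and pairwise rank agreement) stands in for the trained second machine learning model. In the claim those weights would be learned from a training dataset with a loss function driven by user input; here they are simply set by hand.

```python
from itertools import combinations

# Hypothetical stand-in for the first machine learning model:
# a fixed keyword -> topic-cluster lookup (assumption, not from the claim).
TOPIC_CLUSTER = {"movies": "film", "films": "film",
                 "albums": "music", "songs": "music"}

# Hypothetical sentiment polarity table (assumption, not from the claim).
SENTIMENT = {"best": 1, "greatest": 1, "favorite": 1, "worst": -1}


def titles_within_proximity(topic_a, sent_a, topic_b, sent_b):
    """True when both topic keywords map to the same cluster and both
    sentiment keywords share a polarity. Unknown keywords never match."""
    cluster_a = TOPIC_CLUSTER.get(topic_a)
    same_cluster = cluster_a is not None and cluster_a == TOPIC_CLUSTER.get(topic_b)
    polarity_a = SENTIMENT.get(sent_a)
    same_sentiment = polarity_a is not None and polarity_a == SENTIMENT.get(sent_b)
    return same_cluster and same_sentiment


def overlap(a, b):
    """Jaccard overlap of the items in two ranked lists (order ignored)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def rank_agreement(a, b):
    """Fraction of shared item pairs that appear in the same relative
    order in both lists. Assumes no duplicate items within a list."""
    pos_b = {item: i for i, item in enumerate(b)}
    shared = [item for item in a if item in pos_b]  # kept in a's order
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 0.0
    agree = sum(1 for x, y in pairs if pos_b[x] < pos_b[y])
    return agree / len(pairs)


def similarity(a, b, w_overlap=0.5, w_rank=0.5):
    """Hand-weighted blend of the two metrics; a trained model would
    learn this combination instead of using fixed weights."""
    return w_overlap * overlap(a, b) + w_rank * rank_agreement(a, b)


def most_similar(query, candidates):
    """Return the name of the candidate list scoring highest against query."""
    return max(candidates, key=lambda name: similarity(query, candidates[name]))


if __name__ == "__main__":
    first = ["Alien", "Heat", "Jaws", "Ran"]
    others = {
        "second": ["Alien", "Jaws", "Heat", "Ran"],
        "third": ["Ran", "Jaws", "Heat", "Alien"],
    }
    if titles_within_proximity("movies", "best", "films", "greatest"):
        print(most_similar(first, others))  # prints "second"
```

In this sketch the title check acts as a gate before the ranked lists are compared, which mirrors the order of the claim steps. The claim itself does not state that one step is conditioned on the other.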