| CPC G06Q 10/063114 (2013.01) [G06N 3/09 (2023.01)] | 20 Claims |

|
1. A method, comprising:
tokenizing a plurality of customer service tickets to generate a plurality of tokenized customer service tickets, wherein the plurality of tokenized customer service tickets comprise textual tokens;
obtaining encodings of the plurality of tokenized customer service tickets in a vector space using an encoding model, wherein the encoding model is trained in a first training stage to learn, using at least one machine learning model, an encoding function from a first set of training data, wherein the encodings of the plurality of tokenized customer service tickets are generated using a self-supervised learning algorithm;
determining, using at least one processing device, pairwise similarities for at least a subset of the encodings of the plurality of tokenized customer service tickets;
obtaining feedback from a user regarding at least some of the pairwise similarities for the subset of the encodings;
updating, using the at least one processing device, one or more of the pairwise similarities for the subset of the encodings of the plurality of tokenized customer service tickets, using at least some of the feedback from the user, to create a second set of training data;
generating, using the at least one processing device and the second set of training data, an updated encoding model in a second training stage by processing the updated pairwise similarities for the subset of the encodings of the plurality of tokenized customer service tickets using a supervised learning algorithm, wherein the generating the updated encoding model comprises, for a given training epoch of a plurality of training epochs,
  obtaining a batch of tokenized customer service tickets from the plurality of tokenized customer service tickets;
  transforming the batch of tokenized customer service tickets into encodings using a current encoding model;
  determining pairwise similarities for the encodings of the batch of tokenized customer service tickets;
  generating a first aggregate similarity value obtained from the pairwise similarities for the encodings of the batch of tokenized customer service tickets;
  generating a second aggregate similarity value obtained from the pairwise similarities for the encodings of the corresponding tokenized customer service tickets in the plurality of tokenized customer service tickets; and
  evaluating a loss function using the first aggregate similarity value and the second aggregate similarity value and applying a supervised learning algorithm to fit the updated encoding model with respect to the loss function; and
processing, using the at least one processing device, at least one tokenized customer service ticket based at least in part on the updated encoding model;
wherein the at least one processing device comprises a processor coupled to a memory.
|
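The per-epoch steps recited in claim 1 (pairwise similarities, user-feedback updates, two aggregate similarity values, and a loss over them) can be illustrated with a minimal sketch. The claim does not specify a similarity measure, an aggregation, a loss form, or a feedback format, so everything concrete here is an assumption made for illustration: cosine similarity, the mean over distinct pairs as the aggregate, a squared difference as the loss, and a dictionary of `(i, j) -> similarity` as the feedback. All function names are hypothetical, and the model-fitting step itself (the supervised update of the encoder) is not shown.

```python
import numpy as np

def cosine_similarity_matrix(encodings):
    """Pairwise cosine similarities between row vectors (assumed measure)."""
    norms = np.linalg.norm(encodings, axis=1, keepdims=True)
    unit = encodings / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T

def apply_feedback(similarities, feedback):
    """Overwrite selected pairwise similarities with user-supplied values.

    `feedback` maps an index pair (i, j) to a corrected similarity; this
    format is an assumption, not something the claim defines.
    """
    updated = similarities.copy()
    for (i, j), value in feedback.items():
        updated[i, j] = value
        updated[j, i] = value  # keep the matrix symmetric
    return updated

def aggregate_similarity(similarities):
    """Mean similarity over distinct pairs (assumed aggregation)."""
    n = similarities.shape[0]
    upper = similarities[np.triu_indices(n, k=1)]
    return float(upper.mean())

def aggregate_loss(batch_similarities, target_similarities):
    """Squared difference between the two aggregate values (assumed loss)."""
    first = aggregate_similarity(batch_similarities)    # from the current model
    second = aggregate_similarity(target_similarities)  # from feedback-updated data
    return (first - second) ** 2

# Toy batch of three ticket encodings in a 2-D vector space.
encodings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
batch_sims = cosine_similarity_matrix(encodings)

# Hypothetical feedback: the user marks tickets 0 and 1 as fully similar.
target_sims = apply_feedback(batch_sims, {(0, 1): 1.0})

loss = aggregate_loss(batch_sims, target_sims)
```

In a full training loop, a gradient-based optimizer would adjust the encoder's parameters to reduce this loss over successive epochs; the sketch stops at evaluating the loss for one batch.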