| CPC G06Q 30/0631 (2013.01) [G06F 16/245 (2019.01); G06N 20/00 (2019.01)] | 14 Claims |

|
1. A method for training a user-clicking-item task model, comprising:
collecting a plurality of pieces of training data, each piece of the training data including at least two pieces of training feature information of a training user, at least two pieces of training feature information of a training item, and click interaction information of the training user clicking the training item; and
performing joint training on an embedding processing module, a recall task module and a click rate estimation task module in the user-clicking-item task model, by using the plurality of pieces of training data, comprising:
for each piece of the training data, obtaining a combined feature expression of the training user by splicing the at least two pieces of training feature information of the training user after performing embedding expression on them by the embedding processing module;
obtaining a combined feature expression of the training item by splicing the at least two pieces of training feature information of the training item after performing embedding expression on them by the embedding processing module;
in the recall task module, obtaining a feature expression of the training user and a feature expression of the training item by processing the combined feature expression of the training user and the combined feature expression of the training item respectively through at least two processing layers including layers for full connection processing and activation processing;
in the recall task module, obtaining a recommendation degree index of the training item to the training user by multiplying the feature expression of the training user with the feature expression of the training item;
in the click rate estimation task module, obtaining a predicted click probability of the training user clicking the training item by processing in turn through at least two processing layers including layers for full connection processing and activation processing and through a sigmoid activation function processing layer, after splicing the combined feature expression of the training user and the combined feature expression of the training item;
generating a comprehensive cross-entropy loss function according to the recommendation degree index of the training item to the training user, the predicted click probability of the training user clicking the training item, and the known click interaction information of the training user clicking the training item;
judging whether the comprehensive cross-entropy loss function converges; adjusting parameters of the embedding processing module, the recall task module and the click rate estimation task module in the user-clicking-item task model in response to determining that the comprehensive cross-entropy loss function does not converge, so that embedding parameters in the embedding processing module are affected through back-propagation and information of interaction between the feature expression of the training user and the feature expression of the training item is recorded in the embedding parameters; and
in response to determining that the comprehensive cross-entropy loss function converges, and in response to determining that the comprehensive cross-entropy loss function converges in all of a first preset number of consecutive rounds of training, determining parameters of the user-clicking-item task model, and thereby determining the user-clicking-item task model.
|
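
For illustration only, the forward pass and loss recited in claim 1 might be sketched as follows. This is a minimal, non-authoritative sketch and not the claimed implementation. Several details are assumptions that the claim does not specify: the field names (`age`, `city`, `category`, `brand`), the layer sizes, the use of ReLU as the activation, passing the recall score through a sigmoid before computing its cross-entropy term, forming the comprehensive loss as an unweighted sum of the two terms, and defining convergence as the loss changing by less than a tolerance between consecutive rounds. The back-propagation step that adjusts the parameters is omitted.

```python
import math
import random

random.seed(0)
EMB_DIM = 4  # assumed embedding width per feature

# Shared embedding processing module: one vector per (field, value) pair.
embedding_table = {}

def embed(field, value):
    key = (field, value)
    if key not in embedding_table:
        embedding_table[key] = [random.uniform(-0.5, 0.5) for _ in range(EMB_DIM)]
    return embedding_table[key]

def combined_expression(features):
    # Splice (concatenate) the embeddings of all feature fields.
    out = []
    for field, value in features.items():
        out.extend(embed(field, value))
    return out

def init_layer(n_in, n_out):
    w = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def dense(x, w, b, relu=True):
    # Full connection processing, optionally followed by ReLU activation.
    out = [sum(xi * wi for xi, wi in zip(x, row)) + bj for row, bj in zip(w, b)]
    return [max(0.0, v) for v in out] if relu else out

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y, eps=1e-7):
    # Binary cross-entropy for one example, clipped for numerical safety.
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

params = {
    "user1": init_layer(8, 8), "user2": init_layer(8, 4),   # recall: user tower
    "item1": init_layer(8, 8), "item2": init_layer(8, 4),   # recall: item tower
    "ctr1": init_layer(16, 8), "ctr2": init_layer(8, 4),    # click rate layers
    "ctr_out": init_layer(4, 1),                            # feeds the sigmoid layer
}

def forward(user_feats, item_feats):
    u = combined_expression(user_feats)
    v = combined_expression(item_feats)
    # Recall task module: two towers, then an inner product.
    u_vec = dense(dense(u, *params["user1"]), *params["user2"], relu=False)
    v_vec = dense(dense(v, *params["item1"]), *params["item2"], relu=False)
    score = sum(a * b for a, b in zip(u_vec, v_vec))  # recommendation degree index
    # Click rate estimation task module: splice, dense layers, sigmoid.
    h = dense(dense(u + v, *params["ctr1"]), *params["ctr2"])
    p_click = sigmoid(dense(h, *params["ctr_out"], relu=False)[0])
    return score, p_click

def joint_loss(score, p_click, label):
    # Assumed form of the comprehensive loss: sum of two cross-entropy terms.
    return bce(sigmoid(score), label) + bce(p_click, label)

def converged_for_rounds(losses, rounds, tol=1e-4):
    # True if the loss changed by less than tol in each of the last `rounds` rounds.
    if len(losses) < rounds + 1:
        return False
    recent = losses[-(rounds + 1):]
    return all(abs(a - b) < tol for a, b in zip(recent, recent[1:]))

user = {"age": "25-34", "city": "Beijing"}
item = {"category": "books", "brand": "acme"}
score, p_click = forward(user, item)
loss = joint_loss(score, p_click, label=1)
```

Because both task modules read the same `embedding_table`, gradients from either loss term would update the shared embeddings during training, which is the mechanism the claim relies on for recording user-item interaction information in the embedding parameters.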