| CPC G06V 30/133 (2022.01) [G06V 30/1912 (2022.01)] | 13 Claims |

|
1. A method of training a text quality assessment model, comprising:
determining a first text satisfying a condition of being a negative sample and a second text satisfying a condition of being a positive sample from a plurality of texts based on indicators for the plurality of texts;
for any text of the first text and the second text, adding a label to the text based on the condition satisfied by the text, wherein the label indicates a category of the text, and the category comprises a low-quality category for the negative sample and a non-low-quality category for the positive sample; and
constituting a training set by the first text having the label and the second text having the label, to train the text quality assessment model,
wherein the text quality assessment model comprises a semantic representation network and a fully connected layer, the semantic representation network is configured to extract a semantic feature, the fully connected layer is configured to map the semantic feature to a category-dimensional space and output a classification prediction result, and before training the text quality assessment model, the method further comprises:
training the semantic representation network based on the plurality of texts, to obtain a pre-trained semantic representation network; and
obtaining the text quality assessment model by splicing the fully connected layer in an output direction of the pre-trained semantic representation network,
wherein the method further comprises: after training the text quality assessment model,
retraining the trained text quality assessment model by using a target text as a training sample, wherein the target text has a manual annotation label indicating a true category of the text.
|
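The labeling step and the classification head recited in claim 1 can be illustrated with a minimal sketch. It is only one possible reading under stated assumptions: the claim does not name the indicators or the conditions, so the `view_rate` and `report_rate` fields, every threshold value, and the helper names `weak_label`, `build_training_set`, `fully_connected`, and `softmax` are hypothetical. The semantic representation network is not modeled; its output is stood in for by a plain feature vector.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

LOW_QUALITY = 0      # category label for negative samples
NON_LOW_QUALITY = 1  # category label for positive samples


@dataclass
class Text:
    content: str
    # Hypothetical indicators; the claim does not specify which are used.
    view_rate: float
    report_rate: float


def weak_label(text: Text,
               max_view_rate_neg: float = 0.05,
               min_report_rate_neg: float = 0.20,
               min_view_rate_pos: float = 0.50,
               max_report_rate_pos: float = 0.01) -> Optional[int]:
    """Return a category label, or None when neither condition is satisfied.

    All thresholds are illustrative assumptions, not values from the claim.
    """
    if (text.view_rate <= max_view_rate_neg
            and text.report_rate >= min_report_rate_neg):
        return LOW_QUALITY
    if (text.view_rate >= min_view_rate_pos
            and text.report_rate <= max_report_rate_pos):
        return NON_LOW_QUALITY
    return None


def build_training_set(texts: List[Text]) -> List[Tuple[str, int]]:
    """Keep only the texts that satisfy one of the two conditions, with labels."""
    training_set = []
    for text in texts:
        label = weak_label(text)
        if label is not None:
            training_set.append((text.content, label))
    return training_set


def fully_connected(feature: List[float],
                    weights: List[List[float]],
                    bias: List[float]) -> List[float]:
    """Map a semantic feature vector to one logit per category."""
    return [sum(w * x for w, x in zip(row, feature)) + b
            for row, b in zip(weights, bias)]


def softmax(logits: List[float]) -> List[float]:
    """Turn logits into a classification prediction (class probabilities)."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

In this sketch, texts that meet neither condition are left out of the training set. The later retraining step on manually annotated target texts would reuse the same head with labels supplied by annotators instead of `weak_label`.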