| CPC G06V 10/762 (2022.01) [G06F 16/285 (2019.01); G06V 10/761 (2022.01); G06V 10/764 (2022.01)] | 17 Claims |

|
1. A data labeling method based on artificial intelligence, comprising:
determining a plurality of samples involved in clustering;
repeatedly performing the following operations as an iterative process, until a convergence condition is satisfied or the number of iterations reaches a threshold, the operations comprising:
pre-clustering the plurality of samples involved in clustering, according to a vector representation of the respective samples involved in clustering, to obtain a plurality of class clusters, wherein each class cluster contains at least one sample involved in clustering;
receiving labeling information for the respective class clusters, wherein the labeling information for each class cluster comprises: at least one sub-cluster contained in that class cluster, and a representative sample in each sub-cluster, wherein each sub-cluster comprises one representative sample and at least one non-representative sample;
re-determining the plurality of samples involved in clustering, according to the labeling information by: taking the representative sample in the sub-cluster in the labeling information for the respective class clusters, as the re-determined plurality of samples involved in clustering;
for the representative sample, determining a non-representative sample that belongs to, in a previous iteration process, a same sub-cluster as the representative sample; and
determining a sub-cluster to which the non-representative sample belongs in a current iteration process, to be the same as a sub-cluster to which the representative sample belongs in the current iteration process; and
determining a clustering result according to the labeling information for the respective class clusters.
|
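The iterative procedure recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several parts are assumptions made only for illustration: the greedy distance-threshold `pre_cluster` stands in for whatever vector-based pre-clustering is used, the `radius` and `growth` parameters are hypothetical, `make_oracle` simulates the human labeling step from hidden ground-truth labels, and convergence is taken to mean an iteration in which no samples were merged. The sketch also allows a sub-cluster consisting of a representative alone, which is looser than the claim wording.

```python
import math
from collections import defaultdict


def pre_cluster(ids, vectors, radius):
    """Greedy threshold clustering (hypothetical stand-in for the pre-clustering step).

    A sample joins the first cluster whose seed lies within `radius`;
    otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of sample ids; the first id is the seed
    for i in ids:
        for cluster in clusters:
            if math.dist(vectors[i], vectors[cluster[0]]) <= radius:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def make_oracle(true_labels):
    """Simulated annotator (assumption): splits a class cluster by hidden label."""
    def label_cluster(cluster):
        groups = defaultdict(list)
        for i in cluster:
            groups[true_labels[i]].append(i)
        # The first sample of each sub-cluster serves as its representative.
        return [(members[0], members[1:]) for members in groups.values()]
    return label_cluster


def iterative_labeling(vectors, label_cluster, radius=1.0, growth=2.0, max_iters=10):
    active = list(range(len(vectors)))  # samples involved in clustering
    rep_of = {i: i for i in active}     # representative each sample was attached to
    for _ in range(max_iters):          # iteration-count threshold
        new_active = []
        for cluster in pre_cluster(active, vectors, radius):
            for rep, others in label_cluster(cluster):
                new_active.append(rep)  # only representatives go to the next round
                for other in others:
                    rep_of[other] = rep
        if len(new_active) == len(active):  # assumed convergence: nothing merged
            break
        active = new_active
        radius *= growth  # hypothetical: coarser pre-clustering each round

    def root(i):
        # A non-representative follows its representative through later iterations.
        while rep_of[i] != i:
            i = rep_of[i]
        return i

    result = defaultdict(list)
    for i in range(len(vectors)):
        result[root(i)].append(i)
    return dict(result)


vectors = [(0.0,), (0.5,), (1.8,), (2.2,), (10.0,), (10.2,)]
labels = ["a", "a", "a", "a", "b", "b"]
print(iterative_labeling(vectors, make_oracle(labels)))
# {0: [0, 1, 2, 3], 4: [4, 5]}
```

In this example, samples 1 and 3 are attached to representatives 0 and 2 in the first iteration. When representative 2 is merged under representative 0 in the second iteration, sample 3 follows it without being labeled again, which is the propagation described in the last two steps of the loop.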