US 12,450,873 B2
Data classification and recognition method and apparatus, device, and medium
Dong Wei, Shenzhen (CN); Jinghan Sun, Shenzhen (CN); Kai Ma, Shenzhen (CN); Liansheng Wang, Shenzhen (CN); and Yefeng Zheng, Shenzhen (CN)
Assigned to TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed by TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed on Dec. 8, 2022, as Appl. No. 18/077,709.
Application 18/077,709 is a continuation of application No. PCT/CN2022/090902, filed on May 5, 2022.
Claims priority of application No. 202110532246.3 (CN), filed on May 17, 2021.
Prior Publication US 2023/0105590 A1, Apr. 6, 2023
Int. Cl. G06V 10/764 (2022.01); G06F 18/213 (2023.01); G06F 18/241 (2023.01); G06F 18/25 (2023.01); G06N 3/088 (2023.01); G06N 3/0895 (2023.01); G06N 3/09 (2023.01); G06N 20/00 (2019.01); G06N 20/20 (2019.01); G06V 10/40 (2022.01); G06V 10/70 (2022.01); G06V 10/77 (2022.01); G06V 10/774 (2022.01); G06V 10/80 (2022.01); G06V 10/94 (2022.01); G06V 30/18 (2022.01); G06V 30/19 (2022.01); A61B 5/00 (2006.01); G06F 18/21 (2023.01); G06F 18/214 (2023.01); G06T 5/60 (2024.01); G06V 10/778 (2022.01); G10L 15/02 (2006.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G10L 25/30 (2013.01)
CPC G06V 10/764 (2022.01) [G06F 18/213 (2023.01); G06F 18/241 (2023.01); G06F 18/25 (2023.01); G06N 3/088 (2013.01); G06N 3/0895 (2023.01); G06N 3/09 (2023.01); G06N 20/00 (2019.01); G06N 20/20 (2019.01); G06V 10/40 (2022.01); G06V 10/70 (2022.01); G06V 10/7715 (2022.01); G06V 10/7753 (2022.01); G06V 10/803 (2022.01); G06V 10/806 (2022.01); G06V 10/809 (2022.01); G06V 10/95 (2022.01); G06V 30/18 (2022.01); G06V 30/19173 (2022.01); A61B 5/7264 (2013.01); A61B 5/7267 (2013.01); G06F 18/2155 (2023.01); G06F 18/2178 (2023.01); G06T 5/60 (2024.01); G06T 2207/20081 (2013.01); G06V 10/7784 (2022.01); G06V 10/7788 (2022.01); G10L 15/02 (2013.01); G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 25/30 (2013.01)] 18 Claims
1. A data classification and recognition method, applied to a computer device, the method comprising:
obtaining a first data set and a second data set, the first data set comprising first data, the first data being unlabeled data, the second data set comprising second data, wherein a sample in the second data is labeled, a data amount of the first data in the first data set is greater than a data amount of the second data in the second data set;
performing unsupervised training on a feature extraction network in a candidate classification model based on the first data;
combining a classification regression network in the candidate classification model and the feature extraction network after the unsupervised training to obtain a base classification model, the classification regression network being configured to perform data classification in a target class set;
performing supervised training on the base classification model by using the second data and corresponding sample labels of the second data set to obtain a first classification model;
obtaining a second classification model, the second classification model being a classification model with a model parameter to be adjusted;
adjusting the model parameter of the second classification model by using a first prediction result of the first data predicted by the first classification model as a reference and based on a second prediction result of the first data predicted by the second classification model, to obtain a data classification model, the first prediction result being utilized as a pseudo-label; and
performing class prediction on target data by using the data classification model to obtain a classification result of the target data.
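
As a non-limiting illustration, the steps recited in claim 1 can be sketched on one-dimensional toy data. Everything in this sketch is an assumption made for illustration and is not taken from the patent: the names `fit_logistic`, `extract`, `teacher`, and `classify`, the synthetic data, and the learning-rate and epoch settings. The "feature extraction network" is replaced by a standardization step whose statistics are estimated from the unlabeled data, and both the "classification regression network" and the second classification model are replaced by a one-parameter logistic regression. A practical embodiment would presumably use neural networks and a more capable unsupervised objective.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, targets, lr=0.5, epochs=200):
    """Fit p(y=1|x) = sigmoid(w*x + b) by gradient descent on cross-entropy.

    Targets may be hard labels (0 or 1) or soft probabilities in [0, 1].
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, t in zip(xs, targets):
            err = sigmoid(w * x + b) - t
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


rng = random.Random(0)

# First data set: many unlabeled samples (hypothetical two-cluster toy data).
unlabeled = [rng.gauss(-2.0, 0.5) for _ in range(100)]
unlabeled += [rng.gauss(2.0, 0.5) for _ in range(100)]

# Second data set: a few labeled samples (fewer than the unlabeled ones).
labeled_x = [-2.1, -1.8, 1.9, 2.2]
labeled_y = [0, 0, 1, 1]

# Stand-in for unsupervised training of the feature extractor:
# standardization statistics are estimated from the unlabeled data only.
mean = sum(unlabeled) / len(unlabeled)
std = math.sqrt(sum((x - mean) ** 2 for x in unlabeled) / len(unlabeled))


def extract(x):
    return (x - mean) / std


# Supervised training of the classification head on top of the extractor
# yields the first classification model (the "teacher" in this sketch).
teacher_w, teacher_b = fit_logistic([extract(x) for x in labeled_x], labeled_y)


def teacher(x):
    return sigmoid(teacher_w * extract(x) + teacher_b)


# The teacher's predictions on the unlabeled data serve as pseudo-labels.
pseudo_labels = [teacher(x) for x in unlabeled]

# Second classification model (the "student"): its parameters are adjusted so
# that its predictions on the unlabeled data match the pseudo-labels.
student_w, student_b = fit_logistic(unlabeled, pseudo_labels)


def classify(x):
    """Class prediction on target data using the trained student."""
    return int(sigmoid(student_w * x + student_b) >= 0.5)


print(classify(2.5), classify(-2.5))
```

In this sketch the student never sees the four ground-truth labels directly. It learns only from the teacher's soft predictions on the larger unlabeled set, which corresponds to the pseudo-label step of the claim.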