US 11,798,278 B2
Method, apparatus, and storage medium for classifying multimedia resource
Yongyi Tang, Shenzhen (CN); Lin Ma, Shenzhen (CN); Wei Liu, Shenzhen (CN); and Lianqiang Zhou, Shenzhen (CN)
Assigned to Tencent Technology (Shenzhen) Company Limited, Shenzhen (CN)
Filed by TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed on Sep. 28, 2020, as Appl. No. 17/034,090.
Application 17/034,090 is a continuation of application No. PCT/CN2019/101298, filed on Aug. 19, 2019.
Claims priority of application No. 201811022608.9 (CN), filed on Sep. 3, 2018.
Prior Publication US 2021/0011942 A1, Jan. 14, 2021
Int. Cl. G06F 16/00 (2019.01); G06V 20/40 (2022.01); G06F 16/435 (2019.01); G06F 16/45 (2019.01); G06F 16/48 (2019.01); G06F 18/23 (2023.01); G06V 10/762 (2022.01); G06V 10/764 (2022.01); G06V 10/77 (2022.01); G06V 10/82 (2022.01)
CPC G06V 20/41 (2022.01) [G06F 16/435 (2019.01); G06F 16/45 (2019.01); G06F 16/48 (2019.01); G06F 18/23 (2023.01); G06V 10/763 (2022.01); G06V 10/764 (2022.01); G06V 10/7715 (2022.01); G06V 10/82 (2022.01)]
18 Claims
 
1. A method for classifying a multimedia resource, the method comprising:
obtaining, by a device comprising a memory storing instructions and a processor in communication with the memory, a multimedia resource;
inputting, by the device, the multimedia resource to a convolutional neural network model to extract a plurality of features of the multimedia resource;
executing, by the device, a machine learning model to perform non-local feature description on the multimedia resource by:
clustering the plurality of features to obtain at least one cluster set, and determining cluster description information of each cluster set, the each cluster set comprising at least one feature of the multimedia resource, and each piece of cluster description information being used for indicating a feature of one cluster set;
determining for each piece of cluster description information, first association information of the each piece of cluster description information, each piece of first association information being used for representing an association between the each piece of cluster description information and the remaining cluster description information,
obtaining at least one piece of first sub-association information of first cluster description information, each piece of first sub-association information being used for representing an association between the first cluster description information and one piece of second cluster description information, the first cluster description information being any one piece of cluster description information, and the second cluster description information being any one piece of information other than the first cluster description information in the at least one piece of cluster description information, wherein the obtaining at least one piece of first sub-association information of first cluster description information comprises applying a learnable parameter of the machine learning model;
obtaining first association information of the first cluster description information according to the at least one piece of first sub-association information and at least one piece of second cluster description information; and
determining at least one piece of target feature description information of the multimedia resource based on the cluster description information of each cluster set and the first association information, each piece of target feature description information being used for representing an association between one piece of cluster description information and the remaining cluster description information; and
classifying, by the device, the multimedia resource based on the at least one piece of target feature description information of the multimedia resource, to obtain a classification result of the multimedia resource.
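
The data flow recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation; everything below is an assumption made for illustration. Random vectors stand in for the convolutional features, plain k-means with centroids stands in for the clustering and the cluster description information, a softmax over projected dot products stands in for the first sub-association information, and the matrices `w_theta`, `w_phi`, `w_g`, `w_cls` and all function names are hypothetical learnable parameters and helpers that do not appear in the claim. The sketch also assumes at least two clusters so that every cluster has "remaining" descriptors to attend to.

```python
import numpy as np


def cluster_descriptors(features, k, iters=10, seed=0):
    """Assumed clustering step: plain k-means, one centroid per cluster, shape (k, d)."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Assign every feature to its nearest centroid.
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = features[labels == c]
            if len(members):  # keep the old centroid if a cluster empties out
                centroids[c] = members.mean(axis=0)
    return centroids


def pairwise_association(desc, w_theta, w_phi):
    """Sub-association weights between each descriptor and every other descriptor."""
    scores = (desc @ w_theta) @ (desc @ w_phi).T   # (k, k) similarity scores
    np.fill_diagonal(scores, -np.inf)              # exclude a descriptor's link to itself
    scores -= scores.max(axis=1, keepdims=True)    # numerically stable softmax
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def non_local_descriptors(desc, w_theta, w_phi, w_g):
    """Combine each descriptor with its association to the remaining descriptors."""
    weights = pairwise_association(desc, w_theta, w_phi)
    association = weights @ (desc @ w_g)           # assumed first association information
    return desc + association                      # assumed target feature description


def classify(target, w_cls, b_cls):
    """Linear classifier with softmax over the flattened target descriptors."""
    logits = target.reshape(-1) @ w_cls + b_cls
    logits -= logits.max()
    exp = np.exp(logits)
    return exp / exp.sum()


# Toy run: 64 stand-in feature vectors of size 8, 4 clusters, 3 classes.
rng = np.random.default_rng(0)
n, d, k, n_classes = 64, 8, 4, 3
features = rng.normal(size=(n, d))                 # placeholder for CNN output
w_theta, w_phi, w_g = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
w_cls = rng.normal(scale=0.1, size=(k * d, n_classes))
b_cls = np.zeros(n_classes)

desc = cluster_descriptors(features, k)
target = non_local_descriptors(desc, w_theta, w_phi, w_g)
probs = classify(target, w_cls, b_cls)
print(probs.shape, float(probs.sum()))
```

In this sketch the masked diagonal is what makes each row of weights describe a descriptor's relation to the other descriptors only, mirroring the claim's distinction between first and second cluster description information. In a trained model the projection matrices would be learned rather than drawn at random.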