US 11,704,487 B2
System and method for fashion attributes extraction
Shanglin Yang, Sunnyvale, CA (US); and Hui Zhou, Sunnyvale, CA (US)
Assigned to BEIJING JINGDONG SHANGKE INFORMATION TECHNOLOGY CO., LTD., Beijing (CN); and JD.COM AMERICAN TECHNOLOGIES CORPORATION, Mountain View, CA (US)
Filed by Beijing Jingdong Shangke Information Technology Co., Ltd., Beijing (CN); and JD.com American Technologies Corporation, Mountain View, CA (US)
Filed on Apr. 4, 2019, as Appl. No. 16/375,308.
Prior Publication US 2020/0320348 A1, Oct. 8, 2020
Int. Cl. G06F 40/30 (2020.01); G06F 40/216 (2020.01); G06N 5/04 (2023.01); G06F 18/214 (2023.01); G06V 30/18 (2022.01); G06V 30/19 (2022.01); G06V 10/82 (2022.01); G06V 20/00 (2022.01); G06V 20/62 (2022.01); G06N 3/08 (2023.01); G06V 30/10 (2022.01)
CPC G06F 40/216 (2020.01) [G06F 18/2155 (2023.01); G06F 40/30 (2020.01); G06N 5/04 (2013.01); G06V 10/82 (2022.01); G06V 20/00 (2022.01); G06V 20/62 (2022.01); G06V 30/18057 (2022.01); G06V 30/19173 (2022.01); G06N 3/08 (2013.01); G06V 30/10 (2022.01)] 17 Claims
 
1. A method for training an inference model using a computing device, comprising:
providing a text-to-vector converter;
providing the inference model and pre-training the inference model using a first number of labeled fashion entries;
providing a second number of fashion entries, wherein the fashion entries are not labeled;
separating each of the second number of fashion entries into a target image and target text;
converting the target text into a category vector and an attribute vector using the text-to-vector converter, wherein the category vector comprises a plurality of dimensions corresponding to categories of fashion, and the attribute vector comprises a plurality of dimensions corresponding to attributes of fashion, wherein the converting the target text into the category vector and the attribute vector comprises:
providing a category name list and an attribute name list, wherein the category name list comprises a word list of categories of fashion, and the attribute name list comprises a word list of attributes of fashion;
initializing the category vector and the attribute vector;
splitting the target text to obtain target words;
comparing each of the target words to the category name list and the attribute name list to obtain a similarity score; and
updating the category vector or the attribute vector when the similarity score is greater than a threshold;
processing the target image using the inference model to obtain processed target image and target image label;
comparing the category vector to the target image label;
when the category vector matches the target image label, updating the target image label based on the category vector and the attribute vector to obtain updated label; and
retraining the inference model using the processed target image and the updated label.
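
The text-to-vector conversion and label-update steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The name lists, the threshold value, the use of `difflib.SequenceMatcher` as the similarity score, the 0/1 vector encoding, the dictionary layout of the image label, and the element-wise merge used for the update are all assumptions made for illustration; the claim does not specify any of them.

```python
from difflib import SequenceMatcher

# Hypothetical name lists and threshold; the claim only requires that such lists exist.
CATEGORY_NAMES = ["dress", "jeans", "shirt"]
ATTRIBUTE_NAMES = ["red", "cotton", "sleeveless"]
THRESHOLD = 0.8


def similarity(word, name):
    """Assumed similarity score: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, word.lower(), name.lower()).ratio()


def text_to_vectors(target_text, threshold=THRESHOLD):
    """Convert target text into a category vector and an attribute vector."""
    # Initialize both vectors, one dimension per list entry.
    category_vector = [0] * len(CATEGORY_NAMES)
    attribute_vector = [0] * len(ATTRIBUTE_NAMES)
    # Split the target text to obtain target words.
    for word in target_text.split():
        # Compare each word to both name lists; update a dimension
        # only when the similarity score exceeds the threshold.
        for i, name in enumerate(CATEGORY_NAMES):
            if similarity(word, name) > threshold:
                category_vector[i] = 1
        for j, name in enumerate(ATTRIBUTE_NAMES):
            if similarity(word, name) > threshold:
                attribute_vector[j] = 1
    return category_vector, attribute_vector


def update_label(image_label, category_vector, attribute_vector):
    """Return an updated label when the text category matches the image label.

    image_label is assumed to be {"category": index, "attributes": [0/1, ...]}.
    Returns None when the category vector does not match, so the entry
    would not be used for retraining in this sketch.
    """
    if category_vector[image_label["category"]] != 1:
        return None
    # Assumed update rule: keep an attribute if either source asserts it.
    merged = [max(a, b) for a, b in zip(image_label["attributes"], attribute_vector)]
    return {"category": image_label["category"], "attributes": merged}


if __name__ == "__main__":
    cat_vec, attr_vec = text_to_vectors("Red cotton dress")
    predicted = {"category": 0, "attributes": [0, 0, 1]}
    print(cat_vec, attr_vec, update_label(predicted, cat_vec, attr_vec))
```

In this sketch an entry whose text-derived category disagrees with the model's prediction is simply skipped; the claim only recites what happens when the two match.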