US 12,260,618 B2
Method and system for fashion attribute detection
Jayavardhana Rama Gubbi Lakshminarasimha, Bangalore (IN); Gaurab Bhattacharya, Bangalore (IN); Nikhil Kilari, Bangalore (IN); Bagyalakshmi Vasudevan, Chennai (IN); and Balamuralidhar Purushothaman, Bangalore (IN)
Assigned to Tata Consultancy Services Limited, Mumbai (IN)
Filed by Tata Consultancy Services Limited, Mumbai (IN)
Filed on Jul. 1, 2022, as Appl. No. 17/810,468.
Claims priority of application No. 202121031998 (IN), filed on Jul. 15, 2021.
Prior Publication US 2023/0069442 A1, Mar. 2, 2023
Int. Cl. G06V 10/82 (2022.01); G06V 10/77 (2022.01); G06V 10/774 (2022.01); G06V 10/776 (2022.01)

CPC G06V 10/7715 (2022.01) [G06V 10/7747 (2022.01); G06V 10/776 (2022.01); G06V 10/82 (2022.01)]

9 Claims

1. A processor implemented method of fashion feature extraction, comprising:

collecting at least one image as input, via one or more hardware processors; and

processing the at least one image using a feature extraction network comprising a plurality of Attentive Multi-scale Feature (AMF) blocks implemented via the one or more hardware processors, using a data model, wherein processing the at least one image by the plurality of AMF blocks comprises:

extracting a plurality of features from the at least one image, by a first subnetwork of the AMF blocks, wherein the first subnetwork enables extraction of coarse features in a parallel manner to aggregate different representations from low-level features for fine-grained image analysis;

identifying and extracting features belonging to different scales, from among the plurality of features extracted from the at least one image, by a second subnetwork of the AMF blocks, wherein the second subnetwork applies a convolution operation on the plurality of features, wherein extracting the plurality of features from the at least one image comprises concatenating a plurality of feature representations obtained from the at least one image by applying the convolution operation on the at least one image;

assigning a unique weightage to each of a plurality of channels used for the convolution operation, based on a determined importance of each of the features belonging to the different scales, by a third subnetwork of the AMF blocks for adaptive channel calibration;

determining a rank for each of the extracted features belonging to the different scales, based on the unique weightage of the corresponding channel, by the third subnetwork; and

generating one or more recommendations of the extracted features based on the determined rank of each of the extracted features; and

verifying accuracy of the generated one or more recommendations of the extracted features using a γ-variant focal loss function, wherein the γ-variant focal loss function is used to train a data model for attribute extraction for addressing class imbalance by penalizing wrongly classified examples and incorporating importance to positive and negative instances, wherein the γ-variant focal loss function is provided by:

wherein y_t and y_p denote ground-truth labels and predicted labels, hyper-parameters γ₁ and γ₂ enable the γ-variant focal loss function to adaptively focus on false positive and false negative hard examples by increasing corresponding cost in the loss function, wherein λ deals with providing different weights to the positive and negative instances, wherein γ₁ and γ₂ are used by the γ-variant focal loss function to separately optimize the attribute extraction network by reducing all the false instances depending on their probability of occurrence for true and false instances.
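
The formula recited in the claim is not reproduced in this text, so the following is only an illustrative sketch and not the claimed expression. It assumes a generic binary focal loss in which the positive term and the negative term each get their own focusing exponent, with a weight that balances the two terms. The function name `two_gamma_focal_loss`, the parameter names `gamma_pos`, `gamma_neg`, and `lam`, the mapping of γ₁ to the positive term and γ₂ to the negative term, and the exact algebraic form are all assumptions made for illustration.

```python
import math


def two_gamma_focal_loss(y_true, y_pred, gamma_pos=2.0, gamma_neg=2.0,
                         lam=0.5, eps=1e-7):
    """Mean binary focal loss with separate exponents per class (assumed form).

    y_true: iterable of ground-truth labels (0 or 1), the claim's y_t.
    y_pred: iterable of predicted probabilities in [0, 1], the claim's y_p.
    gamma_pos: exponent on the positive term (assumed to play the role of γ₁).
    gamma_neg: exponent on the negative term (assumed to play the role of γ₂).
    lam: weight on positives; negatives get (1 - lam) (assumed role of λ).
    """
    y_true = list(y_true)
    y_pred = list(y_pred)
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    total = 0.0
    for t, p in zip(y_true, y_pred):
        # Clip probabilities so the logarithms stay finite.
        p = min(max(p, eps), 1.0 - eps)
        # Positive instances: cost grows as the predicted probability falls.
        pos_term = -lam * t * (1.0 - p) ** gamma_pos * math.log(p)
        # Negative instances: cost grows as the predicted probability rises.
        neg_term = -(1.0 - lam) * (1 - t) * p ** gamma_neg * math.log(1.0 - p)
        total += pos_term + neg_term
    return total / len(y_true)


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    easy = [0.9, 0.1, 0.8, 0.2]   # mostly correct, confident predictions
    hard = [0.2, 0.8, 0.1, 0.9]   # mostly wrong, confident predictions
    print(two_gamma_focal_loss(labels, easy))
    print(two_gamma_focal_loss(labels, hard, gamma_pos=1.0, gamma_neg=3.0))
```

In this assumed form, setting both exponents to zero and `lam` to 0.5 reduces the loss to half of the ordinary binary cross-entropy, while raising either exponent shrinks the contribution of well-classified examples of that class so that hard examples dominate the loss.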