US 12,266,150 B2
Automatically generating training data sets for object recognition
Dehua Cui, Redmond, WA (US); Albert Thambiratnam, Redmond, WA (US); Ming Zhong, Redmond, WA (US); and Wenhui Zhang, Redmond, WA (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Appl. No. 17/292,882
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
PCT Filed Dec. 12, 2018, PCT No. PCT/CN2018/120733
§ 371(c)(1), (2) Date May 11, 2021,
PCT Pub. No. WO2020/118584, PCT Pub. Date Jun. 18, 2020.
Prior Publication US 2021/0406595 A1, Dec. 30, 2021
Int. Cl. G06V 10/764 (2022.01); G06F 18/20 (2023.01); G06F 18/214 (2023.01); G06F 18/23 (2023.01); G06F 18/25 (2023.01); G06N 5/02 (2023.01); G06V 10/82 (2022.01); G06V 20/62 (2022.01); G06V 40/16 (2022.01)
CPC G06V 10/764 (2022.01) [G06F 18/214 (2023.01); G06F 18/23 (2023.01); G06F 18/251 (2023.01); G06F 18/29 (2023.01); G06N 5/02 (2013.01); G06V 10/82 (2022.01); G06V 20/62 (2022.01); G06V 40/16 (2022.01)]
17 Claims
 
1. A method for automatically generating a training data set for object recognition, comprising:
obtaining profile information for a plurality of objects; and
for each object from the plurality of objects:
collecting a group of initial images associated with the object based on identity information of the object included in the profile information of the object;
filtering the group of initial images to obtain a group of filtered images associated with the object, wherein filtering the group of initial images further comprises, for each initial image:
calculating a first relevance score based on a similarity between the initial image and an image in the profile information of the object;
calculating a second relevance score based on a similarity between a description of the initial image and a description of the image in the profile information of the object;
determining that the initial image is a noisy image based on the first relevance score and the second relevance score; and
removing the initial image from the group of initial images in response to the determining that the initial image is a noisy image;
generating a group of training data pairs corresponding to the object by labeling each of the group of filtered images with the identity information of the object;
adding the group of training data pairs into the training data set; and
training an image recognition model based on the training data set, wherein the trained image recognition model is configured to perform image recognition for an input image.
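
The collecting, filtering, labeling, and adding steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not specify how either similarity is computed or how the two relevance scores are combined, so everything below is an assumption made for illustration: cosine similarity over hypothetical precomputed image embeddings for the first score, Jaccard word overlap for the second score, a weighted average compared against a threshold for the noise decision, and the names `Image`, `Profile`, `is_noisy`, `build_training_set`, and `collect`. The final training step is omitted.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Image:
    embedding: Sequence[float]  # hypothetical precomputed feature vector
    description: str            # caption or surrounding text


@dataclass
class Profile:
    identity: str  # identity information, e.g. a name
    image: Image   # reference image held in the profile


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard(text_a, text_b):
    """Word-set overlap between two descriptions."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def is_noisy(candidate, profile, threshold=0.5, weight=0.5):
    """Assumed decision rule: weighted mean of both scores below a threshold."""
    first = cosine(candidate.embedding, profile.image.embedding)
    second = jaccard(candidate.description, profile.image.description)
    return weight * first + (1 - weight) * second < threshold


def build_training_set(profiles, collect):
    """collect(identity) is a hypothetical image search returning Image objects."""
    training_set = []
    for profile in profiles:
        initial = collect(profile.identity)
        filtered = [img for img in initial if not is_noisy(img, profile)]
        # Label each surviving image with the identity to form training pairs.
        training_set.extend((img, profile.identity) for img in filtered)
    return training_set


# Toy data: one profile, one relevant candidate, one unrelated candidate.
profile = Profile("Ada Lovelace", Image([1.0, 0.0], "portrait of Ada Lovelace"))
candidates = [
    Image([0.9, 0.1], "Ada Lovelace portrait"),
    Image([0.0, 1.0], "a red sports car"),
]
pairs = build_training_set([profile], lambda identity: candidates)
print(len(pairs), pairs[0][1])
```

With this toy data the unrelated candidate scores zero on both measures and is dropped, leaving a single pair labeled with the profile's identity. A real system would likely replace both similarity functions with learned models and tune the decision rule on held-out data.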