US 11,961,281 B1
Training and using computer vision model for item segmentations in images
Taewan Kim, Seattle, WA (US); Jesse Norman Clark, Ascot Vale (AU); and Onkar Jayant Dabeer, Sammamish, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Dec. 8, 2021, as Appl. No. 17/545,119.
Int. Cl. G06V 10/774 (2022.01); B25J 9/16 (2006.01); G06V 10/75 (2022.01); G06V 10/764 (2022.01); G06V 20/64 (2022.01)
CPC G06V 10/774 (2022.01) [B25J 9/163 (2013.01); B25J 9/1697 (2013.01); G06V 10/75 (2022.01); G06V 10/764 (2022.01); G06V 20/64 (2022.01)]
20 Claims
1. One or more computer-readable storage media storing instructions that, upon execution on a system, configure the system to perform operations comprising:
training, by at least using labeled data, a first machine learning model, the labeled data indicating a first mask associated with a first object present in a first training image, the first object associated with an item package classification;
generating, by at least using a second training image as input to the first machine learning model, a first pseudo-label indicating a second mask associated with a second object detected by the first machine learning model in the second training image, the second object associated with the item package classification;
generating, by at least using a transformation, a transformed image of the second training image;
determining, based at least in part on the transformation, a second pseudo-label indicating a third mask that is detected in the transformed image and that corresponds to the second mask;
training, by at least using the labeled data and the second pseudo-label, a second machine learning model;
receiving a third image that shows an item package;
determining, by at least using the third image as input to the second machine learning model, a fourth mask associated with a third object detected in the third image, the third object corresponding to the item package; and
causing a manipulation by a robotic manipulator of the item package based at least in part on the fourth mask.
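
The pseudo-labeling steps of claim 1 (first pseudo-label, transformed image, second pseudo-label) can be illustrated with a minimal sketch. It reflects one possible reading of the claim, not the patented implementation. Everything named here is an assumption made for illustration: the horizontal flip as the transformation, the thresholding `toy_teacher` standing in for the trained first machine learning model, the intersection-over-union (IoU) test used to decide whether the mask detected in the transformed image corresponds to the second mask, and the `min_iou` cutoff.

```python
from typing import Callable, List, Optional, Tuple

Grid = List[List[float]]  # toy image or binary mask, stored as rows of numbers


def hflip(grid: Grid) -> Grid:
    """Assumed transformation (the claim names none): mirror each row."""
    return [list(reversed(row)) for row in grid]


def toy_teacher(image: Grid, threshold: float = 0.5) -> Grid:
    """Hypothetical stand-in for the trained first model: threshold pixels."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def iou(a: Grid, b: Grid) -> float:
    """Intersection-over-union of two binary masks of the same shape."""
    pairs = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    inter = sum(1 for x, y in pairs if x and y)
    union = sum(1 for x, y in pairs if x or y)
    return inter / union if union else 0.0


def consistent_pseudo_label(
    image: Grid,
    teacher: Callable[[Grid], Grid],
    transform: Callable[[Grid], Grid],
    min_iou: float = 0.5,  # assumed cutoff, not taken from the claim
) -> Tuple[Grid, Optional[Grid]]:
    """Return the transformed image and a second pseudo-label, or None."""
    # First pseudo-label: the "second mask" detected in the untransformed image.
    second_mask = teacher(image)
    # Transformed copy of the same training image.
    transformed_image = transform(image)
    # Mask detected by the same model in the transformed image.
    detected = teacher(transformed_image)
    # Where the second mask should land once the transformation is applied.
    expected = transform(second_mask)
    # Keep the detected mask only if it corresponds to the mapped second mask.
    if iou(detected, expected) >= min_iou:
        return transformed_image, detected
    return transformed_image, None
```

Under these assumptions, a pseudo-label is kept only when the model's detections agree before and after the transformation; the kept mask would then be combined with the labeled data to train the second model. A model whose output ignores the transformation yields `None` and contributes no second pseudo-label for that image.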