| CPC G06V 10/82 (2022.01) [G06V 10/774 (2022.01)] | 20 Claims |

|
1. A method comprising:
obtaining a set of one or more training images and, for each training image, ground truth instance data that identifies, for each of one or more object instances, a corresponding region of the training image that depicts the object instance;
for each training image in the set:
processing the training image using an instance segmentation neural network to generate an embedding output comprising a respective embedding for each of a first plurality of first output pixels; and
training the instance segmentation neural network to minimize a loss function that includes a first term that, for each of the training images in the set for each of the one or more object instances depicted in the training image, encourages embeddings within positive embedding pairs to be more similar than embeddings within negative embedding pairs, wherein the positive embedding pairs include first embedding pairs that include two output pixels within the region of the training image that depicts the object instance and the negative embedding pairs include first negative embedding pairs that include one output pixel within the region of the training image that depicts the object instance and another output pixel that is not within the region of the training image that depicts the object instance.
|
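The first loss term recited in claim 1 can be illustrated with a minimal sketch. The claim only requires that embeddings within positive pairs be encouraged to be more similar than embeddings within negative pairs; it does not specify a similarity measure or a functional form. The cosine similarity, the hinge form, the `margin` value, the flattened per-pixel layout, the use of id `0` for background, and the function names below are all assumptions made for illustration, not the claimed implementation.

```python
import math
from itertools import combinations, product


def cosine(u, v):
    """Cosine similarity between two embedding vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def pairwise_embedding_loss(embeddings, instance_ids, margin=0.5):
    """Hypothetical hinge-style version of the first loss term.

    embeddings:   list of per-pixel embedding vectors (output pixels, flattened).
    instance_ids: list of ints of the same length; 0 marks background (assumption),
                  any other value is a ground truth object instance.
    """
    total, count = 0.0, 0
    for inst in sorted(set(instance_ids) - {0}):
        inside = [i for i, k in enumerate(instance_ids) if k == inst]
        outside = [i for i, k in enumerate(instance_ids) if k != inst]
        # Positive pairs: two output pixels inside the instance region.
        positives = list(combinations(inside, 2))
        # Negative pairs: one pixel inside the region, one pixel outside it.
        negatives = list(product(inside, outside))
        for (a, b), (c, d) in product(positives, negatives):
            s_pos = cosine(embeddings[a], embeddings[b])
            s_neg = cosine(embeddings[c], embeddings[d])
            # Penalize when the positive pair is not more similar by at least `margin`.
            total += max(0.0, margin - (s_pos - s_neg))
            count += 1
    return total / count if count else 0.0


# Two instances with well separated embeddings, then a collapsed embedding.
ids = [1, 1, 2, 2]
separated = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
collapsed = [[1.0, 0.0]] * 4
loss_separated = pairwise_embedding_loss(separated, ids)
loss_collapsed = pairwise_embedding_loss(collapsed, ids)
```

In this toy example the separated embeddings incur a lower loss than the collapsed ones, which is the behavior the first term is meant to encourage. Enumerating all pairs is only practical for tiny inputs; a real training setup would presumably sample pairs, which the claim leaves open.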