US 12,272,122 B1
Techniques for optimizing object detection frameworks
Stefan Matcovici, Iasi (RO); Alin-Ionut Popa, Bucharest (RO); and Daniel Voinea, Vlădiceasca (RO)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Aug. 11, 2022, as Appl. No. 17/886,271.
Int. Cl. G06V 10/774 (2022.01); G06V 10/22 (2022.01); G06V 10/74 (2022.01); G06V 10/75 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01)

CPC G06V 10/774 (2022.01) [G06V 10/22 (2022.01); G06V 10/759 (2022.01); G06V 10/761 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01)]

20 Claims

1. A computer-implemented method, comprising:

receiving, by a computing device, proposed region data identifying a region within an image and a corresponding feature representation associated with the region, the proposed region data being generated, at least in part, by a region proposal neural network of an object detection framework, the object detection framework comprising the region proposal neural network and a convolutional neural network that comprises a classifier and a regressor;

obtaining a set of novel images, each of the set of novel images being associated with a respective classification label, the respective classification label being different from classification labels previously used to train the classifier of the object detection framework;

selecting a subset of novel images from the set of novel images based at least in part on determining a degree of similarity between the proposed region data and each of the set of novel images;

generating, from the subset of novel images, a probability distribution indicating 1) a set of classification labels associated with the subset of novel images and 2) probability values corresponding to each of the set of classification labels;

generating, from the subset of novel images, a weighted average of corresponding feature representations for each of the subset of novel images;

executing first operations to cause the classifier to generate first output based at least in part on the probability distribution, the first output identifying one or more classification labels for the image; and

executing second operations to cause the regressor to generate second output based at least in part on the weighted average of the corresponding feature representations for each of the subset of novel images, the second output identifying one or more bounding boxes within the image, the first output and the second output being correlated to identify one or more objects and corresponding locations of the one or more objects within the image.
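
For illustration only, the selecting and generating steps of claim 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the claimed implementation. The claim does not say how the degree of similarity is measured, how the subset is chosen, or how the weights are derived, so cosine similarity, top-k selection, and softmax weighting are assumptions made here. Every function name and the toy data are hypothetical. The region proposal network, the classifier, and the regressor are not modeled; the sketch only produces the two intermediate values the claim says they consume.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_similar(region_feature, novel_examples, k):
    """Return the k novel examples most similar to the proposed region.

    novel_examples is a list of (label, feature) pairs. Each result is a
    (similarity, label, feature) triple, sorted by descending similarity.
    """
    scored = [
        (cosine_similarity(region_feature, feature), label, feature)
        for label, feature in novel_examples
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def softmax(values):
    """Numerically stable softmax, used here to turn similarities into weights."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def label_distribution(selected):
    """Probability per classification label, summed over the selected subset."""
    weights = softmax([similarity for similarity, _, _ in selected])
    distribution = defaultdict(float)
    for weight, (_, label, _) in zip(weights, selected):
        distribution[label] += weight
    return dict(distribution)


def weighted_feature(selected):
    """Weighted average of the selected feature representations."""
    weights = softmax([similarity for similarity, _, _ in selected])
    dim = len(selected[0][2])
    return [
        sum(weight * feature[i] for weight, (_, _, feature) in zip(weights, selected))
        for i in range(dim)
    ]


# Toy data (hypothetical): one proposed-region feature and four labeled novel examples.
region = [1.0, 0.0]
novel = [
    ("cat", [1.0, 0.0]),
    ("dog", [0.0, 1.0]),
    ("cat", [0.9, 0.1]),
    ("bird", [-1.0, 0.0]),
]

selected = select_similar(region, novel, k=2)
dist = label_distribution(selected)   # would feed the classifier in the claim
avg = weighted_feature(selected)      # would feed the regressor in the claim
print(dist)
print(avg)
```

In this sketch the same softmax weights drive both the label distribution and the feature average, which keeps the two outputs consistent with each other. The claim itself does not require that the two share weights.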