CPC G06T 7/579 (2017.01) [G06F 18/2163 (2023.01); G06F 18/24 (2023.01); G06N 3/04 (2013.01); G06T 7/60 (2013.01); G06T 7/73 (2017.01); G06T 19/006 (2013.01); G06V 10/454 (2022.01); G06V 20/20 (2022.01); G06V 20/647 (2022.01); G06T 2207/20084 (2013.01)] | 18 Claims |
1. A computer-implemented method, the method comprising:
receiving an image of an object captured by a camera;
processing the image of the object using an object recognition neural network that is configured to generate an object recognition output comprising:
data defining a predicted two-dimensional amodal center of the object, wherein the predicted two-dimensional amodal center of the object is a projection of a predicted three-dimensional center of the object under a camera pose of the camera that captured the image;
obtaining data specifying one or more other predicted two-dimensional amodal centers of the object in one or more other images captured under different camera poses; and
determining, from (i) the predicted two-dimensional amodal center of the object in the image and (ii) the one or more other predicted two-dimensional amodal centers of the object, the predicted three-dimensional center of the object.
|
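The determining step of claim 1 can be illustrated with a minimal sketch. The claim does not specify how the three-dimensional center is computed from the per-view two-dimensional amodal centers, so the sketch below swaps in a standard least-squares ray intersection as one possible approach. All names (`project`, `pixel_to_ray`, `solve3`, `triangulate_center`), the pinhole camera model with a single focal length `f` and principal point `(cx, cy)`, and the pose convention `x_cam = R x_world + t` are assumptions made for illustration and are not taken from the claims.

```python
import math


def project(point, R, t, f, cx, cy):
    """Project a world point to pixel coordinates (assumed pinhole model)."""
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    return (f * xc[0] / xc[2] + cx, f * xc[1] / xc[2] + cy)


def pixel_to_ray(uv, R, t, f, cx, cy):
    """Back-project a pixel to (camera center, unit direction) in world frame."""
    d_cam = [(uv[0] - cx) / f, (uv[1] - cy) / f, 1.0]
    # World direction is R^T d_cam; camera center is -R^T t.
    d = [sum(R[j][i] * d_cam[j] for j in range(3)) for i in range(3)]
    norm = math.sqrt(sum(x * x for x in d))
    d = [x / norm for x in d]
    c = [-sum(R[j][i] * t[j] for j in range(3)) for i in range(3)]
    return c, d


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("rays are (nearly) parallel; center is not determined")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= factor * M[col][k]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][k] * x[k] for k in range(i + 1, 3))) / M[i][i]
    return x


def triangulate_center(observations):
    """Estimate a 3D center from 2D amodal centers seen under different poses.

    Each observation is (uv, R, t, f, cx, cy). The result minimizes the sum of
    squared distances to the back-projected rays, i.e. it solves
    sum_i (I - d_i d_i^T) X = sum_i (I - d_i d_i^T) c_i.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for uv, R, t, f, cx, cy in observations:
        c, d = pixel_to_ray(uv, R, t, f, cx, cy)
        for i in range(3):
            for j in range(3):
                p = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += p
                b[i] += p * c[j]
    return solve3(A, b)
```

In this sketch, each predicted two-dimensional amodal center is treated as a ray from its camera, and the returned point is the one closest to all rays. At least two views with non-parallel rays are needed; with noisy network predictions, additional views would typically tighten the estimate.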