CPC G06T 7/80 (2017.01) [G06N 3/04 (2013.01); G06T 7/75 (2017.01); G06V 30/194 (2022.01); G06T 2200/08 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30252 (2013.01); G06T 2210/12 (2013.01)] | 20 Claims |
1. A system, comprising:
a camera positioned to obtain an image of an object; and
a computer including a processor and a memory, the memory storing instructions executable by the processor to:
input the image to a neural network that outputs a three-dimensional (3D) bounding box for the object relative to a pixel coordinate system and object parameters;
then determine a center of a bottom face of the 3D bounding box in pixel coordinates, wherein the bottom face of the 3D bounding box is located in a ground plane in the image;
upon determining an intersection between a first line extending through a vanishing point for the camera and the center of the bottom face and a second line extending along a bottom boundary of the image, determine a first distance, relative to the real-world coordinate system, from the center of the bottom face to the intersection;
determine a second distance, relative to the real-world coordinate system, from the intersection to the optical axis of the camera;
based on calibration parameters for the camera that transform pixel coordinates into real-world coordinates and the first and second distances, determine a) a distance from the center of the bottom face of the 3D bounding box to the camera relative to a real-world coordinate system and b) an angle between a line extending from the camera to the center of the bottom face of the 3D bounding box and an optical axis of the camera, wherein the calibration parameters include a camera height relative to the ground plane, a camera focal distance, and a camera tilt relative to the ground plane; and
determine a six degree-of-freedom (6DoF) pose for the object based on the object parameters, the distance, and the angle.
|
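The geometric steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: it substitutes a standard flat-ground pinhole back-projection for whatever transform the calibration parameters define. All function names are hypothetical. The principal point (cx, cy) and a focal length in pixels are assumed inputs that the claim does not name. The vanishing point is assumed to be the image of the optical axis's ground direction, and the claimed distance and angle are read here as ground-plane range and horizontal bearing. Under those assumptions, the line through the vanishing point maps to a ground line parallel to the optical axis, so the first distance is a longitudinal offset and the second a lateral offset.

```python
import math

def bottom_face_center(bottom_corners):
    """Average the bottom-face corners, given as (u, v) pixel pairs.
    This approximates the projected center of the bottom face."""
    n = len(bottom_corners)
    return (sum(u for u, _ in bottom_corners) / n,
            sum(v for _, v in bottom_corners) / n)

def intersect_bottom_boundary(vanishing_point, center, bottom_v):
    """Intersect the line through the vanishing point and the center
    with the horizontal image row v = bottom_v."""
    (u0, v0), (u1, v1) = vanishing_point, center
    if v1 == v0:
        raise ValueError("line is parallel to the bottom boundary")
    t = (bottom_v - v0) / (v1 - v0)
    return (u0 + t * (u1 - u0), float(bottom_v))

def pixel_to_ground(pixel, cx, cy, focal_px, cam_height, tilt):
    """Back-project a pixel onto a flat ground plane (assumed model).
    tilt is the downward pitch of the optical axis in radians.
    Returns (lateral, forward) offsets in the units of cam_height,
    measured from the ground point directly below the camera."""
    u, v = pixel
    dx = (u - cx) / focal_px
    dy = (v - cy) / focal_px
    down = dy * math.cos(tilt) + math.sin(tilt)      # vertical ray component
    forward = math.cos(tilt) - dy * math.sin(tilt)   # horizontal ray component
    if down <= 0:
        raise ValueError("pixel lies on or above the horizon")
    scale = cam_height / down                        # stretch the ray to the ground
    return (scale * dx, scale * forward)

def range_and_bearing(ground_point):
    """Ground-plane distance to the point and its horizontal angle
    from the ground projection of the optical axis."""
    x, z = ground_point
    return math.hypot(x, z), math.atan2(x, z)

# Illustrative values only (not taken from the patent).
cx, cy, focal_px = 640.0, 360.0, 1000.0
cam_height, tilt = 1.5, 0.0
vanishing_point = (cx, cy - focal_px * math.tan(tilt))

corners = [(700, 440), (780, 440), (700, 480), (780, 480)]
center = bottom_face_center(corners)
hit = intersect_bottom_boundary(vanishing_point, center, bottom_v=719)

center_xz = pixel_to_ground(center, cx, cy, focal_px, cam_height, tilt)
hit_xz = pixel_to_ground(hit, cx, cy, focal_px, cam_height, tilt)

first_distance = math.dist(center_xz, hit_xz)   # center to intersection
second_distance = abs(hit_xz[0])                # intersection to optical axis
distance, angle = range_and_bearing(center_xz)
```

With the distance and angle in hand, the remaining step of the claim combines them with the object parameters output by the neural network to form the 6DoF pose. That step depends on the network's output format and is not sketched here.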