| CPC G06T 7/70 (2017.01) [G06T 17/00 (2013.01); G06V 10/771 (2022.01); G06V 10/82 (2022.01); G06T 2207/20081 (2013.01); G06T 2207/30252 (2013.01); G06V 2201/07 (2022.01)] | 17 Claims |

|
1. A method of three-dimensional (3D) object detection using an object detection neural network comprising a two-dimensional (2D) feature extracting part, a 3D feature extracting part, and an outdoor object detecting part comprising parallel 2D convolutional layers for classification and location, which are pre-trained in an end-to-end manner based on posed monocular images, the method comprising:
receiving one or more monocular images;
extracting 2D feature maps from each one of the one or more monocular images by passing the one or more monocular images through the 2D feature extracting part of the object detection neural network;
generating an averaged 3D voxel volume based on the 2D feature maps;
extracting a 2D representation of 3D feature maps from the averaged 3D voxel volume by passing the averaged 3D voxel volume through an encoder of the 3D feature extracting part of the object detection neural network; and
performing 3D object detection as 2D object detection in a Bird's Eye View (BEV) plane, the 2D object detection in the BEV plane being performed by passing the 2D representation of 3D feature maps through the outdoor object detecting part of the object detection neural network.
|
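
The data flow recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the claimed implementation, and every name in it (`average_voxel_volume`, `collapse_to_bev`, `bev_heads`) is hypothetical. The sketch assumes that each posed image comes with a 3x4 pinhole projection matrix, that voxel features are sampled at the nearest pixel, that the learned encoder can be stood in for by a mean over the height axis, and that the parallel 2D convolutional layers can be stood in for by two 1x1 convolutions with externally supplied weights. None of these choices is stated in the claim.

```python
import numpy as np


def average_voxel_volume(feature_maps, projections, voxel_centers):
    """Back-project 2D feature maps into a voxel grid and average over images.

    feature_maps:  (N, C, H, W) features from the 2D feature extracting part.
    projections:   (N, 3, 4) assumed pinhole projection matrices (world -> pixels).
    voxel_centers: (X, Y, Z, 3) world coordinates of the voxel centers.
    Returns the averaged volume (C, X, Y, Z) and per-voxel view counts (X, Y, Z).
    """
    _, channels, height, width = feature_maps.shape
    grid_shape = voxel_centers.shape[:3]
    points = voxel_centers.reshape(-1, 3)
    homogeneous = np.concatenate([points, np.ones((len(points), 1))], axis=1)

    accumulated = np.zeros((channels, len(points)))
    counts = np.zeros(len(points))
    for features, projection in zip(feature_maps, projections):
        pixels = homogeneous @ projection.T            # (V, 3) homogeneous pixels
        depth = pixels[:, 2]
        in_front = depth > 1e-6                        # voxel lies in front of the camera
        safe_depth = np.where(in_front, depth, 1.0)    # avoid dividing by zero or negatives
        u = np.floor(pixels[:, 0] / safe_depth).astype(int)
        v = np.floor(pixels[:, 1] / safe_depth).astype(int)
        valid = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
        # Nearest-pixel sampling (an assumption; the claim does not fix the sampling).
        accumulated[:, valid] += features[:, v[valid], u[valid]]
        counts[valid] += 1

    # Average only over the images that actually observe each voxel.
    averaged = accumulated / np.maximum(counts, 1)
    return averaged.reshape(channels, *grid_shape), counts.reshape(grid_shape)


def collapse_to_bev(volume):
    """Stand-in for the learned encoder: reduce the height (Z) axis by a mean."""
    return volume.mean(axis=3)                         # (C, X, Y)


def bev_heads(bev, class_weights, box_weights):
    """Two parallel 1x1 convolutions over the BEV map, written as einsum.

    class_weights: (num_classes, C); box_weights: (num_box_params, C).
    """
    class_logits = np.einsum("kc,cxy->kxy", class_weights, bev)
    box_params = np.einsum("kc,cxy->kxy", box_weights, bev)
    return class_logits, box_params
```

Averaging over the per-voxel view count, rather than over the total number of images, keeps voxels seen by only some cameras on the same scale as voxels seen by all of them. A voxel that no camera observes keeps a zero feature vector in this sketch.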