CPC G06V 20/64 (2022.01) [G05D 1/0251 (2013.01); G06F 18/214 (2023.01); G06F 18/2163 (2023.01); G06V 20/41 (2022.01); G06V 20/46 (2022.01); G06V 20/56 (2022.01); G05D 2201/0213 (2013.01); G06N 3/04 (2013.01); G06V 2201/08 (2022.01)] | 14 Claims |
1. A method for 3D object detection, comprising:
detecting semantic keypoints from monocular images of a video stream capturing a 3D object;
inferring 3D bounding boxes of the 3D object by indexing the inferred 3D bounding box according to predicted keypoint coordinates corresponding to the detected semantic keypoints;
scoring the inferred 3D bounding boxes of the 3D object according to an objectness score, an object classification score, and 10D bounding box parameters predicted according to the predicted coordinates of the detected semantic keypoints;
discarding overlapping ones of the inferred 3D bounding boxes as redundant based on a user-defined overlap threshold using non-maxima suppression to determine a final set of 3D bounding boxes; and
detecting the 3D object according to the final set of 3D bounding boxes generated based on the scoring of the inferred 3D bounding boxes using score-thresholding and the non-maxima suppression.
|
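The scoring, score-thresholding, and non-maxima suppression steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and several simplifications are assumptions made only for illustration: boxes are axis-aligned rather than described by the 10D bounding box parameters, the combined score is taken as the product of the objectness and object classification scores (the claim does not state a combination rule), and overlap is measured as 3D intersection-over-union. The names `Box3D`, `iou_3d`, and `filter_boxes` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box3D:
    # Simplifying assumption: axis-aligned box given by its min and max corners.
    lo: Tuple[float, float, float]
    hi: Tuple[float, float, float]
    objectness: float
    class_score: float

    @property
    def score(self) -> float:
        # Assumed combination rule; the claim does not specify one.
        return self.objectness * self.class_score

    @property
    def volume(self) -> float:
        v = 1.0
        for a, b in zip(self.lo, self.hi):
            v *= max(0.0, b - a)
        return v


def iou_3d(a: Box3D, b: Box3D) -> float:
    """Intersection-over-union of two axis-aligned 3D boxes."""
    inter = 1.0
    for a_lo, a_hi, b_lo, b_hi in zip(a.lo, a.hi, b.lo, b.hi):
        inter *= max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    union = a.volume + b.volume - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(
    boxes: List[Box3D], score_threshold: float, overlap_threshold: float
) -> List[Box3D]:
    """Apply score-thresholding, then greedy non-maxima suppression."""
    # Score-thresholding: drop low-scoring candidates, rank the rest by score.
    candidates = sorted(
        (b for b in boxes if b.score >= score_threshold),
        key=lambda b: b.score,
        reverse=True,
    )
    kept: List[Box3D] = []
    for box in candidates:
        # Discard a box as redundant if it overlaps an already-kept,
        # higher-scoring box by more than the user-defined threshold.
        if all(iou_3d(box, k) <= overlap_threshold for k in kept):
            kept.append(box)
    return kept
```

Under these assumptions, `filter_boxes(boxes, score_threshold=0.3, overlap_threshold=0.5)` returns the final set of boxes in descending score order. Handling oriented boxes would require replacing `iou_3d` with a rotated-box overlap measure.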