US 12,462,417 B2
Method and electronic device for 3D object detection using neural networks
Danila Dmitrievich Rukhovich, Moscow (RU); Anna Borisovna Vorontsova, Moscow (RU); and Anton Sergeevich Konushin, Moscow (RU)
Assigned to SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed by SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed on Dec. 13, 2022, as Appl. No. 18/080,482.
Application 18/080,482 is a continuation of application No. PCT/KR2022/007472, filed on May 26, 2022.
Claims priority of application No. 2021114905 (RU), filed on May 26, 2021; and application No. 2021128885 (RU), filed on Oct. 4, 2021.
Prior Publication US 2023/0121534 A1, Apr. 20, 2023
Int. Cl. G06T 7/70 (2017.01); G06T 15/10 (2011.01); G06T 17/00 (2006.01); G06V 10/40 (2022.01); G06V 10/771 (2022.01); G06V 10/82 (2022.01); G06V 20/64 (2022.01)
CPC G06T 7/70 (2017.01) [G06T 17/00 (2013.01); G06V 10/771 (2022.01); G06V 10/82 (2022.01); G06T 2207/20081 (2013.01); G06T 2207/30252 (2013.01); G06V 2201/07 (2022.01)]
17 Claims
 
1. A method of three-dimensional (3D) object detection using an object detection neural network comprising a two-dimensional (2D) feature extracting part, a 3D feature extracting part, and an outdoor object detecting part comprising parallel 2D convolutional layers for classification and location, which are pre-trained in end-to-end manner based on posed monocular images, the method comprising:
receiving one or more monocular images;
extracting 2D feature maps from each one of the one or more monocular images by passing the one or more monocular images through the 2D feature extracting part of the object detection neural network,
generating an averaged 3D voxel volume based on the 2D feature maps,
extracting a 2D representation of 3D feature maps from the averaged 3D voxel volume by passing the averaged 3D voxel volume through an encoder of the 3D feature extracting part of the object detection neural network, and
performing 3D object detection as 2D object detection in a Bird's Eye View (BEV) plane, the 2D object detection in the BEV plane being performed by passing the 2D representation of 3D feature maps through the outdoor object detecting part of the object detection neural network.
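
As an illustration of the step of generating an averaged 3D voxel volume, the following is a minimal sketch only, not the claimed implementation. The claim does not state how the averaging is carried out; this sketch assumes that each image contributes a feature vector to every voxel that projects inside that image, and that each voxel takes the mean over the images that see it. The function name `average_voxel_volume`, the flattened voxel index, the use of `None` for voxels outside a view, and the zero fill for voxels seen by no view are all hypothetical choices made for the example.

```python
from typing import List, Optional

Vector = List[float]


def average_voxel_volume(
    per_view_features: List[List[Optional[Vector]]],
) -> List[Vector]:
    """Average per-view voxel features into a single volume.

    per_view_features[v][i] is the feature vector that view v contributes
    to voxel i (voxels flattened to one index), or None when voxel i does
    not project inside the image of view v. (Hypothetical layout.)
    """
    if not per_view_features:
        raise ValueError("at least one view is required")

    num_voxels = len(per_view_features[0])
    # Channel count is taken from the first available feature vector.
    channels = next(
        (len(f) for view in per_view_features for f in view if f is not None),
        0,
    )

    averaged: List[Vector] = []
    for i in range(num_voxels):
        total = [0.0] * channels
        count = 0
        for view in per_view_features:
            feat = view[i]
            if feat is None:
                continue  # this view does not see voxel i
            count += 1
            for c in range(channels):
                total[c] += feat[c]
        # Assumption: voxels seen by no view keep an all-zero feature.
        averaged.append([t / count for t in total] if count else total)
    return averaged


# Two views, three voxels, two feature channels.
views = [
    [[1.0, 2.0], None, [4.0, 0.0]],
    [[3.0, 4.0], None, None],
]
volume = average_voxel_volume(views)
print(volume)
```

With a single monocular image the mean reduces to that image's own features, so the same routine covers the "one or more monocular images" wording of the claim.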