US 12,333,730 B2
Method and apparatus for scene segmentation for three-dimensional scene reconstruction
Yingen Xiong, Mountain View, CA (US); and Christopher A. Peri, Mountain View, CA (US)
Assigned to Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed on Jun. 7, 2022, as Appl. No. 17/805,828.
Claims priority of provisional application 63/245,757, filed on Sep. 17, 2021.
Prior Publication US 2023/0092248 A1, Mar. 23, 2023
Int. Cl. G06T 7/00 (2017.01); G06T 7/11 (2017.01); G06T 7/50 (2017.01); G06V 10/25 (2022.01); G06V 10/32 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 20/20 (2022.01)
CPC G06T 7/11 (2017.01) [G06T 7/50 (2017.01); G06V 10/25 (2022.01); G06V 10/32 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 20/20 (2022.01); G06T 2207/20084 (2013.01); G06T 2207/30196 (2013.01)]
20 Claims
 
1. A method for obtaining scene segmentation, the method comprising:
obtaining, from an image sensor, image data of a real-world scene;
obtaining, from a depth sensor, sparse depth data of the real-world scene;
passing the image data to a first neural network to obtain one or more object regions of interest (ROIs) and one or more feature map ROIs, wherein each object ROI comprises at least one detected object;
passing the image data and the sparse depth data to a second neural network to obtain one or more dense depth map ROIs;
aligning the one or more object ROIs, one or more feature map ROIs, and one or more dense depth map ROIs; and
passing the aligned one or more object ROIs, one or more feature map ROIs, and one or more dense depth map ROIs to a fully convolutional network to obtain a segmentation of the real-world scene, wherein the segmentation contains one or more pixelwise predictions of one or more objects in the real-world scene;
wherein aligning the one or more object ROIs, one or more feature map ROIs, and one or more dense depth map ROIs comprises resizing, using an image-guided filter, at least some of the one or more object ROIs, one or more feature map ROIs, and one or more dense depth map ROIs to a common size.
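
The alignment step recited in the final wherein clause can be illustrated with a minimal sketch. The claim does not specify the filter formulation, the ROI array layout, or the common size, so everything below is an assumption made for illustration: ROIs are NumPy arrays, the common size is 14x14, resampling is nearest-neighbor, and the "image-guided filter" is stood in for by a standard box-filter guided filter applied to the depth ROI with the grayscale object ROI as the guide. The function names (`box_mean`, `resize_nearest`, `guided_filter`, `align_rois`) are hypothetical and do not come from the patent.

```python
import numpy as np

def box_mean(a, r):
    """Mean over a (2r+1)x(2r+1) window, using edge padding at the borders."""
    k = 2 * r + 1
    p = np.pad(a, r, mode="edge")
    # Integral image with a leading row and column of zeros.
    s = np.zeros((p.shape[0] + 1, p.shape[1] + 1))
    s[1:, 1:] = p.cumsum(0).cumsum(1)
    h, w = a.shape
    total = s[k:k + h, k:k + w] - s[:h, k:k + w] - s[k:k + h, :w] + s[:h, :w]
    return total / (k * k)

def resize_nearest(a, size):
    """Nearest-neighbor resample of an (H, W) or (H, W, C) array to size=(h, w)."""
    h, w = size
    rows = np.arange(h) * a.shape[0] // h
    cols = np.arange(w) * a.shape[1] // w
    return a[rows][:, cols]

def guided_filter(guide, src, r=2, eps=1e-3):
    """Smooth src while following edges in guide (both 2-D float arrays)."""
    mean_g = box_mean(guide, r)
    mean_s = box_mean(src, r)
    cov = box_mean(guide * src, r) - mean_g * mean_s
    var = box_mean(guide * guide, r) - mean_g ** 2
    a = cov / (var + eps)        # local linear coefficient
    b = mean_s - a * mean_g      # local offset
    return box_mean(a, r) * guide + box_mean(b, r)

def align_rois(object_roi, feature_roi, depth_roi, size=(14, 14)):
    """Bring one object ROI, feature map ROI, and depth ROI to a common size.

    Assumed layouts: object_roi is (H, W, 3), feature_roi is (H', W', C),
    depth_roi is (H'', W''). The 14x14 default is an arbitrary choice.
    """
    obj = resize_nearest(object_roi, size)
    feat = resize_nearest(feature_roi, size)
    depth = resize_nearest(depth_roi, size)
    # Use the grayscale object ROI as the guide image for the resized depth.
    guide = obj.mean(axis=2)
    depth = guided_filter(guide, depth)
    return obj, feat, depth

# Example call on synthetic data of mismatched sizes.
rng = np.random.default_rng(0)
obj, feat, depth = align_rois(
    rng.random((40, 30, 3)), rng.random((10, 8, 16)), rng.random((20, 15))
)
```

After the call, the three arrays share the same spatial size and could be stacked channel-wise before being passed to a segmentation network. That stacking step is likewise an assumption and is not stated in the claim.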