US 12,190,444 B2
Image-based environment reconstruction with view-dependent colour
Mikko Strandborg, Hangonkylä (FI); and Kimmo Roimela, Tampere (FI)
Assigned to Varjo Technologies Oy, Helsinki (FI)
Filed by Varjo Technologies Oy, Helsinki (FI)
Filed on Feb. 17, 2023, as Appl. No. 18/111,299.
Prior Publication US 2024/0282050 A1, Aug. 22, 2024
Int. Cl. G06T 17/00 (2006.01); G06T 7/11 (2017.01); G06T 7/13 (2017.01); G06T 7/90 (2017.01); G06V 10/74 (2022.01); G06V 10/82 (2022.01)
CPC G06T 17/00 (2013.01) [G06T 7/11 (2017.01); G06T 7/13 (2017.01); G06T 7/90 (2017.01); G06V 10/761 (2022.01); G06V 10/82 (2022.01); G06T 2207/10024 (2013.01); G06T 2207/10028 (2013.01); G06T 2207/20081 (2013.01)] 19 Claims
OG exemplary drawing
 
1. A computer-implemented method comprising:
obtaining a three-dimensional (3D) data structure comprising a plurality of nodes, each node representing a corresponding voxel of a 3D grid of voxels into which a 3D space occupied by a given real-world environment is divided, wherein a given node of the 3D data structure stores given viewpoint information indicative of a given viewpoint from which a given colour image and a given depth image are captured, along with any of:
(i) a given colour tile of the given colour image that captures colour information of a given voxel represented by the given node and a corresponding depth tile of the given depth image that captures depth information of the given voxel from a perspective of the given viewpoint,
(ii) reference information indicative of unique identification of the given colour tile and the corresponding depth tile;
utilising the 3D data structure for training at least one neural network, wherein a given input of the at least one neural network comprises information indicative of a 3D position of a given point in the given real-world environment and a given output of the at least one neural network comprises a colour and an opacity of the given point; and
for a new viewpoint from a perspective of which a given output colour image is to be reconstructed,
determining a set of visible nodes in the 3D data structure whose corresponding voxels are visible from the new viewpoint;
for a given visible node of said set, selecting, from amongst depth tiles of the given visible node, at least one depth tile whose corresponding viewpoint matches the new viewpoint most closely;
reconstructing, from depth tiles that are selected for each visible node of said set, a two-dimensional (2D) geometry of objects represented by pixels of the given output colour image; and
utilising the at least one neural network to render colours for the pixels of the given output colour image.
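The 3D data structure recited in the claim — a grid of voxels whose nodes store capture-viewpoint information plus either the colour/depth tiles themselves or a unique reference to them — can be sketched as follows. This is a minimal illustration, not the patented implementation; all class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    # Viewpoint from which the colour and depth images were captured
    # (e.g. camera position and view direction); representation is illustrative.
    viewpoint: tuple
    colour_tile: Optional[object] = None  # (i) tile cropped from the colour image
    depth_tile: Optional[object] = None   # (i) matching tile from the depth image
    tile_ref: Optional[str] = None        # (ii) or a unique reference to the tiles

class VoxelGrid:
    """Sparse 3D grid: nodes keyed by integer voxel coordinates."""
    def __init__(self, voxel_size: float):
        self.voxel_size = voxel_size
        self.nodes = {}  # (i, j, k) -> list of Node

    def voxel_index(self, point):
        # Map a 3D point in the real-world space to its voxel's grid index.
        return tuple(int(c // self.voxel_size) for c in point)

    def insert(self, point, node: Node):
        self.nodes.setdefault(self.voxel_index(point), []).append(node)

grid = VoxelGrid(voxel_size=0.5)
grid.insert((1.2, 0.3, 2.9),
            Node(viewpoint=((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
                 tile_ref="tile-0001"))
```

A sparse dictionary keyed by voxel index keeps memory proportional to occupied voxels rather than to the full grid; the patent does not mandate this choice.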
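The claimed network takes information indicative of a 3D position as input and produces a colour and an opacity as output. The untrained two-layer perceptron below, in plain Python, only illustrates that input/output contract; the actual architecture, encoding, and training procedure are not specified in this excerpt and the sizes here are arbitrary assumptions.

```python
import math
import random

random.seed(0)
HIDDEN = 8
# Random, untrained weights: 3 position inputs -> HIDDEN -> 4 outputs (r, g, b, opacity).
W1 = [[random.gauss(0, 1) for _ in range(HIDDEN)] for _ in range(3)]
W2 = [[random.gauss(0, 1) for _ in range(4)] for _ in range(HIDDEN)]

def query(point):
    """Map a 3D point to ((r, g, b), opacity), each component in [0, 1]."""
    # Hidden layer with ReLU activation.
    h = [max(0.0, sum(point[i] * W1[i][j] for i in range(3)))
         for j in range(HIDDEN)]
    # Sigmoid output keeps colour and opacity in [0, 1].
    out = [1.0 / (1.0 + math.exp(-sum(h[j] * W2[j][k] for j in range(HIDDEN))))
           for k in range(4)]
    return out[:3], out[3]

rgb, opacity = query([0.1, -0.4, 2.0])
```

In a real system the weights would be fitted to the colour/depth tiles stored in the 3D data structure, per the training step of the claim.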
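The reconstruction step selects, per visible node, the depth tile whose capture viewpoint most closely matches the new viewpoint. The sketch below scores closeness by the angle between view directions; the claim does not prescribe a specific matching metric, so this criterion and the function names are assumptions.

```python
import math

def angle_between(d1, d2):
    """Angle in radians between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    # Clamp to guard against floating-point drift outside acos's domain.
    return math.acos(max(-1.0, min(1.0, dot / (n1 * n2))))

def select_best_tile(tiles, new_view_dir):
    """tiles: list of (capture_view_direction, depth_tile) pairs in a node.
    Returns the depth tile whose viewpoint matches new_view_dir most closely."""
    return min(tiles, key=lambda t: angle_between(t[0], new_view_dir))[1]

tiles = [((1.0, 0.0, 0.0), "tile_a"),
         ((0.0, 1.0, 0.0), "tile_b"),
         ((0.7, 0.7, 0.0), "tile_c")]
best = select_best_tile(tiles, (1.0, 0.0, 0.0))
```

Repeating this per visible node yields the set of depth tiles from which the 2D geometry is reconstructed before the network renders per-pixel colours.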