CPC G06T 17/00 (2013.01) [G06T 7/11 (2017.01); G06T 7/70 (2017.01); G06V 10/25 (2022.01); G06V 10/761 (2022.01); G06V 10/82 (2022.01); G06T 2207/10024 (2013.01); G06T 2207/10028 (2013.01); G06T 2207/20081 (2013.01)] | 35 Claims |
1. A computer-implemented method comprising:
receiving a plurality of colour images of a given real-world environment, a plurality of depth images corresponding to the plurality of colour images, and viewpoint information indicative of corresponding viewpoints from which the plurality of colour images and the plurality of depth images are captured, wherein three-dimensional (3D) positions and orientations of the viewpoints are represented in a given coordinate system;
dividing a 3D space occupied by the given real-world environment into at least one 3D grid of voxels, wherein the at least one 3D grid is represented in the given coordinate system;
creating at least one 3D data structure comprising a plurality of nodes, each node representing a corresponding voxel of the 3D space occupied by the given real-world environment;
dividing a given colour image and a given depth image corresponding to the given colour image into a plurality of colour tiles and a plurality of depth tiles, respectively, wherein the plurality of depth tiles correspond to respective ones of the plurality of colour tiles;
mapping a given colour tile of the given colour image to at least one voxel in the at least one 3D grid whose colour information is captured in the given colour tile, based on depth information captured in a corresponding depth tile of the given depth image and a given viewpoint from which the given colour image and the given depth image are captured;
storing, in a given node of the at least one 3D data structure representing the at least one voxel, given viewpoint information indicative of the given viewpoint from which the given colour image and the given depth image are captured, along with any of:
(i) the given colour tile of the given colour image that captures the colour information of the at least one voxel and the corresponding depth tile of the given depth image that captures the depth information of the at least one voxel,
(ii) reference information indicative of unique identification of the given colour tile and the corresponding depth tile; and
utilising the at least one 3D data structure for training at least one neural network, wherein a given input of the at least one neural network comprises information indicative of a 3D position of a given point in the given real-world environment and a given output of the at least one neural network comprises a colour and an opacity of the given point.
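The mapping and storing steps of the claim can be illustrated with a minimal sketch. All names, the pinhole back-projection model, the grid parameters, and the dictionary-based node store below are illustrative assumptions for exposition, not the claimed implementation:

```python
import numpy as np

# Assumed grid parameters: a 3D grid of cubic voxels over the scene bounds,
# expressed in the same world coordinate system as the viewpoints.
GRID_ORIGIN = np.array([0.0, 0.0, 0.0])
VOXEL_SIZE = 0.25  # metres per voxel edge (assumed)

def back_project(u, v, depth, fx, fy, cx, cy, pose):
    """Back-project pixel (u, v) with its depth value into world space,
    assuming a pinhole camera; `pose` is a 4x4 camera-to-world matrix
    (the 'given viewpoint' of the claim)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    p_cam = np.array([x, y, depth, 1.0])
    return (pose @ p_cam)[:3]

def voxel_index(p_world):
    """Map a world-space point to the integer index of the voxel containing it."""
    return tuple(np.floor((p_world - GRID_ORIGIN) / VOXEL_SIZE).astype(int))

# The '3D data structure': one node per occupied voxel. Each node stores the
# viewpoint information together with a reference (tile id) to the colour/depth
# tile pair, i.e. option (ii) of the claim, rather than the pixel data itself.
nodes = {}

def map_tile_to_voxel(tile_id, u, v, depth, intrinsics, pose):
    """Map a tile (represented here by its centre pixel and centre depth)
    to the voxel whose colour it captures, and record it in that voxel's node."""
    fx, fy, cx, cy = intrinsics
    p_world = back_project(u, v, depth, fx, fy, cx, cy, pose)
    idx = voxel_index(p_world)
    nodes.setdefault(idx, []).append({"viewpoint": pose, "tile_ref": tile_id})
    return idx
```

The populated `nodes` structure can then serve as training data for a NeRF-style network of the kind the claim describes, i.e. one that maps a 3D position to a colour and an opacity; that network itself is not sketched here.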