US 12,340,466 B2
Multiresolution neural networks for 3D reconstruction
Mikko Strandborg, Helsinki (FI); and Kimmo Roimela, Helsinki (FI)
Assigned to Varjo Technologies Oy, Helsinki (FI)
Filed by Varjo Technologies Oy, Helsinki (FI)
Filed on Apr. 25, 2023, as Appl. No. 18/306,295.
Prior Publication US 2024/0362862 A1, Oct. 31, 2024
Int. Cl. G06T 17/00 (2006.01); G06F 3/01 (2006.01); G06T 7/90 (2017.01); G06V 10/25 (2022.01); G06V 10/74 (2022.01); G06V 10/82 (2022.01)
CPC G06T 17/005 (2013.01) [G06F 3/013 (2013.01); G06T 7/90 (2017.01); G06V 10/25 (2022.01); G06V 10/761 (2022.01); G06V 10/82 (2022.01); G06T 2207/10024 (2013.01); G06T 2207/20081 (2013.01)]
23 Claims
 
1. A computer-implemented method comprising:
receiving a plurality of colour images of a given real-world environment, and viewpoint information indicative of corresponding viewpoints from which the plurality of colour images are captured;
utilising a hierarchical data structure to represent a 3D space occupied by the given real-world environment at a plurality of granularity levels, the hierarchical data structure comprising a plurality of nodes, wherein the plurality of nodes comprise different sets of nodes at respective ones of the plurality of granularity levels;
training a plurality of neural networks for 3D reconstruction of objects represented by respective ones of the plurality of nodes, based on the plurality of colour images and the viewpoint information, wherein the plurality of neural networks comprise different sets of neural networks corresponding to the different sets of nodes at the respective ones of the plurality of granularity levels; and
for a given portion of an output image that is to be reconstructed from a perspective of a new viewpoint,
determining a granularity level at which the given portion of the output image is to be reconstructed, based on at least one of: a resolution at which the given portion is being reconstructed, a distance of the new viewpoint from objects being represented in the given portion, whether the given portion corresponds to a user's gaze;
identifying a given node in the hierarchical data structure that corresponds to a given region of the 3D space within which said objects lie, wherein the given node has different sets of child nodes;
selecting a set of child nodes, from amongst the different sets of child nodes, that is at the granularity level at which the given portion of the output image is to be reconstructed; and
for a given child node of the selected set of child nodes, utilising a cascade of neural networks that ends at a neural network corresponding to the given child node, to reconstruct the given portion of the output image, wherein a granularity level of an N+1th neural network in the cascade is higher than a granularity level of an Nth neural network in the cascade, further wherein an input of a given neural network in said cascade comprises outputs of at least a predefined number of previous neural networks in said cascade.
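
The reconstruction steps recited in claim 1 can be illustrated with a minimal, non-authoritative sketch. It is not the patented implementation. Every name in it (Node, choose_level, cascade_for, run_cascade, reconstruct_portion, make_toy_network, build_demo_tree) is hypothetical, the trained neural networks are replaced by toy functions, and the level-selection rule (finest level when the portion is gazed at or closer than a threshold, otherwise coarsest) is an assumption chosen only to show the control flow. The cascade is assumed to follow the ancestor path from the root down to the selected child node, and each network receives the outputs of up to k previous networks, so fewer than k are passed at the start of the cascade.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional

Vector = List[float]
# Hypothetical stand-in for a trained network: (query, previous outputs) -> output.
Network = Callable[[Vector, List[Vector]], Vector]


@dataclass(eq=False)
class Node:
    level: int                       # granularity level of this node
    network: Network                 # network associated with this node's region
    parent: Optional["Node"] = None
    # Different sets of child nodes, keyed by granularity level.
    child_sets: Dict[int, List["Node"]] = field(default_factory=dict)

    def add_child(self, level: int, network: Network) -> "Node":
        if level <= self.level:
            raise ValueError("a child must be at a higher granularity level")
        child = Node(level, network, parent=self)
        self.child_sets.setdefault(level, []).append(child)
        return child


def choose_level(levels: Iterable[int], distance: float, in_gaze: bool,
                 near: float = 2.0) -> int:
    """Assumed heuristic: finest level if gazed at or near, else coarsest."""
    ordered = sorted(levels)
    return ordered[-1] if (in_gaze or distance < near) else ordered[0]


def cascade_for(child: Node) -> List[Node]:
    """Assumed cascade: ancestors from the root down, ending at the child."""
    chain: List[Node] = []
    node: Optional[Node] = child
    while node is not None:
        chain.append(node)
        node = node.parent
    chain.reverse()
    return chain


def run_cascade(chain: List[Node], query: Vector, k: int) -> Vector:
    """Feed each network the outputs of up to k previous networks."""
    outputs: List[Vector] = []
    for node in chain:
        previous = outputs[-k:] if k > 0 else []
        outputs.append(node.network(query, previous))
    return outputs[-1]


def reconstruct_portion(node: Node, query: Vector, distance: float,
                        in_gaze: bool, k: int = 2) -> List[Vector]:
    """Select the child set at the chosen level and run one cascade per child."""
    level = choose_level(node.child_sets.keys(), distance, in_gaze)
    return [run_cascade(cascade_for(c), query, k) for c in node.child_sets[level]]


def make_toy_network(level: int) -> Network:
    """Toy network: adds its level to the query and to the previous outputs."""
    def net(query: Vector, previous: List[Vector]) -> Vector:
        return [query[0] + level + sum(p[0] for p in previous)]
    return net


def build_demo_tree() -> Node:
    """Root (level 0) -> region (level 1) -> child sets at levels 2 and 3."""
    root = Node(0, make_toy_network(0))
    region = root.add_child(1, make_toy_network(1))
    region.add_child(2, make_toy_network(2))
    region.add_child(3, make_toy_network(3))
    return region
```

Calling reconstruct_portion(build_demo_tree(), [0.0], distance=10.0, in_gaze=True) selects the level-3 child set and runs the cascade root, region, child. Granularity levels increase strictly along the cascade because add_child rejects any child whose level is not higher than its parent's.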