CPC G06T 17/10 (2013.01) | 20 Claims |
1. A method for rendering, comprising:
modeling a three-dimensional (3D) scene as a four-dimensional (4D) tensor, a first dimension, a second dimension, and a third dimension in the 4D tensor corresponding to an X-Y-Z coordinate axis in the 3D scene, a fourth dimension representing a channel dimension corresponding to an encoded feature, and the encoded feature being obtained by encoding the 3D scene with an encoder corresponding to a decoder;
performing interpolation on the 4D tensor to obtain streaming scene information associated with the 3D scene; and
rendering the 3D scene on the basis of the streaming scene information;
wherein the encoder comprises a discrete variational auto-encoder configured to operate as a tokenizer, where the discrete variational auto-encoder receives respective input two-dimensional (2D) images of the 3D scene, and generates from the input 2D images respective ones of a plurality of series of discrete tokens representing respective portions of the encoded feature;
wherein the decoder processes the plurality of series of discrete tokens representing respective portions of the encoded feature to generate one or more output 2D images of the 3D scene; and
wherein the encoder and decoder are trained based at least in part on a loss computed between one or more of the input 2D images and respective corresponding ones of the output 2D images.
|
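The interpolation step recited in claim 1 can be illustrated with a small sketch. The claim does not say which interpolation scheme is used, so trilinear interpolation is assumed here purely for illustration. The names `make_grid` and `trilinear_sample` are hypothetical, and the toy feature values stand in for features that an encoder would actually produce. The 4D tensor is held as nested lists indexed `grid[x][y][z][c]`, where the first three indices follow the X-Y-Z axes and the last is the channel dimension.

```python
def make_grid(nx, ny, nz, channels):
    """Build a toy 4D tensor grid[x][y][z][c] (placeholder values, not real encoded features)."""
    return [[[[float(x + 2 * y + 3 * z + c) for c in range(channels)]
              for z in range(nz)]
             for y in range(ny)]
            for x in range(nx)]


def trilinear_sample(grid, x, y, z):
    """Return the channel vector at continuous grid coordinates (x, y, z).

    Assumption: trilinear interpolation over the eight surrounding grid points.
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    channels = len(grid[0][0][0])
    if min(nx, ny, nz) < 2:
        raise ValueError("each spatial dimension needs at least two samples")

    # Clamp the query so it always lies inside the grid.
    x = min(max(float(x), 0.0), nx - 1.0)
    y = min(max(float(y), 0.0), ny - 1.0)
    z = min(max(float(z), 0.0), nz - 1.0)

    # Lower corner of the enclosing cell, kept one step away from the upper edge.
    x0 = min(int(x), nx - 2)
    y0 = min(int(y), ny - 2)
    z0 = min(int(z), nz - 2)

    # Fractional position inside the cell, each in [0, 1].
    tx, ty, tz = x - x0, y - y0, z - z0

    out = [0.0] * channels
    for dx in (0, 1):
        wx = tx if dx else 1.0 - tx
        for dy in (0, 1):
            wy = ty if dy else 1.0 - ty
            for dz in (0, 1):
                wz = tz if dz else 1.0 - tz
                weight = wx * wy * wz
                corner = grid[x0 + dx][y0 + dy][z0 + dz]
                # Accumulate the weighted feature vector of this corner.
                for c in range(channels):
                    out[c] += weight * corner[c]
    return out


if __name__ == "__main__":
    scene = make_grid(3, 3, 3, channels=2)
    print(trilinear_sample(scene, 0.5, 0.25, 1.5))
```

In this sketch, a renderer would call `trilinear_sample` for each point it needs along a viewing ray and pass the resulting feature vectors on to later stages. How the claimed method turns the interpolated values into streaming scene information is not detailed in the claim and is not modeled here.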