CPC: G06T 17/10 (2013.01); 20 Claims

1. A method for rendering, comprising:
    modeling a three-dimensional (3D) scene as a four-dimensional (4D) tensor, a first dimension, a second dimension, and a third dimension of the 4D tensor corresponding respectively to X, Y, and Z coordinate axes of the 3D scene, a fourth dimension being a channel dimension corresponding to an encoded feature, and the encoded feature being obtained by encoding the 3D scene with an encoder corresponding to a decoder;
    performing interpolation on the 4D tensor to obtain streaming scene information associated with the 3D scene; and
    rendering the 3D scene on the basis of the streaming scene information;
    wherein the encoder comprises a discrete variational auto-encoder configured to operate as a tokenizer, wherein the discrete variational auto-encoder receives respective input two-dimensional (2D) images of the 3D scene, and generates from the input 2D images respective ones of a plurality of series of discrete tokens representing respective portions of the encoded feature;
    wherein the decoder processes the plurality of series of discrete tokens representing respective portions of the encoded feature to generate one or more output 2D images of the 3D scene; and
    wherein the encoder and decoder are trained based at least in part on a loss computed between one or more of the input 2D images and respective corresponding ones of the output 2D images.
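The claim's first limitations can be illustrated with a minimal sketch: a 3D scene stored as an (X, Y, Z, C) tensor, with the channel vector at a continuous query point recovered by interpolation over the eight surrounding grid cells. The claim does not name an interpolation scheme; trilinear interpolation is assumed here as one concrete choice, and the function names (`make_tensor`, `trilinear`) are illustrative, not from the specification.

```python
import math

def make_tensor(nx, ny, nz, nc, fill):
    """Build an (nx, ny, nz, nc) 4D tensor as nested lists.

    The first three dimensions correspond to the X, Y, Z axes of the
    scene; the fourth is the channel dimension holding the encoded
    feature. `fill(x, y, z, c)` supplies each entry.
    """
    return [[[[fill(x, y, z, c) for c in range(nc)]
              for z in range(nz)]
             for y in range(ny)]
            for x in range(nx)]

def trilinear(tensor, px, py, pz):
    """Interpolate the C-channel feature vector at continuous (px, py, pz).

    Blends the eight neighbouring grid cells with trilinear weights;
    the point must lie strictly inside the grid's interior cells.
    """
    x0, y0, z0 = int(math.floor(px)), int(math.floor(py)), int(math.floor(pz))
    fx, fy, fz = px - x0, py - y0, pz - z0
    nc = len(tensor[0][0][0])
    out = [0.0] * nc
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                # Weight is the product of per-axis blend factors.
                w = ((fx if dx else 1.0 - fx) *
                     (fy if dy else 1.0 - fy) *
                     (fz if dz else 1.0 - fz))
                cell = tensor[x0 + dx][y0 + dy][z0 + dz]
                for c in range(nc):
                    out[c] += w * cell[c]
    return out
```

A renderer in the sense of the claim would query such interpolated feature vectors along viewing rays to obtain the streaming scene information.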
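The tokenizer and training limitations can likewise be sketched. A full discrete variational auto-encoder is out of scope here; the sketch below substitutes a nearest-neighbour codebook lookup (a vector-quantization stand-in) to show the claimed flow: input 2D image in, series of discrete tokens out, decoded output image back, and a reconstruction loss between input and output for training. All names (`encode`, `decode`, `reconstruction_loss`, `codebook`) are illustrative assumptions, not from the specification.

```python
def encode(image, codebook):
    """Tokenizer stand-in: map each pixel feature vector of a 2D image
    to the index of its nearest codebook entry, yielding a series of
    discrete tokens."""
    def nearest(v):
        return min(range(len(codebook)),
                   key=lambda k: sum((a - b) ** 2
                                     for a, b in zip(v, codebook[k])))
    return [nearest(px) for px in image]

def decode(tokens, codebook):
    """Decoder stand-in: reconstruct an output 2D image by looking up
    each discrete token's codebook vector."""
    return [list(codebook[t]) for t in tokens]

def reconstruction_loss(inp, out):
    """Mean squared error between an input 2D image and the
    corresponding reconstructed output 2D image."""
    n = sum(len(px) for px in inp)
    return sum((a - b) ** 2
               for pi, po in zip(inp, out)
               for a, b in zip(pi, po)) / n
```

In a trained discrete variational auto-encoder the codebook and the networks around it would be learned by minimizing this kind of reconstruction loss, as the final limitation of the claim recites.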