| CPC G06T 15/20 (2013.01) [G06T 7/215 (2017.01); G06T 11/001 (2013.01); G06T 15/06 (2013.01); G06V 10/44 (2022.01); G06V 10/56 (2022.01); H04N 13/117 (2018.05); G06T 2207/30241 (2013.01)] | 20 Claims |

|
1. A method performed by one or more computers, the method comprising:
receiving a video of a scene comprising a plurality of images at respective time points;
receiving a query specifying a particular time point and a new camera viewpoint; and
generating, using a view synthesis machine learning model and the video of the scene, a new image of the scene that appears to be taken from the new camera viewpoint at the particular time point, comprising:
generating, based on the particular time point, a set of source images that comprises one or more images from the video;
generating respective features for each of the source images;
for each of a plurality of pixels of the new image:
sampling a plurality of three-dimensional points along a ray corresponding to the pixel;
for each sampled point, generating, using a first neural network within the view synthesis machine learning model, data defining a motion trajectory of the sampled point around the particular time point;
generating, from the respective features for the source images, respective features for each of the sampled points using the data defining the motion trajectory of the sampled point; and
generating, from the respective features of each of the sampled points, a final color of the pixel in the new image.
|
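The steps recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the claimed implementation, and every name in it is hypothetical. The assumptions are: source images are the frames nearest in time to the query; the "features" of a source image are its RGB values plus a constant density channel (a stand-in for a learned encoder); the first neural network is replaced by a fixed constant-velocity function `toy_trajectory`; all source cameras share one pinhole pose at the origin; and the final color is obtained by standard volume-rendering compositing, which the claim does not specify.

```python
import numpy as np

def select_source_indices(frame_times, query_time, k):
    # Assumption: the source set is the k frames nearest in time to the query.
    gaps = np.abs(np.asarray(frame_times, dtype=float) - query_time)
    return np.sort(np.argsort(gaps, kind="stable")[:k])

def sample_points_along_ray(origin, direction, near, far, num_samples):
    # Evenly spaced 3D samples along the ray for one pixel.
    t = np.linspace(near, far, num_samples)
    return origin[None, :] + t[:, None] * direction[None, :], t

def toy_trajectory(points, query_time, source_times):
    # Hypothetical stand-in for the first neural network: every point moves
    # with a fixed velocity, giving its position at each source time.
    velocity = np.array([0.1, 0.0, 0.0])
    dt = np.asarray(source_times, dtype=float) - query_time      # (S,)
    return points[None, :, :] + dt[:, None, None] * velocity     # (S, N, 3)

def project(points, focal, height, width):
    # Pinhole projection; assumes each source camera sits at the origin
    # looking along +z.
    z = np.clip(points[..., 2], 1e-6, None)
    u = focal * points[..., 0] / z + width / 2
    v = focal * points[..., 1] / z + height / 2
    return u, v

def gather_features(feature_maps, u, v):
    # Nearest-neighbour lookup of per-view features at projected locations.
    S, H, W, _ = feature_maps.shape
    ui = np.clip(np.rint(u).astype(int), 0, W - 1)
    vi = np.clip(np.rint(v).astype(int), 0, H - 1)
    return feature_maps[np.arange(S)[:, None], vi, ui]           # (S, N, C)

def composite(colors, densities, t):
    # Volume-rendering style alpha compositing along the ray (an assumption).
    deltas = np.diff(t, append=t[-1] + (t[-1] - t[-2]))
    alpha = 1.0 - np.exp(-densities * deltas)
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * transmittance
    return (weights[:, None] * colors).sum(axis=0)

def render_pixel(video, frame_times, query_time, ray_origin, ray_dir,
                 focal=4.0, k=2, num_samples=8):
    idx = select_source_indices(frame_times, query_time, k)
    src = video[idx]                                             # (S, H, W, 3)
    S, H, W, _ = src.shape
    # Toy per-image features: RGB plus a constant density channel.
    feats = np.concatenate([src, np.ones((S, H, W, 1))], axis=-1)
    pts, t = sample_points_along_ray(ray_origin, ray_dir, 1.0, 3.0, num_samples)
    traj = toy_trajectory(pts, query_time, np.asarray(frame_times)[idx])
    u, v = project(traj, focal, H, W)
    # Per-point features: mean over source views, sampled along the trajectory.
    point_feats = gather_features(feats, u, v).mean(axis=0)      # (N, 4)
    return composite(point_feats[:, :3], point_feats[:, 3], t)
```

Calling `render_pixel` once per pixel, with the ray derived from the new camera viewpoint, would fill in the new image. In a learned system the trajectory function, the feature encoder, and the feature-to-color step would each be trained components rather than the fixed stand-ins used here.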