US 12,243,273 B2
Neural 3D video synthesis
Zhaoyang Lv, Redmond, WA (US); Miroslava Slavcheva, Seattle, WA (US); Tianye Li, Los Angeles, CA (US); Michael Zollhoefer, Pittsburgh, PA (US); Simon Gareth Green, Deptford (GB); Tanner Schmidt, Seattle, WA (US); Michael Goesele, Woodinville, WA (US); Steven John Lovegrove, Woodinville, WA (US); Christoph Lassner, San Francisco, CA (US); and Changil Kim, Seattle, WA (US)
Assigned to META PLATFORMS TECHNOLOGIES, LLC, Menlo Park, CA (US)
Filed by META PLATFORMS TECHNOLOGIES, LLC, Menlo Park, CA (US)
Filed on Jan. 7, 2022, as Appl. No. 17/571,285.
Claims priority of provisional application 63/142,234, filed on Jan. 27, 2021.
Prior Publication US 2022/0239844 A1, Jul. 28, 2022
Int. Cl. G06T 7/00 (2017.01)
CPC G06T 7/97 (2017.01) [G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)] 17 Claims
OG exemplary drawing
 
1. A method comprising:
    rendering output frames for an output video of a scene, wherein each output frame is rendered by querying an updated neural radiance field (NeRF) using at least one updated latent code respectively associated with a desired time associated with the output frame, a desired viewpoint for the output frame, and ray directions associated with pixels in the output frame, wherein the updated NeRF and the at least one updated latent code are based on:
        a set of pixels selected from a plurality of pixels of at least two frames in a training video for the scene captured by a camera, the set of pixels selected based on temporal variances of the plurality of pixels,
        rendered pixel values for the set of pixels that were identified by querying a pre-trained NeRF using:
            ray directions associated with the set of pixels,
            initialized latent codes respectively associated with times associated with the at least two frames in the training video, and
            a first camera viewpoint associated with the at least two frames; and
        a comparison between the rendered pixel values and original pixel values for the set of pixels.
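The claim above walks through a concrete data flow: pixels are ranked by how much they vary over time (so rays through moving content are favored), a time-conditioned NeRF is queried with per-frame latent codes, ray directions, and a camera viewpoint, and the rendered colors are compared against the original pixels. The sketch below illustrates that flow only; it is not the patented implementation. The frame dimensions, pixel budget k, latent size D, pinhole intrinsics, and the two-layer network standing in for the NeRF MLP are all hypothetical, and a real NeRF would additionally sample 3D points along each ray and volume-render them.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical training video: T frames of H x W RGB pixels from one camera viewpoint.
    T, H, W = 8, 32, 32
    frames = rng.random((T, H, W, 3)).astype(np.float32)

    # 1) Select the set of pixels by temporal variance, so rays through
    #    time-varying (moving) content dominate training.
    variance = frames.var(axis=0).mean(axis=-1)       # (H, W): per-pixel variance over time
    k = 256                                           # illustrative pixel budget
    flat_idx = np.argsort(variance.ravel())[-k:]      # the k most time-varying pixels
    ys, xs = np.unravel_index(flat_idx, (H, W))

    # 2) Initialized latent codes, one per frame time, as recited in the claim.
    D = 16
    latent_codes = rng.normal(size=(T, D)).astype(np.float32)

    # Toy two-layer stand-in for the NeRF MLP: (ray direction, latent code) -> RGB.
    W1 = 0.1 * rng.normal(size=(3 + D, 64)).astype(np.float32)
    W2 = 0.1 * rng.normal(size=(64, 3)).astype(np.float32)

    def query_nerf(ray_dirs, code):
        """Render one RGB value per ray under a single per-time latent code."""
        x = np.concatenate([ray_dirs, np.broadcast_to(code, (len(ray_dirs), D))], axis=-1)
        return 1.0 / (1.0 + np.exp(-(np.maximum(x @ W1, 0.0) @ W2)))  # sigmoid -> [0, 1]

    # Hypothetical pinhole ray directions for the selected pixels at the first viewpoint.
    focal = 0.5 * W
    dirs = np.stack([(xs - W / 2) / focal, (ys - H / 2) / focal, np.ones(k)], axis=-1)
    dirs = (dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)).astype(np.float32)

    # 3) Compare rendered pixel values against the original pixel values for at
    #    least two frames; this loss would drive the NeRF and latent-code updates.
    loss = 0.0
    for t in (0, 1):
        rendered = query_nerf(dirs, latent_codes[t])
        original = frames[t, ys, xs]
        loss += float(np.mean((rendered - original) ** 2))
    print(f"toy photometric loss over two frames: {loss:.4f}")

In the full method, minimizing such a photometric comparison jointly updates the network and the latent codes; rendering an output frame then reduces to querying the updated NeRF with the updated latent code for the desired time, the desired viewpoint, and the ray directions of every pixel in the output frame.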