US 12,475,638 B2
Volumetric performance capture with neural rendering
Sean Ryan Francesco Fanello, San Francisco, CA (US); Abhi Meka, Redwood City, CA (US); Rohit Kumar Pandey, Mountain View, CA (US); Christian Haene, Berkeley, CA (US); Sergio Orts Escolano, San Francisco, CA (US); Christoph Rhemann, Marina Del Rey, CA (US); Paul Debevec, Culver City, CA (US); Sofien Bouaziz, Los Gatos, CA (US); Thabo Beeler, Zurich (CH); Ryan Overbeck, San Francisco, CA (US); Peter Barnum, Mountain View, CA (US); Daniel Erickson, San Francisco, CA (US); Philip Davidson, Arlington, MA (US); Yinda Zhang, Palo Alto, CA (US); Jonathan Taylor, New York, NY (US); Chloe LeGendre, Culver City, CA (US); and Shahram Izadi, San Francisco, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Appl. No. 18/251,743
Filed by GOOGLE LLC, Mountain View, CA (US)
PCT Filed Nov. 5, 2020, PCT No. PCT/US2020/059067
§ 371(c)(1), (2) Date May 4, 2023,
PCT Pub. No. WO2022/098358, PCT Pub. Date May 12, 2022.
Prior Publication US 2023/0419600 A1, Dec. 28, 2023
Int. Cl. G06T 15/50 (2011.01); G06T 7/55 (2017.01); G06T 7/60 (2017.01); G06T 15/04 (2011.01); G06T 15/20 (2011.01)
CPC G06T 15/506 (2013.01) [G06T 7/55 (2017.01); G06T 7/60 (2013.01); G06T 15/04 (2013.01); G06T 15/20 (2013.01); G06T 2207/10048 (2013.01); G06T 2207/10152 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30196 (2013.01)]
20 Claims
 
1. A method comprising:
obtaining, using a camera system and a light stage having a plurality of lights, a plurality of images that depict a subject from a plurality of viewpoints and under a plurality of lighting conditions;
obtaining, using a plurality of infrared cameras, depth data corresponding to the subject;
based on the depth data corresponding to the subject, extracting, using a neural network, a plurality of features of the subject from the plurality of images;
pooling, using the neural network, the plurality of features of the subject into a texture space;
reprojecting the pooled features into an image space;
providing the pooled features reprojected into the image space with one or more graphical buffers as inputs to a neural renderer; and
generating, using the neural renderer, an output image depicting the subject from a target view such that illumination of the subject in the output image aligns with the target view.
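
The data flow recited in claim 1 (pool per-view features into a texture space, reproject them into an image space, and pass them with graphical buffers to a renderer) can be illustrated with a minimal NumPy sketch. This is not the claimed implementation: the function names, the per-pixel UV maps (assumed here to come from geometry reconstructed from the depth data), and the use of plain averaging as the pooling step are all assumptions made for illustration. The feature-extraction network and the neural renderer themselves are not modeled.

```python
import numpy as np

def pool_to_texture(features, uv, valid, tex_size):
    """Average per-view features into one shared UV texture (hypothetical helper).

    features: (V, H, W, C) feature maps, one per input view
    uv:       (V, H, W, 2) texture coordinates in [0, 1) for each pixel (assumed given)
    valid:    (V, H, W) bool mask of pixels that see the subject
    Returns the pooled texture (T, T, C) and a bool coverage mask (T, T).
    """
    channels = features.shape[-1]
    tex_sum = np.zeros((tex_size, tex_size, channels))
    tex_cnt = np.zeros((tex_size, tex_size))
    # Convert continuous UV coordinates to integer texel indices.
    idx = np.clip((uv * tex_size).astype(int), 0, tex_size - 1)
    u = idx[..., 0][valid]
    v = idx[..., 1][valid]
    # Unbuffered accumulation so several views can add to the same texel.
    np.add.at(tex_sum, (v, u), features[valid])
    np.add.at(tex_cnt, (v, u), 1.0)
    pooled = tex_sum / np.maximum(tex_cnt, 1.0)[..., None]
    return pooled, tex_cnt > 0

def reproject_to_image(texture, target_uv, target_mask):
    """Look up pooled texels for every pixel of the target view (nearest texel)."""
    tex_size = texture.shape[0]
    idx = np.clip((target_uv * tex_size).astype(int), 0, tex_size - 1)
    out = texture[idx[..., 1], idx[..., 0]]  # advanced indexing returns a copy
    out[~target_mask] = 0.0                  # background pixels carry no features
    return out

def build_renderer_input(reprojected, buffers):
    """Stack reprojected features with graphical buffers along the channel axis."""
    return np.concatenate([reprojected] + list(buffers), axis=-1)
```

In this sketch the array returned by `build_renderer_input` stands in for the input that the claim provides to the neural renderer; buffers such as per-pixel normals are one plausible choice and are likewise an assumption.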