US 12,265,574 B2
Using interpolation to generate a video from static images
Janne Kontkanen, San Francisco, CA (US); Jamie Aspinall, Mountain View, CA (US); Dominik Kaeser, New York City, NY (US); Navin Sarma, Palo Alto, CA (US); Brian Curless, Seattle, WA (US); and David Salesin, Sausalito, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Dec. 20, 2023, as Appl. No. 18/391,262.
Application 18/391,262 is a continuation of application No. 17/566,462, filed on Dec. 30, 2021, granted, now Pat. No. 11,893,056.
Claims priority of provisional application 63/190,234, filed on May 18, 2021.
Prior Publication US 2024/0126810 A1, Apr. 18, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/75 (2019.01); G06F 16/738 (2019.01); G06F 16/78 (2019.01); G06N 20/00 (2019.01); G06T 7/20 (2017.01); H04N 5/262 (2006.01)

CPC G06F 16/739 (2019.01) [G06F 16/75 (2019.01); G06F 16/7867 (2019.01); G06N 20/00 (2019.01); G06T 7/20 (2013.01); H04N 5/2628 (2013.01)]

20 Claims

1. A computer-implemented method comprising:

selecting, from a collection of images associated with a user account, candidate pairs of images, wherein each candidate pair includes a first static image of a scene and a second static image of the scene from the user account;

applying a filter to select a particular pair of images from the candidate pairs of images based on the particular pair of images failing to meet a threshold similarity;

generating, using an image interpolator, one or more intermediate images based on the particular pair of images;

providing the first static image as input to a depth machine-learning model, the depth machine-learning model outputting a three-dimensional representation of the scene; and

generating a video that includes three or more frames arranged in a sequence, wherein a first frame of the sequence is the first static image, a last frame of the sequence is the second static image, and each of the one or more intermediate images is a corresponding intermediate frame of the sequence between the first frame and the last frame, wherein the video includes the three-dimensional representation of the scene.
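
For illustration only, the selecting, filtering, interpolating, and sequencing steps recited in claim 1 can be sketched as below. This is a minimal sketch under stated assumptions, not the claimed implementation: the `Image` type, the mean-absolute-difference `similarity` metric, the threshold value, and the linear cross-fade used in place of an image interpolator are all hypothetical stand-ins. The depth machine-learning model and the three-dimensional representation are left out because the claim does not specify them in enough detail to sketch.

```python
from itertools import combinations
from typing import List, Optional, Tuple

# Hypothetical image type: flattened grayscale pixels in [0, 1].
Image = List[float]


def similarity(a: Image, b: Image) -> float:
    """Assumed metric: 1 minus the mean absolute pixel difference."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_pair(images: List[Image], threshold: float) -> Optional[Tuple[Image, Image]]:
    """Return the first candidate pair that fails to meet the threshold similarity."""
    for first, second in combinations(images, 2):
        if similarity(first, second) < threshold:
            return first, second
    return None


def interpolate(first: Image, second: Image, count: int) -> List[Image]:
    """Stand-in interpolator: a linear cross-fade producing `count` intermediate images."""
    frames = []
    for k in range(1, count + 1):
        t = k / (count + 1)  # blend weight, strictly between 0 and 1
        frames.append([(1 - t) * x + t * y for x, y in zip(first, second)])
    return frames


def build_video(first: Image, second: Image, count: int = 1) -> List[Image]:
    """Arrange frames in sequence: first image, intermediate images, second image."""
    return [first] + interpolate(first, second, count) + [second]


# Usage with toy two-pixel "images" (values are illustrative only).
collection = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
pair = select_pair(collection, threshold=0.9)
video = build_video(*pair, count=1) if pair else []
```

With `count=1` the resulting sequence has three frames, the minimum the claim recites. A learned interpolator would replace `interpolate` without changing how the sequence is assembled.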