US 12,236,676 B2
Image extension neural networks
Mikael Pierre Bonnevie, Walnut Creek, CA (US); Aaron Maschinot, Somerville, MA (US); Aaron Sarna, Cambridge, MA (US); Shuchao Bi, Mountain View, CA (US); Jingbin Wang, Mountain View, CA (US); Michael Spencer Krainin, Arlington, MA (US); Wenchao Tong, San Jose, CA (US); Dilip Krishnan, Arlington, MA (US); Haifeng Gong, Fremont, CA (US); Ce Liu, Cambridge, MA (US); Hossein Talebi, San Jose, CA (US); Raanan Sayag, San Jose, CA (US); and Piotr Teterwak, Boston, MA (US)
Assigned to Google LLC, Mountain View, CA (US)
Appl. No. 17/438,687
Filed by Google LLC, Mountain View, CA (US)
PCT Filed Jul. 19, 2019, PCT No. PCT/US2019/042509 § 371(c)(1), (2) Date Sep. 13, 2021, PCT Pub. No. WO2020/242508, PCT Pub. Date Dec. 3, 2020.
Claims priority of provisional application 62/854,833, filed on May 30, 2019.
Claims priority of provisional application 62/852,949, filed on May 24, 2019.
Prior Publication US 2022/0148299 A1, May 12, 2022
Int. Cl. G06K 9/00 (2022.01); G06N 3/045 (2023.01); G06T 7/10 (2017.01); G06V 10/82 (2022.01)

CPC G06V 10/82 (2022.01) [G06N 3/045 (2023.01); G06T 7/10 (2017.01); G06T 2207/20132 (2013.01)]

16 Claims

1. A method performed by one or more data processing apparatus, the method comprising:

providing an input that comprises a provided image to a generative neural network having a plurality of generative neural network parameters, wherein:

the generative neural network processes the input in accordance with trained values of the plurality of generative neural network parameters to generate an extended image;

the extended image (i) has more rows, more columns, or both than the provided image, and (ii) is predicted to be an extension of the provided image; and

the generative neural network has been trained using an adversarial loss objective function;

training the generative neural network using the adversarial loss objective function comprises processing a training input that comprises a training image using the generative neural network;

the generative neural network is jointly trained with a discriminative neural network having a plurality of discriminative neural network parameters that is configured to process a given image to generate a discriminative output characterizing a likelihood that the given image was generated using the generative neural network; and

the discriminative neural network is conditioned on a semantic feature representation of a target image, the training image being a cropped representation of the target image, wherein the semantic feature representation of the target image is provided as either: an additional input to the discriminative neural network; or the discriminative output is determined based at least in part on a similarity measure between the semantic feature representation of a target extended image and an intermediate output of the discriminative neural network, and wherein the semantic feature representation of the target extended image is normalized by a normalization engine prior to using it to condition the discriminative neural network.
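
As an illustration only, the second conditioning alternative recited in claim 1 (a similarity measure between a normalized semantic feature representation and an intermediate output of the discriminative neural network) can be sketched in a few lines. Everything below is an assumption made for the example and is not taken from the claim: the function names, the use of L2 normalization as the "normalization engine", the dot product as the similarity measure, the sigmoid squashing, and the center crop used to derive a training image from a target image. The claim does not specify any of these choices.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a feature vector to unit L2 norm (assumed form of normalization)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def dot(a, b):
    """Dot product, used here as the (assumed) similarity measure."""
    return sum(x * y for x, y in zip(a, b))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def center_crop(image, rows, cols):
    """Crop a target image (a list of pixel rows) to rows x cols.

    The cropped result plays the role of the training image; the generator
    would be asked to extend it back toward the target's size.
    """
    height, width = len(image), len(image[0])
    top, left = (height - rows) // 2, (width - cols) // 2
    return [row[left:left + cols] for row in image[top:top + rows]]


def conditioned_discriminator_output(intermediate, semantic, weights, bias=0.0):
    """Hypothetical discriminator head conditioned on semantic features.

    intermediate: intermediate output of the discriminator for the given image
    semantic:     semantic feature representation of the target image
    weights/bias: hypothetical parameters of an unconditional linear head

    The semantic features are normalized before use, then their similarity to
    the intermediate output is added to the unconditional score. The result is
    squashed to (0, 1) and read as the likelihood that the given image was
    generated by the generative neural network.
    """
    normalized = l2_normalize(semantic)
    logit = dot(weights, intermediate) + bias + dot(normalized, intermediate)
    return sigmoid(logit)


# Usage with toy values (all numbers are made up for illustration).
target = [[r * 4 + c for c in range(4)] for r in range(4)]
training_image = center_crop(target, 2, 2)
score = conditioned_discriminator_output(
    intermediate=[1.0, 0.0], semantic=[2.0, 0.0], weights=[0.0, 0.0]
)
```

In this sketch the normalization step keeps the similarity term bounded by the magnitude of the intermediate output, so the conditioning signal cannot dominate the score merely because the semantic features happen to have a large norm. That is one plausible motivation for normalizing before conditioning; the claim itself states only that the normalization occurs.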