US 12,229,972 B2
Unsupervised training of optical flow estimation neural networks
Daniel Rudolf Maurer, Mountain View, CA (US); Austin Charles Stone, San Francisco, CA (US); Alper Ayvaci, Santa Clara, CA (US); Anelia Angelova, Sunnyvale, CA (US); and Rico Jonschkowski, Berlin (DE)
Assigned to Waymo LLC, Mountain View, CA (US)
Filed by Waymo LLC, Mountain View, CA (US)
Filed on Apr. 14, 2022, as Appl. No. 17/721,288.
Claims priority of provisional application 63/175,498, filed on Apr. 15, 2021.
Prior Publication US 2022/0335624 A1, Oct. 20, 2022
Int. Cl. G06T 7/246 (2017.01); G06N 3/08 (2023.01); G06T 3/18 (2024.01); G06T 5/77 (2024.01); G06T 7/174 (2017.01)
CPC G06T 7/248 (2017.01) [G06N 3/08 (2013.01); G06T 3/18 (2024.01); G06T 5/77 (2024.01); G06T 7/174 (2017.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/20132 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method performed by one or more computers and for training a neural network that has a plurality of network parameters and that is configured to receive as input a first image and a second image and to generate as output an optical flow estimate of optical flow between the first image and the second image, the method comprising:
obtaining a batch of one or more training image pairs, each training image pair comprising a respective first training image and a respective second training image;
for each of the one or more training image pairs:
processing the first training image and the second training image using the neural network to generate a final optical flow estimate from the first training image to the second training image;
generating a cropped final optical flow estimate from the first training image to the second training image, comprising cropping the final optical flow estimate from the first training image to the second training image; and
training the neural network on the one or more training image pairs, the training comprising, for each training image pair, using the cropped final optical flow estimate for the training image pair as a target output for the neural network.