US 11,954,899 B2
Systems and methods for training models to predict dense correspondences in images using geodesic distances
Yinda Zhang, Daly City, CA (US); Feitong Tan, Beijing (CN); Danhang Tang, West Hollywood, CA (US); Mingsong Dou, Cupertino, CA (US); Kaiwen Guo, Zurich (CH); Sean Ryan Francesco Fanello, San Francisco, CA (US); Sofien Bouaziz, Los Gatos, CA (US); Cem Keskin, San Francisco, CA (US); Ruofei Du, San Francisco, CA (US); Rohit Kumar Pandey, Mountain View, CA (US); and Deqing Sun, Cambridge, MA (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Appl. No. 18/274,371
Filed by Google LLC, Mountain View, CA (US)
PCT Filed Mar. 11, 2021, PCT No. PCT/CN2021/080137 § 371(c)(1), (2) Date Jul. 26, 2023, PCT Pub. No. WO2022/188086, PCT Pub. Date Sep. 15, 2022.
Prior Publication US 2024/0046618 A1, Feb. 8, 2024
Int. Cl. G06V 10/771 (2022.01); G06T 7/70 (2017.01); G06T 17/00 (2006.01); G06V 10/44 (2022.01); G06V 10/75 (2022.01)

CPC G06V 10/771 (2022.01) [G06T 7/70 (2017.01); G06T 17/00 (2013.01); G06V 10/44 (2022.01); G06V 10/751 (2022.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)]

20 Claims

1. A method of training a neural network to predict correspondences in images, the method comprising:

generating, by one or more processors of a processing system and using the neural network, a first feature map based on a first image of a subject, and a second feature map based on a second image of the subject, the first image and the second image being different and having been generated using a three-dimensional model of the subject;

determining, by the one or more processors, a first feature distance between a first point as represented in the first feature map and a second point as represented in the second feature map, the first point and the second point corresponding to the same feature on the three-dimensional model of the subject;

determining, by the one or more processors, a second feature distance between a third point as represented in the first feature map and a fourth point as represented in the first feature map;

determining, by the one or more processors, a first geodesic distance between the third point and the fourth point as represented in a first surface map, the first surface map corresponding to the first image and having been generated using the three-dimensional model of the subject;

determining, by the one or more processors, a third feature distance between the third point as represented in the first feature map and a fifth point as represented in the first feature map;

determining, by the one or more processors, a second geodesic distance between the third point and the fifth point as represented in the first surface map;

determining, by the one or more processors, a first loss value of a set of loss values, the first loss value being based on the first feature distance;

determining, by the one or more processors, a second loss value of the set of loss values, the second loss value being based on the second feature distance, the third feature distance, the first geodesic distance, and the second geodesic distance; and

modifying, by the one or more processors, one or more parameters of the neural network based at least in part on the set of loss values.
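The two loss values recited in claim 1 can be illustrated with a minimal NumPy sketch. The claim does not specify how the feature distances or losses are computed, so everything below is an assumption made for illustration: the Euclidean feature distance, the margin-based triplet form of the second loss, the `margin` value, and the function names `feature_distance`, `consistency_loss`, and `geodesic_triplet_loss` are all hypothetical and are not taken from the patent.

```python
import numpy as np


def feature_distance(fmap_a, point_a, fmap_b, point_b):
    """Distance between two feature vectors sampled from feature maps.

    fmap_a, fmap_b: arrays of shape (H, W, C).
    point_a, point_b: (row, col) pixel coordinates.
    Assumption: Euclidean (L2) distance; the claim does not name a metric.
    """
    vec_a = fmap_a[point_a[0], point_a[1]]
    vec_b = fmap_b[point_b[0], point_b[1]]
    return float(np.linalg.norm(vec_a - vec_b))


def consistency_loss(d_feat_12):
    """First loss value: based only on the first feature distance.

    Points 1 and 2 are the same surface feature seen in two images,
    so this sketch simply penalizes their feature distance directly.
    """
    return d_feat_12


def geodesic_triplet_loss(d_feat_34, d_feat_35, g_34, g_35, margin=0.1):
    """Second loss value: based on two feature distances and two geodesic distances.

    Assumed form (hypothetical): whichever of points 4 and 5 is geodesically
    nearer to point 3 should also be nearer in feature space, by at least
    `margin`.
    """
    if g_34 <= g_35:
        d_near, d_far = d_feat_34, d_feat_35
    else:
        d_near, d_far = d_feat_35, d_feat_34
    return max(0.0, d_near - d_far + margin)


def total_loss(loss_values):
    """Combine the set of loss values; an unweighted sum is an assumption."""
    return float(sum(loss_values))
```

In a training loop, the combined value would drive the parameter update in the final step of the claim. The geodesic distances would be read from the surface map rendered from the three-dimensional model rather than computed from the images themselves.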