US 11,727,589 B2
System and method to improve multi-camera monocular depth estimation using pose averaging
Vitor Guizilini, Santa Clara, CA (US); Rares Andrei Ambrus, Santa Clara, CA (US); Adrien David Gaidon, San Francisco, CA (US); Igor Vasiljevic, Chicago, IL (US); and Gregory Shakhnarovich, Chicago, IL (US)
Assigned to TOYOTA RESEARCH INSTITUTE, INC., Los Altos, CA (US)
Filed by TOYOTA RESEARCH INSTITUTE, INC., Los Altos, CA (US)
Filed on Jul. 16, 2021, as Appl. No. 17/377,684.
Claims priority of provisional application 63/161,614, filed on Mar. 16, 2021.
Prior Publication US 2022/0301206 A1, Sep. 22, 2022
Int. Cl. G06T 7/55 (2017.01); B60R 1/00 (2022.01); G06T 3/00 (2006.01); G05D 1/02 (2020.01); G06N 3/08 (2023.01); G06T 7/579 (2017.01); G06T 7/292 (2017.01); G06T 7/11 (2017.01); B60W 60/00 (2020.01); G06T 3/40 (2006.01); G06F 18/214 (2023.01); H04N 23/90 (2023.01)
CPC G06T 7/55 (2017.01) [B60R 1/00 (2013.01); B60W 60/001 (2020.02); G05D 1/0212 (2013.01); G05D 1/0246 (2013.01); G06F 18/214 (2023.01); G06F 18/2148 (2023.01); G06N 3/08 (2013.01); G06T 3/0012 (2013.01); G06T 3/0093 (2013.01); G06T 3/40 (2013.01); G06T 7/11 (2017.01); G06T 7/292 (2017.01); G06T 7/579 (2017.01); H04N 23/90 (2023.01); B60R 2300/102 (2013.01); B60W 2420/42 (2013.01); G05D 2201/0213 (2013.01); G06T 2207/10028 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30244 (2013.01); G06T 2207/30252 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method for multi-camera monocular depth estimation using pose averaging, comprising:
determining a multi-camera photometric loss associated with a multi-camera rig of an ego vehicle;
determining a multi-camera pose consistency constraint (PCC) loss associated with the multi-camera rig of the ego vehicle;
adjusting the multi-camera photometric loss according to the multi-camera PCC loss to form a multi-camera PCC photometric loss;
training a multi-camera depth estimation model and an ego-motion estimation model according to the multi-camera PCC photometric loss; and
predicting a 360° point cloud of a scene surrounding the ego vehicle according to the trained multi-camera depth estimation model and the ego-motion estimation model.
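Claim 1 describes combining a pose consistency constraint (PCC) loss, built from a rig-averaged ego-motion estimate, with a multi-camera photometric loss. The following is a minimal illustrative sketch of those two steps only, not the patented implementation: the function names, the chordal-mean rotation averaging, and the fixed `weight` blending are all assumptions introduced here for illustration.

```python
import numpy as np

def average_rotation(rotations):
    """Chordal mean of rotation matrices: project the arithmetic
    mean of the matrices back onto SO(3) via SVD."""
    M = np.mean(rotations, axis=0)
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt
    if np.linalg.det(R) < 0:  # keep a proper rotation (det = +1)
        U[:, -1] *= -1
        R = U @ Vt
    return R

def pose_consistency_loss(rotations, translations):
    """Penalize each camera's ego-motion estimate (R_i, t_i) for
    deviating from the rig-averaged pose (R_avg, t_avg)."""
    R_avg = average_rotation(rotations)
    t_avg = np.mean(translations, axis=0)
    rot_term = sum(np.linalg.norm(R - R_avg) for R in rotations)
    trans_term = sum(np.linalg.norm(t - t_avg) for t in translations)
    return rot_term + trans_term

def pcc_photometric_loss(photometric_loss, rotations, translations,
                         weight=0.1):
    """Adjust the multi-camera photometric loss by the PCC loss
    to form a single training objective."""
    return photometric_loss + weight * pose_consistency_loss(
        rotations, translations)
```

In an actual training loop the combined scalar would backpropagate through both the depth estimation and ego-motion networks; here the poses are plain arrays so the averaging and consistency terms can be inspected in isolation. When all cameras agree on the ego-motion, the PCC term vanishes and the objective reduces to the photometric loss alone.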