US 11,721,065 B2
Monocular 3D vehicle modeling and auto-labeling using semantic keypoints
Arjun Bhargava, San Francisco, CA (US); Sudeep Pillai, Santa Clara, CA (US); Kuan-Hui Lee, San Jose, CA (US); and Kun-Hsin Chen, Mountain View, CA (US)
Assigned to TOYOTA RESEARCH INSTITUTE, INC.
Filed by TOYOTA RESEARCH INSTITUTE, INC., Los Altos, CA (US)
Filed on Aug. 25, 2022, as Appl. No. 17/895,603.
Application 17/895,603 is a continuation of application No. 17/147,049, filed on Jan. 12, 2021, granted, now 11,475,628.
Prior Publication US 2022/0414981 A1, Dec. 29, 2022
Int. Cl. G06T 17/00 (2006.01); G06T 19/20 (2011.01); G06T 7/246 (2017.01); G06T 7/33 (2017.01); G06N 3/04 (2023.01); G06T 7/50 (2017.01); G06V 20/64 (2022.01)

CPC G06T 17/00 (2013.01) [G06N 3/04 (2013.01); G06T 7/251 (2017.01); G06T 7/344 (2017.01); G06T 7/50 (2017.01); G06T 19/20 (2013.01); G06V 20/64 (2022.01); G06T 2207/10016 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30236 (2013.01); G06T 2207/30241 (2013.01); G06T 2207/30252 (2013.01); G06T 2210/12 (2013.01); G06T 2219/004 (2013.01); G06T 2219/2004 (2013.01); G06V 2201/08 (2022.01)]

16 Claims

1. A method for monocular 3D object modeling and auto-labeling with 2D semantic keypoints, comprising:

predicting, using a continuously traversable coordinate shape space (CSS) network, a normalized object coordinate space (NOCS) image and a shape vector corresponding to an input 2D labeled object;

lifting linked, 2D semantic keypoints of a 2D structured object geometry of the input 2D labeled object into a 3D structured object geometry;

geometrically and projectively aligning the 2D NOCS image to the 3D structured object geometry and a 3D object model decoded from the shape vector;

back projecting the 2D semantic keypoints to auto-label 3D bounding boxes from the 3D object model;

enforcing geometric and projective verification constraints on the auto-labeled 3D bounding boxes to identify verified auto-labeled 3D bounding boxes and unverified auto-labeled 3D bounding boxes;

saving the verified auto-labeled 3D bounding boxes in a CSS pool and discarding the unverified auto-labeled 3D bounding boxes; and

retraining the CSS network using the saved, verified auto-labeled 3D bounding boxes.
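
The verification and pooling steps of claim 1 can be illustrated with a minimal Python sketch. The claim does not say which geometric and projective constraints are enforced, so everything here is an assumption made for illustration: the `AutoLabel` container, the axis-aligned box with no yaw, the pinhole intrinsics `(fx, fy, cx, cy)`, the "box lies in front of the camera" geometric check, and the 2D IoU threshold `min_iou` used as the projective check. It is not the patented method.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class AutoLabel:
    """Hypothetical container for one auto-labeled 3D bounding box."""
    center: tuple  # (x, y, z) in the camera frame, z pointing forward
    size: tuple    # (width, height, length)
    box2d: tuple   # (u_min, v_min, u_max, v_max) of the input 2D label


def box_corners(center, size):
    """Eight corners of an axis-aligned 3D box (yaw omitted for brevity)."""
    cx, cy, cz = center
    w, h, l = size
    return [(cx + sx * w / 2, cy + sy * h / 2, cz + sz * l / 2)
            for sx, sy, sz in product((-1, 1), repeat=3)]


def project(points, fx, fy, cx, cy):
    """Pinhole projection of camera-frame points to pixel coordinates."""
    return [(fx * x / z + cx, fy * y / z + cy) for x, y, z in points]


def iou_2d(a, b):
    """Intersection over union of two (u_min, v_min, u_max, v_max) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def verify(label, intrinsics, min_iou=0.7):
    """Assumed checks: box in front of the camera, projection matches 2D label."""
    corners = box_corners(label.center, label.size)
    if any(z <= 0 for _, _, z in corners):
        return False  # assumed geometric constraint
    uv = project(corners, *intrinsics)
    us = [u for u, _ in uv]
    vs = [v for _, v in uv]
    projected = (min(us), min(vs), max(us), max(vs))
    return iou_2d(projected, label.box2d) >= min_iou  # assumed projective constraint


def filter_labels(labels, intrinsics, min_iou=0.7):
    """Keep verified labels in the pool; unverified labels are discarded."""
    return [lab for lab in labels if verify(lab, intrinsics, min_iou)]
```

In this sketch, the list returned by `filter_labels` plays the role of the CSS pool, and a retraining step would consume only its contents. A stricter `min_iou` trades pool size for label quality.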