US 12,288,294 B2
Systems and methods for extrinsic calibration of sensors for autonomous checkout
Dhananjay Singh, San Francisco, CA (US); and Tushar Dadlani, Dublin, CA (US)
Assigned to STANDARD COGNITION, CORP., San Francisco, CA (US)
Filed by STANDARD COGNITION, CORP., San Francisco, CA (US)
Filed on Apr. 29, 2022, as Appl. No. 17/733,680.
Application 17/733,680 is a continuation in part of application No. 17/357,867, filed on Jun. 24, 2021, granted, now 11,361,468.
Claims priority of provisional application 63/045,007, filed on Jun. 26, 2020.
Prior Publication US 2022/0262069 A1, Aug. 18, 2022
Int. Cl. G06T 17/10 (2006.01); G06T 3/40 (2024.01); G06T 3/4046 (2024.01); G06T 7/80 (2017.01)
CPC G06T 17/10 (2013.01) [G06T 3/4046 (2013.01); G06T 7/80 (2017.01); G06T 2207/10028 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method for calibrating cameras in a real space for tracking puts and takes of items by subjects, the method including:
first processing of a first set of one or more images selected from a plurality of sequences of images received from a first plurality of cameras comprising an extrinsic calibration tool, in which images in the plurality of sequences of images have respective fields of view in the real space, to:
extract from the images, a three-dimensional (3D) point cloud of points captured in the images, the points corresponding to features in the real space; and
second processing of a second set of one or more images selected from a plurality of sequences of images received from a second plurality of cameras positioned at locations <(xunk, yunk, zunk), . . . > comprising a camera installation in the real space; wherein images in the plurality of sequences of images received from a second plurality of cameras have respective fields of view in the real space and wherein the three-dimensional (3D) point cloud is aligned to a coordinate system (x0, y0, z0) of the camera installation in the real space, to:
match, using a trained neural network classifier, a set of two-dimensional (2D) images of the second set of one or more images selected from a plurality of sequences of images received from a second plurality of cameras to corresponding portions of the three-dimensional (3D) point cloud;
determine, from differences in position of at least three points <(x1, y1, z1), (x2, y2, z2), (x3, y3, z3)> in a matched two-dimensional (2D) image and a corresponding portion of the three-dimensional (3D) point cloud, transformation information between the matched two-dimensional (2D) image and the corresponding portion of the three-dimensional (3D) point cloud; and
apply the transformation information to image information from at least one of the second plurality of cameras to calibrate at least one physical camera at location (xunk, yunk, zunk) to the coordinate system (x0, y0, z0).
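
The calibration recited in claim 1 amounts to recovering each installed camera's pose in the coordinate system (x0, y0, z0) of the point cloud from matched 2D-3D point correspondences, then applying that pose to the camera's image information. The following is a minimal sketch of the transformation-estimation and calibration steps only, assuming Python with NumPy and OpenCV, known camera intrinsics, correspondences already produced by the 2D-to-3D matching step, and a helper name (calibrate_camera_extrinsics) chosen here for illustration; it is a hypothetical illustration under those assumptions, not the patented implementation.

import numpy as np
import cv2


def calibrate_camera_extrinsics(points_3d, points_2d, camera_matrix, dist_coeffs=None):
    """Recover one installed camera's pose in the point-cloud coordinate system.

    points_3d: (N, 3) array of point-cloud coordinates in the (x0, y0, z0) frame,
               assumed to come from the 2D-to-3D matching step.
    points_2d: (N, 2) array of the corresponding pixel coordinates in an image
               from the camera being calibrated (located at the unknown (xunk, yunk, zunk)).
    camera_matrix: (3, 3) intrinsic matrix of that camera (assumed already known).
    dist_coeffs: optional lens-distortion coefficients; zeros if omitted.
    """
    if dist_coeffs is None:
        dist_coeffs = np.zeros(5)  # assume negligible lens distortion for this sketch
    if len(points_3d) < 4:
        # the claim recites at least three correspondences; OpenCV's iterative
        # PnP solver generally wants four or more, so this sketch requires four
        raise ValueError("need at least 4 2D-3D correspondences for this sketch")

    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(points_3d, dtype=np.float64),
        np.asarray(points_2d, dtype=np.float64),
        np.asarray(camera_matrix, dtype=np.float64),
        np.asarray(dist_coeffs, dtype=np.float64),
        flags=cv2.SOLVEPNP_ITERATIVE,
    )
    if not ok:
        raise RuntimeError("PnP failed on the supplied correspondences")

    rotation, _ = cv2.Rodrigues(rvec)              # point-cloud frame -> camera frame
    translation = tvec.reshape(3)
    camera_position = -rotation.T @ translation    # estimate of (xunk, yunk, zunk) in (x0, y0, z0)
    return rotation, translation, camera_position

The returned rotation and translation are one concrete form of the "transformation information" recited in the claim: applied to image information from the camera, for example by back-projecting detections into the real space, they express that camera's measurements in the (x0, y0, z0) coordinate system of the camera installation.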