US 12,106,446 B2
System and method of image stitching using robust camera pose estimation
Dehong Liu, Lexington, MA (US); and Laixi Shi, Pittsburgh, PA (US)
Assigned to Mitsubishi Electric Research Laboratories, Inc., Cambridge, MA (US)
Filed by Mitsubishi Electric Research Laboratories, Inc., Cambridge, MA (US)
Filed on Mar. 27, 2021, as Appl. No. 17/214,813.
Prior Publication US 2022/0318948 A1, Oct. 6, 2022
Int. Cl. G06T 3/4038 (2024.01); G06T 3/4007 (2024.01); G06T 3/4084 (2024.01); G06T 7/70 (2017.01)

CPC G06T 3/4038 (2013.01) [G06T 3/4007 (2013.01); G06T 3/4084 (2013.01); G06T 7/70 (2017.01); G06T 2207/30244 (2013.01)]

16 Claims

1. An image forming system for constructing a whole image of an object comprising:

an interface configured to acquire sequential images of partial areas of the object captured by a camera, wherein two images of the sequential images of the partial areas of the object include overlap portions, wherein the sequential images correspond to a three dimensional (3D) surface of the object, wherein geometrical information of the object is provided, wherein an initial pose of the camera is provided;

a memory configured to store computer-executable programs including a pose estimation method and a stitching method; and

an image stitching processor configured to perform steps of:

estimating relative camera poses of each image pair of the sequential images by solving a perspective-n-point (P-n-P) problem if there is an overlap between the image pair;

forming i.) a relative camera pose matrix based on the estimated relative camera poses of each image pair and ii.) a sparse pose estimation error matrix based on the relative camera pose matrix;

estimating camera poses with respect to the sequential images based on the estimated relative camera poses by solving an optimization problem that minimizes a term including a Frobenius norm of a difference between the relative camera pose matrix and the sparse pose estimation error matrix;

projecting the sequential images into a 3D surface of the object based on the estimated camera poses with respect to the sequential images;

constructing the whole image of the object as a two-dimensional (2D) image by interpolating the projected sequential images; and

outputting the constructed 2D image via the interface.
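
The pose estimation step recited above (a relative pose matrix, a sparse error matrix, and a Frobenius-norm objective) can be illustrated with a toy sketch. It is not the claimed method. It assumes scalar camera positions on a line instead of full 3D poses, assumes every image pair overlaps, and adds an L1 penalty with weight lam to keep the error matrix sparse; the function names, the penalty, and the alternating scheme are all hypothetical choices made for illustration.

```python
def soft_threshold(value, tau):
    """Shrink value toward zero by tau (proximal operator of the L1 norm)."""
    if value > tau:
        return value - tau
    if value < -tau:
        return value + tau
    return 0.0


def estimate_poses(rel, x0=0.0, lam=0.5, iters=50):
    """Toy robust pose estimation from pairwise relative measurements.

    rel[i][j] is the measured position of camera j relative to camera i.
    Alternately minimizes, over poses x and error matrix err,
        0.5 * sum_ij (rel[i][j] - err[i][j] - (x[j] - x[i]))**2
        + lam * sum_ij |err[i][j]|
    which is a squared Frobenius norm plus a sparsity penalty (assumed here).
    """
    n = len(rel)
    err = [[0.0] * n for _ in range(n)]
    x = [x0] * n
    for _ in range(iters):
        # Pose step: closed-form least squares on the error-corrected matrix.
        clean = [[rel[i][j] - err[i][j] for j in range(n)] for i in range(n)]
        centered = [
            (sum(clean[i][k] for i in range(n))
             - sum(clean[k][j] for j in range(n))) / (2 * n)
            for k in range(n)
        ]
        # Anchor the solution to the provided initial pose of camera 0.
        x = [c - centered[0] + x0 for c in centered]
        # Error step: soft-threshold the residuals so only large ones survive.
        err = [
            [soft_threshold(rel[i][j] - (x[j] - x[i]), lam) for j in range(n)]
            for i in range(n)
        ]
    return x, err
```

With five cameras at positions 0 through 4 and a single relative measurement corrupted by a large offset, this sketch assigns the corruption to one entry of err and recovers positions close to the true ones, whereas a plain least-squares fit spreads the outlier across two of the estimated poses. The remaining small bias comes from the L1 shrinkage and scales with lam.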