US 12,272,097 B1
Camera pose estimation method and system, electronic equipment and readable medium
Shuyuan Lin, Guangzhou (CN); Xiaocheng Lin, St. Paul, MN (US); Feiran Huang, Guangzhou (CN); and Tingrong Zhi, Guangzhou (CN)
Assigned to Jinan University, Guangzhou (CN); and Macalester College, St. Paul, MN (US)
Filed by Jinan University, Guangzhou (CN); and Macalester College, St. Paul, MN (US)
Filed on Oct. 10, 2024, as Appl. No. 18/911,702.
Claims priority of application No. 202311472686.X (CN), filed on Nov. 7, 2023.
Int. Cl. G06T 7/73 (2017.01); G06T 7/33 (2017.01)
CPC G06T 7/74 (2017.01) [G06T 7/337 (2017.01); G06T 2207/20076 (2013.01); G06T 2207/30244 (2013.01)]
10 Claims
 
1. A camera pose estimation method, comprising:
acquiring an initial matching set between a first image and a second image, wherein the first image and the second image are images from different angles for a same scene;
performing a mismatch removal operation on the initial matching set based on an optimization network to obtain an optimized matching set, wherein the optimization network is constructed based on a multi-stage geometric semantic attention network;
the multi-stage geometric semantic attention network comprises a plurality of stage networks sequentially connected in series;
adjacent stage networks are connected to each other through a geometric transformation consistency processor and a geometric semantic attention processor;
the stages comprise a first multi-branch processor, a sequential perception filtering processor, and a second multi-branch processor that are sequentially connected in series;
the first multi-branch processor and the second multi-branch processor are constructed using a multi-branch structure;
the multi-branch structure comprises a first transformation processor MBSE and a second transformation processor MBMS;
the geometric transformation consistency processor is configured to extract geometric transformation consistency information of output data of a previous stage network; and
the geometric semantic attention network is configured to acquire geometric semantic neighbor information of input data of a next stage network; and
acquiring a camera pose result based on the optimized matching set.
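
The series wiring recited in claim 1 can be illustrated with a minimal data-flow sketch. This is not the patented implementation: every name below (`make_stage`, `run_stages`, `tracer`, `trace_two_stage_network`) is hypothetical, the processors are pass-through placeholders that only record the order in which they run, and applying the consistency processor before the attention processor between adjacent stages is an assumption, since the claim states only that both processors connect adjacent stage networks.

```python
from typing import Callable, List, Sequence

# A processor maps match data to match data. Real processors would be
# learned network blocks; here they are placeholders (assumption).
Processor = Callable[[list], list]


def make_stage(first_branch: Processor, seq_filter: Processor,
               second_branch: Processor) -> Processor:
    """Build one stage network: first multi-branch processor, sequential
    perception filtering processor, second multi-branch processor, in series."""
    def stage(data: list) -> list:
        return second_branch(seq_filter(first_branch(data)))
    return stage


def run_stages(data: list, stages: Sequence[Processor],
               consistency: Processor, attention: Processor) -> list:
    """Run the stage networks in series, linking adjacent stages through
    the two connecting processors."""
    for index, stage in enumerate(stages):
        if index > 0:
            # Link between adjacent stages. The order (consistency on the
            # previous output, then attention for the next input) is assumed.
            data = attention(consistency(data))
        data = stage(data)
    return data


def tracer(name: str, log: List[str]) -> Processor:
    """Placeholder processor: records its name and returns data unchanged."""
    def run(data: list) -> list:
        log.append(name)
        return data
    return run


def trace_two_stage_network() -> List[str]:
    """Return the order in which processors run for a two-stage network."""
    log: List[str] = []

    def new_stage() -> Processor:
        return make_stage(tracer("multi_branch_1", log),
                          tracer("sequential_filter", log),
                          tracer("multi_branch_2", log))

    run_stages([], [new_stage(), new_stage()],
               tracer("geometric_consistency", log),
               tracer("semantic_attention", log))
    return log


if __name__ == "__main__":
    for step in trace_two_stage_network():
        print(step)
```

Calling `trace_two_stage_network()` lists the three processors of the first stage, then the two linking processors, then the three processors of the second stage. The internals of the MBSE and MBMS branches, and the step that derives a camera pose from the optimized matching set, are outside this sketch.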