US 12,175,618 B2
Image processing method and apparatus, electronic device, and computer-readable storage medium
Yuanli Zheng, Shenzhen (CN); Zhaopeng Gu, Shenzhen (CN); and Nianhua Xie, Shenzhen (CN)
Assigned to TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed by Tencent Technology (Shenzhen) Company Limited, Shenzhen (CN)
Filed on Nov. 14, 2022, as Appl. No. 17/986,322.
Application 17/986,322 is a continuation of application No. 17/185,393, filed on Feb. 25, 2021, granted, now 11,538,229.
Application 17/185,393 is a continuation of application No. PCT/CN2020/111638, filed on Aug. 27, 2020.
Claims priority of application No. 201910854877.X (CN), filed on Sep. 10, 2019.
Prior Publication US 2023/0075270 A1, Mar. 9, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06T 19/20 (2011.01); G06N 3/04 (2023.01); G06T 7/73 (2017.01)
CPC G06T 19/20 (2013.01) [G06N 3/04 (2013.01); G06T 7/75 (2017.01); G06T 2207/20084 (2013.01); G06T 2219/2004 (2013.01); G06T 2219/2016 (2013.01)]
18 Claims
1. A method for processing a video comprising m frames of images, m being a positive integer greater than or equal to 2, the method comprising:
placing a three-dimensional (3D) model on a target plane of a first frame of the video, a plurality of feature points of a model surface of the 3D model falling on the target plane;
determining a pose of a camera coordinate system of the first frame of the video relative to a world coordinate system;
for a jth frame selected from a second frame of the video to an mth frame of the video, j being an integer greater than 1 and less than or equal to m:
determining a current homography matrix, as a homography matrix of a target plane on the jth frame of the video relative to the target plane on the first frame of the video by performing:
in response to j being greater than 2: optimizing a homography matrix of a target plane on the (j−1)th frame of the video according to: a residual between a pixel value of each pixel of the (j−1)th frame and a corresponding pixel value of each pixel of a (j−2)th frame, to obtain the current homography matrix;
determining, according to the current homography matrix, pixel coordinates of the plurality of feature points of the model surface on the jth frame of the video;
determining, according to a camera intrinsic parameter of the video, and the pixel coordinates of the plurality of feature points of the model surface on the jth frame of the video, a pose of a camera coordinate system of the jth frame of the video relative to a world coordinate system; and
replacing the 3D model with a target model and placing the target model on the world coordinate system to generate, according to the pose of the camera coordinate system of each frame of the video relative to the world coordinate system, a target video comprising the target model.
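The geometric steps of claim 1 (carrying the feature points from the first frame to the jth frame through the current homography matrix, then determining the camera pose from the camera intrinsic parameter and those pixel coordinates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the world coordinate system is assumed to have its Z = 0 plane on the target plane, the pose is expressed as the rotation R and translation t that map world coordinates into camera coordinates, and the pose is recovered by a direct-linear-transform homography fit followed by a planar decomposition, which is a stand-in for whatever solver the patent uses. The pixel-residual optimization that produces the current homography matrix is not shown; the homography here is synthesized from two known poses.

```python
import numpy as np

def apply_homography(H, pts):
    """Map Nx2 pixel coordinates through a 3x3 homography."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def fit_homography(src, dst):
    """Direct linear transform: homography taking src (Nx2) to dst (Nx2), N >= 4."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)  # null vector of the stacked constraints

def planar_pose(K, plane_xy, pixels):
    """Estimate (R, t) mapping world points on the plane Z = 0 into the camera frame."""
    G = np.linalg.inv(K) @ fit_homography(plane_xy, pixels)  # proportional to [r1 r2 t]
    G = G * 2.0 / (np.linalg.norm(G[:, 0]) + np.linalg.norm(G[:, 1]))
    if G[2, 2] < 0:  # assume the plane origin lies in front of the camera
        G = -G
    r1, r2, t = G[:, 0], G[:, 1], G[:, 2]
    # Snap [r1 r2 r1 x r2] to the nearest rotation matrix.
    u, _, vt = np.linalg.svd(np.column_stack([r1, r2, np.cross(r1, r2)]))
    return u @ vt, t

def project(K, R, t, plane_xy):
    """Pinhole projection of plane points (X, Y, 0) into pixel coordinates."""
    world = np.hstack([plane_xy, np.zeros((len(plane_xy), 1))])
    cam = world @ R.T + t
    pix = cam @ K.T
    return pix[:, :2] / pix[:, 2:3]

def rot_x(angle):
    """Rotation about the camera x axis (used only to build test poses)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# Hypothetical intrinsics and feature points (corners of the model surface on the plane).
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
plane_xy = np.array([[-0.5, -0.5], [0.5, -0.5], [0.5, 0.5], [-0.5, 0.5]])

# Synthetic ground-truth poses for the first frame and the jth frame.
R_1, t_1 = rot_x(0.30), np.array([0.10, -0.05, 2.0])
R_j, t_j = rot_x(0.35), np.array([0.15, -0.05, 2.1])
pixels_1 = project(K, R_1, t_1, plane_xy)

# Stand-in for the current homography matrix (frame 1 -> frame j); in the claim
# it comes from image alignment, here it is fitted to synthetic correspondences.
H_1j = fit_homography(pixels_1, project(K, R_j, t_j, plane_xy))

# Claimed steps: feature-point pixel coordinates on frame j, then the frame-j pose.
pixels_j = apply_homography(H_1j, pixels_1)
R_est, t_est = planar_pose(K, plane_xy, pixels_j)
```

With noise-free synthetic data, R_est and t_est should agree with R_j and t_j up to numerical precision. With real, noisy correspondences a perspective-n-point solver with nonlinear refinement would normally be preferred over this closed-form decomposition.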