CPC G06T 19/20 (2013.01) [G06N 3/04 (2013.01); G06T 7/75 (2017.01); G06T 2207/20084 (2013.01); G06T 2219/2004 (2013.01); G06T 2219/2016 (2013.01)] | 18 Claims |
1. A method for processing a video comprising m frames of images, m being a positive integer greater than or equal to 2, the method comprising:
placing a three-dimensional (3D) model on a target plane of a first frame of the video, a plurality of feature points of a model surface of the 3D model falling on the target plane;
determining a pose of a camera coordinate system of the first frame of the video relative to a world coordinate system;
for a jth frame selected from a second frame of the video to an mth frame of the video, j being an integer greater than 1 and less than or equal to m:
determining a current homography matrix, as a homography matrix of a target plane on the jth frame of the video relative to the target plane on the first frame of the video by performing:
in response to j being greater than 2: optimizing a homography matrix of a target plane on the (j−1)th frame of the video according to: a residual between a pixel value of each pixel of the (j−1)th frame and a corresponding pixel value of each pixel of a (j−2)th frame, to obtain the current homography matrix;
determining, according to the current homography matrix, pixel coordinates of the plurality of feature points of the model surface on the jth frame of the video;
determining, according to a camera intrinsic parameter of the video, and the pixel coordinates of the plurality of feature points of the model surface on the jth frame of the video, a pose of a camera coordinate system of the jth frame of the video relative to a world coordinate system; and
replacing the 3D model with a target model and placing the target model on the world coordinate system to generate, according to the pose of the camera coordinate system of each frame of the video relative to the world coordinate system, a target video comprising the target model.
|
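As an illustration of the two determining steps recited in claim 1 (mapping the feature points onto the jth frame with the current homography matrix, then recovering the camera pose from those pixel coordinates and the camera intrinsic parameter), the following is a minimal NumPy sketch. It assumes the world coordinate system is chosen so that the target plane is Z = 0, and it uses a direct linear transform followed by a homography decomposition, which is one common planar pose technique and not necessarily the one used in the patent. The function names `apply_homography`, `fit_homography`, and `planar_pose` are hypothetical, and the photometric optimization of the homography is not covered.

```python
import numpy as np

def apply_homography(H, pts):
    """Map Nx2 pixel coordinates through a 3x3 homography matrix."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coordinates
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def fit_homography(src, dst):
    """Direct linear transform: homography taking src (Nx2) to dst (Nx2), N >= 4."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def planar_pose(K, plane_pts, pixel_pts):
    """Estimate (R, t) with x_cam = R @ X_world + t.

    Assumption: plane_pts are (X, Y) coordinates of feature points lying on
    the world plane Z = 0, and pixel_pts are their pixel coordinates.
    """
    H = fit_homography(plane_pts, pixel_pts)
    M = np.linalg.inv(K) @ H  # proportional to [r1 r2 t]
    M = M * (2.0 / (np.linalg.norm(M[:, 0]) + np.linalg.norm(M[:, 1])))
    if M[2, 2] < 0:  # keep the plane origin in front of the camera
        M = -M
    r1, r2, t = M[:, 0], M[:, 1], M[:, 2]
    # Project [r1 r2 r1xr2] onto the nearest rotation matrix.
    U, _, Vt = np.linalg.svd(np.column_stack([r1, r2, np.cross(r1, r2)]))
    return U @ Vt, t
```

In this sketch, the pixel coordinates on the jth frame would come from `apply_homography(H_j, first_frame_pixels)`, and `planar_pose` would then be called once per frame to obtain the per-frame pose used when rendering the target model.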