| CPC G06T 17/00 (2013.01) [G06T 13/40 (2013.01); G06T 2207/20044 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30196 (2013.01)] | 10 Claims |

|
1. A method for generating a three-dimensional (3D) global pose from an image, the method being implemented using a system that includes a computing device and a camera which is configured to capture the image, and comprising:
a) by the computing device, receiving the image from the camera, and performing a detection operation to detect a human body in the image;
b) by the computing device, using a first neural network, based on the image in which the human body has been detected, to obtain a two-dimensional (2D) heatmap that is related to a skeleton structure of the human body and that includes a plurality of human keypoints corresponding to a plurality of joints of the human body, and performing a regression operation on the 2D heatmap to obtain a plurality of 2D coordinate sets each associated with one of the joints, and indicating a position of a corresponding one of the human keypoints in a 2D coordinate system of the 2D heatmap;
c) by the computing device, using a second neural network to perform a 3D human pose estimation operation on the plurality of 2D coordinate sets, so as to obtain a 3D human pose that is related to the skeleton structure of the human body in a local coordinate system, and that includes a plurality of 3D keypoints corresponding to the plurality of human keypoints, respectively; and
d) by the computing device based on the 3D human pose, using a numerical optimization solver to generate the 3D global pose in a world coordinate system, the numerical optimization solver being the Ceres solver,
wherein step d) includes
d-1) generating an initial guess shape for the 3D global pose by using a Levenberg-Marquardt (LM) algorithm and a dense QR factorization to obtain plural sets of coordinates in the world coordinate system for the plurality of 3D keypoints, respectively, the initial guess shape including the plural sets of coordinates,
d-2) performing an iterative fitting procedure on the initial guess shape with reference to the plurality of 2D coordinate sets using the numerical optimization solver, so as to generate an updated shape that includes a plurality of 3D coordinate sets which respectively correspond to the joints of the human body,
d-3) calculating an error associated with the updated shape, the calculation being implemented based on a difference between projections of the plurality of 3D coordinate sets of the updated shape onto the 2D coordinate system and the plurality of 2D coordinate sets, a relationship between each foot of the updated shape and a floor in the world coordinate system, and an acceleration associated with each of the joints of the updated shape relative to a previous 3D global pose generated with respect to a previous image,
d-4) determining whether the error indicates that the updated shape is an optimized shape,
d-5) when it is determined that the error indicates that the updated shape is the optimized shape, using the updated shape as the 3D global pose, and
when it is determined that the error indicates that the updated shape is not the optimized shape, using the updated shape as the initial guess shape, and repeating sub-steps d-2) to d-5).
|
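
For illustration only, the error of sub-step d-3) and the loop of sub-steps d-2) to d-5) might be sketched as follows. This is a minimal sketch under stated assumptions and not the claimed implementation: the claim does not give the projection model, the form of the foot-to-floor relationship, the acceleration formula, the term weights, or the stopping test. The sketch assumes a pinhole camera whose frame coincides with the world coordinate system, a horizontal floor at height `floor_y`, a second-order finite difference over two earlier poses for the acceleration, and a simple error threshold. All names (`project`, `pose_error`, `fit_until_optimized`, `refine`, `w_floor`, `w_accel`, `tol`) are hypothetical, and `refine` stands in for one pass of the numerical optimization solver.

```python
from typing import Callable, List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def project(p: Point3, focal: float, cx: float, cy: float) -> Point2:
    """Pinhole projection of a 3D point onto the image plane (assumed model)."""
    x, y, z = p
    return (focal * x / z + cx, focal * y / z + cy)


def pose_error(
    shape: Sequence[Point3],
    keypoints_2d: Sequence[Point2],
    prev: Sequence[Point3],
    prev2: Sequence[Point3],
    foot_indices: Sequence[int],
    focal: float = 1000.0,
    cx: float = 0.0,
    cy: float = 0.0,
    floor_y: float = 0.0,
    w_floor: float = 1.0,
    w_accel: float = 1.0,
) -> float:
    """Sum of the three error terms named in sub-step d-3) (assumed weighting)."""
    # Term 1: squared distance between projected 3D joints and the 2D keypoints.
    reproj = 0.0
    for p, (u, v) in zip(shape, keypoints_2d):
        pu, pv = project(p, focal, cx, cy)
        reproj += (pu - u) ** 2 + (pv - v) ** 2

    # Term 2: squared height of each foot joint above or below the assumed floor.
    floor = sum((shape[i][1] - floor_y) ** 2 for i in foot_indices)

    # Term 3: squared second-order finite difference per joint (assumed
    # acceleration estimate using the two preceding poses).
    accel = 0.0
    for p, q, r in zip(shape, prev, prev2):
        accel += sum((a - 2.0 * b + c) ** 2 for a, b, c in zip(p, q, r))

    return reproj + w_floor * floor + w_accel * accel


def fit_until_optimized(
    initial: Sequence[Point3],
    refine: Callable[[List[Point3]], List[Point3]],
    error_fn: Callable[[List[Point3]], float],
    tol: float = 1e-6,
    max_iters: int = 50,
) -> List[Point3]:
    """Repeat fitting until the error passes the (assumed) threshold test."""
    shape = list(initial)
    for _ in range(max_iters):
        shape = refine(shape)        # d-2) produce an updated shape
        if error_fn(shape) <= tol:   # d-3), d-4) compute and test the error
            break                    # d-5) accept the updated shape
        # otherwise the updated shape serves as the next initial guess
    return shape
```

In this sketch the accepted shape is returned as the 3D global pose; `max_iters` is an added safeguard that the claim does not mention.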