US 12,141,916 B2
Markerless motion capture of hands with multiple pose estimation engines
Colin Joseph Brown, Montreal (CA); Wenxin Zhang, Montreal (CA); and Dalei Wang, Montreal (CA)
Assigned to Hinge Health, Inc., San Francisco, CA (US)
Appl. No. 17/906,854
Filed by Hinge Health, Inc., San Francisco, CA (US)
PCT Filed Mar. 20, 2020, PCT No. PCT/IB2020/052600
§ 371(c)(1), (2) Date Sep. 20, 2022,
PCT Pub. No. WO2021/186222, PCT Pub. Date Sep. 23, 2021.
Prior Publication US 2023/0141494 A1, May 11, 2023
Int. Cl. G06T 17/00 (2006.01); G06T 7/70 (2017.01); G06T 11/00 (2006.01); G06V 10/25 (2022.01)
CPC G06T 17/00 (2013.01) [G06T 7/70 (2017.01); G06T 11/00 (2013.01); G06V 10/25 (2022.01); G06T 2207/20016 (2013.01)]
18 Claims
 
1. An apparatus for generating a three-dimensional (3D) skeleton with coarse and fine grain regions, the apparatus comprising:
a camera configured to capture an image of a subject;
a first pose estimation engine configured to:
receive the image,
generate a coarse skeleton of the image using a first convolutional neural network that infers body joint positions of the subject, and
identify a region of the image based on the coarse skeleton;
a second pose estimation engine configured to:
receive the region of the image, and
generate a fine skeleton of the region of the image using a second convolutional neural network that infers hand joint positions of a hand of the subject;
an attachment engine to generate a whole skeleton, wherein the whole skeleton includes the fine skeleton attached to the coarse skeleton; and
a communications interface to transmit the whole skeleton to an aggregator, wherein the aggregator is to generate a three-dimensional skeleton based on the whole skeleton and additional data.
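
The data flow recited in claim 1 (coarse skeleton, region identification, fine skeleton, attachment) can be illustrated with a minimal Python sketch. It is not the patented implementation. Both convolutional neural networks are omitted, and their outputs are supplied as plain dictionaries. The names `hand_region` and `attach`, the joint labels, the forearm-based sizing heuristic, and the choice to keep the coarse wrist as the shared joint are all assumptions made for illustration; the claim does not specify any of them.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in image pixels


def hand_region(coarse: Dict[str, Point], side: str,
                image_size: Tuple[int, int], scale: float = 1.0) -> Box:
    """Identify a square image region around one hand from the coarse skeleton.

    Assumed heuristic (not from the claim): the hand lies just beyond the
    wrist along the elbow-to-wrist direction, and the box side equals
    scale * forearm length.
    """
    ex, ey = coarse[f"{side}_elbow"]
    wx, wy = coarse[f"{side}_wrist"]
    dx, dy = wx - ex, wy - ey
    forearm = (dx * dx + dy * dy) ** 0.5
    # Centre of the box: half a box-length past the wrist.
    cx, cy = wx + 0.5 * scale * dx, wy + 0.5 * scale * dy
    half = 0.5 * scale * forearm
    width, height = image_size
    # Clamp the box to the image bounds.
    x0 = max(0, int(round(cx - half)))
    y0 = max(0, int(round(cy - half)))
    x1 = min(width, int(round(cx + half)))
    y1 = min(height, int(round(cy + half)))
    return (x0, y0, x1, y1)


def attach(coarse: Dict[str, Point], fine: Dict[str, Point],
           box: Box, side: str) -> Dict[str, Point]:
    """Attach a fine (hand) skeleton to the coarse (body) skeleton.

    The fine joints are given in crop coordinates, so they are translated
    by the box origin into image coordinates. The fine wrist is dropped and
    the coarse wrist is kept as the shared attachment joint (an assumption).
    """
    x0, y0 = box[0], box[1]
    whole = dict(coarse)
    for name, (x, y) in fine.items():
        if name == "wrist":
            continue  # coarse wrist already represents this joint
        whole[f"{side}_hand/{name}"] = (x + x0, y + y0)
    return whole


# Stand-in for the first engine's output (body joints in image pixels).
coarse = {"left_elbow": (100.0, 100.0), "left_wrist": (100.0, 200.0)}
box = hand_region(coarse, "left", image_size=(640, 480))

# Stand-in for the second engine's output (hand joints in crop pixels).
fine = {"wrist": (50.0, 0.0), "index_tip": (60.0, 80.0)}
whole = attach(coarse, fine, box, "left")
print(box)
print(whole)
```

In this sketch the whole skeleton is a single dictionary of 2D joints. Under the claim, such a structure would then be transmitted to an aggregator, which combines it with additional data to produce the three-dimensional skeleton. That step is left out here because the claim does not state how the aggregation is performed.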