US 12,141,916 B2
Markerless motion capture of hands with multiple pose estimation engines
Colin Joseph Brown, Montreal (CA); Wenxin Zhang, Montreal (CA); and Dalei Wang, Montreal (CA)
Assigned to Hinge Health, Inc., San Francisco, CA (US)
Appl. No. 17/906,854
Filed by Hinge Health, Inc., San Francisco, CA (US)
PCT Filed Mar. 20, 2020, PCT No. PCT/IB2020/052600 § 371(c)(1), (2) Date Sep. 20, 2022, PCT Pub. No. WO2021/186222, PCT Pub. Date Sep. 23, 2021.
Prior Publication US 2023/0141494 A1, May 11, 2023
Int. Cl. G06T 17/00 (2006.01); G06T 7/70 (2017.01); G06T 11/00 (2006.01); G06V 10/25 (2022.01)

CPC G06T 17/00 (2013.01) [G06T 7/70 (2017.01); G06T 11/00 (2013.01); G06V 10/25 (2022.01); G06T 2207/20016 (2013.01)]

18 Claims

1. An apparatus for generating a three-dimensional (3D) skeleton with coarse and fine grain regions, the apparatus comprising:

a camera configured to capture an image of a subject;

a first pose estimation engine configured to:

receive the image,

generate a coarse skeleton of the image using a first convolutional neural network that infers body joint positions of the subject, and

identify a region of the image based on the coarse skeleton;

a second pose estimation engine configured to:

receive the region of the image, and

generate a fine skeleton of the region of the image using a second convolutional neural network that infers hand joint positions of a hand of the subject;

an attachment engine to generate a whole skeleton, wherein the whole skeleton includes the fine skeleton attached to the coarse skeleton; and

a communications interface to transmit the whole skeleton to an aggregator, wherein the aggregator is to generate a three-dimensional skeleton based on the whole skeleton and additional data.
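
The data flow recited in claim 1 (coarse skeleton, region identification, fine skeleton, attachment) can be illustrated with a minimal sketch. The claim does not specify how the region is chosen or how the fine skeleton is attached, so everything below is an assumption made for illustration: the joint names, the square crop sized by forearm length, the `hand_wrist` root joint, and the translation-only alignment are all hypothetical, and the two convolutional neural networks are represented only by their outputs.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]   # (x0, y0, x1, y1) in image pixels
Skeleton = Dict[str, Point]               # joint name -> (x, y)


def identify_hand_region(coarse: Skeleton, wrist: str, elbow: str,
                         scale: float = 1.0) -> Box:
    """Pick a square image region around the wrist of the coarse skeleton.

    Assumption (not from the claim): the half-width of the square is the
    forearm length (elbow to wrist) multiplied by `scale`.
    """
    wx, wy = coarse[wrist]
    ex, ey = coarse[elbow]
    half = scale * ((wx - ex) ** 2 + (wy - ey) ** 2) ** 0.5
    return (wx - half, wy - half, wx + half, wy + half)


def attach(coarse: Skeleton, fine: Skeleton, region: Box, wrist: str,
           fine_root: str = "hand_wrist") -> Skeleton:
    """Build a whole skeleton by attaching the fine skeleton to the coarse one.

    Assumption (not from the claim): `fine` holds coordinates relative to the
    top-left corner of `region`, and attachment is a pure translation that
    makes the fine skeleton's root joint coincide with the coarse wrist.
    """
    x0, y0, _, _ = region
    # Map fine joints from region coordinates back to full-image coordinates.
    in_image = {name: (x + x0, y + y0) for name, (x, y) in fine.items()}

    # Offset that moves the fine root joint onto the coarse wrist joint.
    rx, ry = in_image[fine_root]
    wx, wy = coarse[wrist]
    dx, dy = wx - rx, wy - ry

    whole = dict(coarse)  # keep every coarse joint unchanged
    for name, (x, y) in in_image.items():
        if name != fine_root:  # the coarse wrist already represents the root
            whole[name] = (x + dx, y + dy)
    return whole
```

In this sketch the coarse wrist is treated as authoritative and the hand joints are shifted to meet it; a real attachment engine could equally rescale or rotate the fine skeleton, which the claim leaves open.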