US 11,989,350 B2
Hand key point recognition model training method, hand key point recognition method and device
Yang Yi, Shenzhen (CN); Shijie Zhao, Shenzhen (CN); Feng Li, Shenzhen (CN); and Xiaoxiang Zuo, Shenzhen (CN)
Assigned to TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed by TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed on Aug. 24, 2020, as Appl. No. 17/000,844.
Application 17/000,844 is a continuation of application No. PCT/CN2019/090542, filed on Jun. 10, 2019.
Claims priority of application No. 201810752953.1 (CN), filed on Jul. 10, 2018.
Prior Publication US 2020/0387698 A1, Dec. 10, 2020
Int. Cl. G06F 3/01 (2006.01); G06F 18/214 (2023.01); G06N 3/08 (2023.01); G06T 7/143 (2017.01); G06T 7/174 (2017.01); G06V 10/44 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 40/10 (2022.01); G06V 40/20 (2022.01)

CPC G06F 3/017 (2013.01) [G06F 18/214 (2023.01); G06N 3/08 (2013.01); G06T 7/143 (2017.01); G06T 7/174 (2017.01); G06V 10/454 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 40/107 (2022.01); G06V 40/28 (2022.01)]

20 Claims

1. A hand key-point recognition model training method for a model training device, comprising:

converting a sample virtual image into an emulation image by using a Cycle-GAN model, the sample virtual image being an image generated through three-dimensional modeling, the sample virtual image comprising key-point coordinates corresponding to hand key-points, and the emulation image being used for emulating an image acquired in a real scenario;

extracting a hand image in the emulation image; and

training a hand key-point recognition model according to the hand image in the emulation image and the key-point coordinates, the hand key-point recognition model being used for outputting hand key-point coordinates of a hand in a real image according to the inputted real image;

wherein the training the hand key-point recognition model according to the hand image in the emulation image and the key-point coordinates comprises:

constructing the hand key-point recognition model, the hand key-point recognition model comprising a two-dimensional recognition branch and a three-dimensional recognition branch, the two-dimensional recognition branch comprising a plurality of two-dimensional residual layers and a convolution layer, and the three-dimensional recognition branch comprising a plurality of three-dimensional residual layers and a fully connected layer;

calculating a two-dimensional recognition loss and a three-dimensional recognition loss of the hand key-point recognition model according to the hand image and the key-point coordinates; and

reversely training the hand key-point recognition model according to the two-dimensional recognition loss and the three-dimensional recognition loss.
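
The loss-calculation step of claim 1 can be illustrated with a minimal sketch. The claim does not state how either loss is defined, so everything here is an assumption made for illustration: both losses are taken as the mean squared Euclidean distance between predicted and labelled key-point coordinates, the two are combined as a weighted sum, and the names `mean_squared_distance`, `combined_recognition_loss`, and `weight_3d` are hypothetical. "Reversely training" is read here as back-propagating such a combined loss through both branches, which the sketch does not implement.

```python
from typing import Sequence, Tuple

Point = Sequence[float]  # (x, y) for the 2D branch, (x, y, z) for the 3D branch


def mean_squared_distance(pred: Sequence[Point], target: Sequence[Point]) -> float:
    """Mean over key-points of the squared Euclidean distance (assumed loss form)."""
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(pred, target):
        if len(p) != len(t):
            raise ValueError("key-point dimensionality mismatch")
        total += sum((a - b) ** 2 for a, b in zip(p, t))
    return total / len(pred)


def combined_recognition_loss(
    pred_2d: Sequence[Point],
    label_2d: Sequence[Point],
    pred_3d: Sequence[Point],
    label_3d: Sequence[Point],
    weight_3d: float = 1.0,  # hypothetical balancing factor, not from the claim
) -> Tuple[float, float, float]:
    """Return (2D loss, 3D loss, weighted sum) for one training sample."""
    loss_2d = mean_squared_distance(pred_2d, label_2d)
    loss_3d = mean_squared_distance(pred_3d, label_3d)
    return loss_2d, loss_3d, loss_2d + weight_3d * loss_3d


if __name__ == "__main__":
    # Two key-points only, to keep the arithmetic easy to follow.
    losses = combined_recognition_loss(
        pred_2d=[(0.0, 0.0), (1.0, 1.0)],
        label_2d=[(0.0, 0.0), (1.0, 3.0)],
        pred_3d=[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
        label_3d=[(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],
    )
    print(losses)
```

In this sketch the labels would be the key-point coordinates carried by the sample virtual image, which is why no manual annotation of the emulation image is needed. In an actual training loop the combined value would be handed to an automatic-differentiation framework to update both branches.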