US 12,217,496 B2
Hand gesture detection method and apparatus, and non-transitory computer-readable storage medium
Yang Zhou, Palo Alto, CA (US); and Jie Liu, Palo Alto, CA (US)
Assigned to GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD., Dongguan (CN)
Filed by GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD., Dongguan (CN)
Filed on May 19, 2022, as Appl. No. 17/748,907.
Application 17/748,907 is a continuation of application No. PCT/CN2020/129258, filed on Nov. 17, 2020.
Claims priority of provisional application 62/938,176, filed on Nov. 20, 2019.
Prior Publication US 2022/0277595 A1, Sep. 1, 2022
Int. Cl. G06V 10/25 (2022.01); G06T 7/50 (2017.01); G06T 7/73 (2017.01); G06V 10/82 (2022.01); G06V 40/10 (2022.01); G06V 40/20 (2022.01); G06V 10/80 (2022.01)
CPC G06V 10/82 (2022.01) [G06T 7/50 (2017.01); G06T 7/75 (2017.01); G06V 40/10 (2022.01); G06V 40/28 (2022.01); G06T 2207/10028 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20132 (2013.01); G06V 10/803 (2022.01)]
19 Claims
 
1. A hand gesture detection method, comprising:
obtaining an initial depth image comprising a hand to be detected, and performing detection processing on the initial depth image by using a backbone feature extractor and a bounding box detection model, to obtain initial bounding boxes and a first feature map corresponding to the hand to be detected;
determining a target bounding box based on the initial bounding boxes, the target bounding box being one of the initial bounding boxes;
cropping, based on the target bounding box, the first feature map by using an RoIAlign feature extractor, to obtain a second feature map corresponding to the hand to be detected; and
performing, based on the second feature map, a three-dimensional gesture estimation processing on the hand to be detected by using a gesture estimation model to obtain a gesture detection result of the hand to be detected.
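
The two middle steps of claim 1 (choosing a target bounding box and cropping the first feature map with RoIAlign) can be illustrated with a minimal sketch. It is not the patented implementation. The function names `select_target_box`, `bilinear`, and `roi_align` are hypothetical, as are the `out_size` and `samples` parameters. Picking the highest-scoring box is an assumed criterion, since the claim only requires that the target be one of the initial bounding boxes. The feature map is a single-channel list of rows, the box is assumed to be already expressed in feature-map coordinates, and `fmap[i][j]` is treated as the value at the point (x=j, y=i). The backbone feature extractor, the bounding box detection model, and the gesture estimation model are outside the scope of the sketch.

```python
import math


def select_target_box(boxes, scores):
    """Return the box with the highest score.

    Assumption: confidence-based selection. The claim only states that the
    target bounding box is one of the initial bounding boxes.
    """
    best = max(range(len(boxes)), key=lambda i: scores[i])
    return boxes[best]


def bilinear(fmap, y, x):
    """Bilinearly interpolate a 2-D feature map at a continuous point (y, x)."""
    h, w = len(fmap), len(fmap[0])
    # Clamp the sample point to the valid extent of the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(fmap, box, out_size=2, samples=2):
    """Crop box = (x1, y1, x2, y2) from fmap into an out_size x out_size grid.

    The box is split into equal bins without rounding its coordinates, and
    each bin is the average of samples x samples bilinear samples.
    """
    x1, y1, x2, y2 = box
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    out = []
    for by in range(out_size):
        row = []
        for bx in range(out_size):
            acc = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Sample points are spaced evenly inside the bin.
                    y = y1 + (by + (sy + 0.5) / samples) * bin_h
                    x = x1 + (bx + (sx + 0.5) / samples) * bin_w
                    acc += bilinear(fmap, y, x)
            row.append(acc / (samples * samples))
        out.append(row)
    return out
```

In this sketch the second feature map would be `roi_align(first_feature_map, select_target_box(boxes, scores))`, which a gesture estimation model would then consume. Because the box coordinates are never rounded to integer cells, the crop keeps sub-cell alignment with the detected hand, which is the property that distinguishes RoIAlign from quantized RoI pooling.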