US 11,887,396 B2
Method for identifying a hand pose in a vehicle
Hisham Cholakkal, Kerala (IN); Sanath Narayan, Bangalore (IN); Arjun Jain, Bangalore (IN); Shuaib Ahmed, Bangalore (IN); Amit Bhatkal, Dandeli (IN); Mallikarjun Byrasandra Ramalinga Reddy, Bangalore (IN); and Apurbaa Mallik, Bangalore (IN)
Assigned to MERCEDES-BENZ GROUP AG, Stuttgart (DE)
Appl. No. 17/273,521
Filed by DAIMLER AG, Stuttgart (DE)
PCT Filed Aug. 27, 2019, PCT No. PCT/EP2019/072747 § 371(c)(1), (2) Date Mar. 4, 2021, PCT Pub. No. WO2020/048814, PCT Pub. Date Mar. 12, 2020.
Claims priority of application No. 201841033282 (IN), filed on Sep. 5, 2018.
Prior Publication US 2021/0342579 A1, Nov. 4, 2021
Int. Cl. G06V 40/10 (2022.01); G06T 7/73 (2017.01); G06N 3/08 (2023.01); G06T 3/40 (2006.01); G06V 40/20 (2022.01); G06V 20/59 (2022.01); G06F 18/2413 (2023.01)

CPC G06V 40/113 (2022.01) [G06F 18/2413 (2023.01); G06N 3/08 (2013.01); G06T 3/40 (2013.01); G06T 7/74 (2017.01); G06V 20/59 (2022.01); G06V 40/107 (2022.01); G06V 40/28 (2022.01); G06T 2207/20084 (2013.01); G06T 2207/20132 (2013.01); G06T 2207/30268 (2013.01)]

12 Claims

1. A computerized method, comprising:

extracting a hand image of a hand in a vehicle image of a vehicle using a single point associated with the hand, wherein a size of the extracted hand image is fixed and predetermined, and wherein the single point is at a fixed position within the extracted hand image of a fixed and predetermined size;

obtaining a plurality of contextual images of the hand image based on the single point, wherein each of the plurality of contextual images is obtained by

selecting the single point at the fixed position in the hand image; and

cropping the hand image to a corresponding predefined size based on the single point as a center point;

processing each of the plurality of contextual images using a predefined number of layers of a neural network to obtain a plurality of contextual features associated with the hand image; and

identifying, using a classifier model, a hand pose associated with the hand based on the plurality of contextual features.
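
The extraction and cropping steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The function names, the 128-pixel hand-image size, the contextual sizes (128, 96, 64), the choice of the image center as the fixed position, and the zero-padding at image borders are all assumptions made for illustration; the claim only requires fixed, predefined sizes and a fixed position for the single point. The neural-network layers and the classifier model are not sketched here.

```python
import numpy as np


def crop_centered(image, center, size):
    """Return a size x size window of `image` centered on `center` (row, col).

    Assumption: regions falling outside the image are zero-padded so the
    output size stays fixed even when the point lies near a border.
    """
    half = size // 2
    pad_width = ((half, half), (half, half)) + ((0, 0),) * (image.ndim - 2)
    padded = np.pad(image, pad_width, mode="constant")
    row, col = center
    # After padding, the original pixel (row, col) sits at (row + half, col + half),
    # so a window starting at (row, col) places it at offset (half, half).
    return padded[row:row + size, col:col + size]


def extract_hand_image(vehicle_image, point, hand_size=128):
    """Extract a fixed-size hand image; `point` lands at (hand_size // 2, hand_size // 2)."""
    return crop_centered(vehicle_image, point, hand_size)


def contextual_images(hand_image, sizes=(128, 96, 64)):
    """Crop the hand image to each predefined size around the same fixed point."""
    center = (hand_image.shape[0] // 2, hand_image.shape[1] // 2)
    return [crop_centered(hand_image, center, s) for s in sizes]
```

Under these assumptions, each contextual crop would then be passed through the predefined number of network layers, and the resulting contextual features combined as input to the classifier model.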