| CPC G06V 40/11 (2022.01) [G06F 3/0346 (2013.01); G06F 3/0488 (2013.01); G06V 10/25 (2022.01); G06V 10/803 (2022.01); G06V 20/20 (2022.01); G06V 20/64 (2022.01)] | 9 Claims |

|
1. A system of tracking an input sign for extended reality, comprising:
an output device;
an image capture device; and
a processor, coupled to the output device and the image capture device, wherein the processor comprises:
a first circuit, obtains an image through the image capture device;
a second circuit, detects for a handheld device and a hand in the image;
a third circuit, in response to both a first bounding box of the hand and a second bounding box of the handheld device being detected and the first bounding box being overlapped with the second bounding box, the third circuit detects at least one joint of the hand from the image;
a fourth circuit, receives a signal corresponding to a user input received by a touch screen of the handheld device, in response to a number of joints of the at least one joint being greater than a threshold, performs a data fusion of the first bounding box, the second bounding box, and the signal according to a first weight of the first bounding box to obtain the input sign, wherein the input sign is presented by at least one of the hand and the handheld device; and
a fifth circuit, outputs a command corresponding to the input sign via the output device.
|
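The gating conditions recited for the third and fourth circuits in claim 1 can be sketched as below. This is a minimal illustration, not the patented implementation: the `Box` layout, the function names, and the choice to treat edge-touching boxes as non-overlapping are all assumptions, and the claim does not define any of them. The data fusion step is omitted because the claim does not state its form beyond the use of a first weight.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True when the two boxes share a region of non-zero area.

    Assumption: boxes that only touch along an edge do not count as overlapping.
    """
    return (
        a.x_min < b.x_max
        and b.x_min < a.x_max
        and a.y_min < b.y_max
        and b.y_min < a.y_max
    )


def should_detect_joints(hand: Optional[Box], device: Optional[Box]) -> bool:
    """Third-circuit condition: both boxes were detected and they overlap."""
    return hand is not None and device is not None and boxes_overlap(hand, device)


def should_fuse(joint_count: int, threshold: int) -> bool:
    """Fourth-circuit condition: the number of detected joints exceeds the threshold."""
    return joint_count > threshold


if __name__ == "__main__":
    hand_box = Box(0, 0, 10, 10)
    device_box = Box(5, 5, 15, 15)
    print(should_detect_joints(hand_box, device_box))
    print(should_fuse(joint_count=6, threshold=5))
```

Under these assumptions, joint detection runs only when both detections exist and their boxes intersect, and fusion is attempted only when the joint count is strictly greater than the threshold, matching the "greater than" wording of the claim.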