| CPC G06T 7/75 (2017.01) [G06T 7/292 (2017.01); G06V 40/23 (2022.01)] | 14 Claims |

|
1. A system comprising:
one or more processors; and
logic encoded in one or more non-transitory computer-readable storage media for execution by the one or more processors and when executed operable to cause the one or more processors to perform operations comprising:
obtaining a plurality of videos of a plurality of subjects in an environment, wherein at least one target subject of the plurality of subjects performs one or more actions in the environment;
tracking the at least one target subject across at least two cameras;
determining pose information associated with the at least one target subject, wherein the determining of pose information is based on triangulation;
reconstructing a 3-dimensional (3D) model of the at least one target subject based on the plurality of videos, the tracking of the at least one target subject, and the pose information;
determining back-projected pose information from the 3D model;
converting the back-projected pose information to 2D space information; and
recognizing the one or more actions of the at least one target subject based on the reconstructing of the 3D model and the 2D space information.
|
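The triangulation and back-projection steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify a triangulation method or a camera model, so everything below is an assumption made for illustration: a pinhole camera described by a 3x4 projection matrix, a linear least-squares triangulation of a single joint from two or more views, and a simple perspective projection standing in for the conversion of the 3D estimate to 2D space information. The names `project`, `solve3`, and `triangulate` are hypothetical and do not come from the patent.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]  # assumed layout: 3x4 pinhole projection matrix


def project(P: Matrix, X: Sequence[float]) -> Tuple[float, float]:
    """Project a 3D point X into the 2D image plane of camera P."""
    xh = [sum(P[r][c] * X[c] for c in range(3)) + P[r][3] for r in range(3)]
    return xh[0] / xh[2], xh[1] / xh[2]


def solve3(M: Matrix, b: List[float]) -> List[float]:
    """Solve a 3x3 system by Gaussian elimination with partial pivoting.

    Degenerate (singular) systems are not handled in this sketch.
    """
    A = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[pivot] = A[pivot], A[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            for c in range(i, 4):
                A[r][c] -= f * A[i][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (A[i][3] - sum(A[i][c] * x[c] for c in range(i + 1, 3))) / A[i][i]
    return x


def triangulate(cams: List[Matrix], pts2d: List[Tuple[float, float]]) -> List[float]:
    """Estimate one 3D joint position from its 2D detections in several cameras.

    Each view contributes two linear equations in X, derived from
    u = (P0.X + p03) / (P2.X + p23) and the analogous relation for v.
    The stacked system is solved in the least-squares sense via the
    normal equations.
    """
    rows, rhs = [], []
    for P, (u, v) in zip(cams, pts2d):
        for coord, r in ((u, 0), (v, 1)):
            rows.append([coord * P[2][c] - P[r][c] for c in range(3)])
            rhs.append(P[r][3] - coord * P[2][3])
    AtA = [[sum(a[i] * a[j] for a in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(a[i] * y for a, y in zip(rows, rhs)) for i in range(3)]
    return solve3(AtA, Atb)


# Usage: two hypothetical cameras offset along x observe the same joint.
cam_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
cam_b = [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
joint = [0.5, 0.2, 4.0]
observations = [project(cam_a, joint), project(cam_b, joint)]
estimate = triangulate([cam_a, cam_b], observations)  # 3D joint estimate
reprojected = project(cam_a, estimate)                # 2D point in camera A
```

In a full pipeline this would run per joint and per frame for each tracked subject, and the reprojected 2D poses would feed an action recognizer. Those stages, along with lens distortion and detection noise, are outside the scope of this sketch.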