| CPC G06V 20/41 (2022.01) [G06T 7/73 (2017.01); G06T 9/002 (2013.01); G06V 10/82 (2022.01); G06V 40/10 (2022.01); G06T 2200/04 (2013.01); G06T 2207/10016 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30196 (2013.01)] | 20 Claims |

|
1. A method, comprising:
for each image of one or more images:
obtaining a first key point position set and a second key point position set for the image;
the first key point position set including a key point position for each key point of a plurality of key points of a first human body detected in the image; and
the second key point position set including a key point position for each key point of a plurality of key points of a second human body detected in the image;
processing the first key point position set and the second key point position set to determine a distance between each key point position of the first key point position set and each key point position of the second key point position set, wherein the distance between each key point position of the first key point position set and all of the key point positions of the second key point position set is determined; and
processing the distances between each key point position of the first key point position set and each key point position of the second key point position set to generate an encoded representation of the image;
providing the encoded representation of each image of the one or more images to a coactivity classifier that includes a machine learned model that is configured to classify a coactivity of two human bodies; and
generating classification information, using the coactivity classifier, classifying a coactivity performed by the first human body and second human body in the one or more images.
|
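The per-image steps recited in claim 1 (compute the distance from every key point of the first body to every key point of the second body, then turn those distances into an encoded representation) can be illustrated with a minimal sketch. The claim does not specify the distance metric, the coordinate dimensionality, or the form of the encoding, so the choices here are assumptions: Euclidean distance, 2D pixel coordinates, and a row-major flattening of the distance matrix. The function names `pairwise_distances`, `encode_image`, and `encode_images` are hypothetical, and the machine learned coactivity classifier itself is not sketched.

```python
import math
from typing import List, Sequence, Tuple

# Assumption: a key point position is a 2D (x, y) coordinate.
Point = Tuple[float, float]


def pairwise_distances(first: Sequence[Point],
                       second: Sequence[Point]) -> List[List[float]]:
    """Distance from each key point of the first body to every key point
    of the second body. Euclidean distance is an assumed metric."""
    return [[math.dist(p, q) for q in second] for p in first]


def encode_image(first: Sequence[Point],
                 second: Sequence[Point]) -> List[float]:
    """Assumed encoding: flatten the distance matrix row by row."""
    matrix = pairwise_distances(first, second)
    return [d for row in matrix for d in row]


def encode_images(frames: Sequence[Tuple[Sequence[Point], Sequence[Point]]]
                  ) -> List[List[float]]:
    """One encoded representation per image; each frame holds the two
    key point position sets detected in that image."""
    return [encode_image(first, second) for first, second in frames]


if __name__ == "__main__":
    body_a = [(0.0, 0.0), (3.0, 4.0)]
    body_b = [(0.0, 0.0), (0.0, 4.0), (3.0, 0.0)]
    print(encode_image(body_a, body_b))
```

With N key points on the first body and M on the second, each image yields N × M distances, so a sequence of images gives a list of fixed-length vectors that could then be passed to a classifier.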