US 12,387,490 B2
Methods, systems, and media for identifying human coactivity in images and videos using neural networks
Walid Mohamed Aly Ahmed, Mississauga (CA)
Assigned to HUAWEI TECHNOLOGIES CO., LTD., Shenzhen (CN)
Filed by Walid Mohamed Aly Ahmed, Mississauga (CA)
Filed on Feb. 4, 2022, as Appl. No. 17/665,458.
Prior Publication US 2023/0252784 A1, Aug. 10, 2023
Int. Cl. G06V 20/40 (2022.01); G06T 7/73 (2017.01); G06T 9/00 (2006.01); G06V 10/82 (2022.01); G06V 40/10 (2022.01)

CPC G06V 20/41 (2022.01) [G06T 7/73 (2017.01); G06T 9/002 (2013.01); G06V 10/82 (2022.01); G06V 40/10 (2022.01); G06T 2200/04 (2013.01); G06T 2207/10016 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30196 (2013.01)]

20 Claims

1. A method, comprising:

for each image of one or more images:

obtaining a first key point position set and a second key point position set for the image;

the first key point position set including a key point position for each key point of a plurality of key points of a first human body detected in the image; and

the second key point position set including a key point position for each key point of a plurality of key points of a second human body detected in the image;

processing the first key point position set and the second key point position set to determine a distance between each key point position of the first key point position set and each key point position of the second key point position set, wherein the distance between each key point position of the first key point position set and all of the key point positions of the second key point position set is determined; and

processing the distances between each key point position of the first key point position set and each key point position of the second key point position set to generate an encoded representation of the image;

providing the encoded representation of each image of the one or more images to a coactivity classifier that includes a machine learned model that is configured to classify a coactivity of two human bodies; and

generating classification information, using the coactivity classifier, classifying a coactivity performed by the first human body and second human body in the one or more images.
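
The distance and encoding steps recited in claim 1 can be illustrated with a minimal sketch. It rests on assumptions not stated in the claim: key point positions are 2D (x, y) coordinates, the distance is Euclidean, and the encoded representation is simply the row-major flattening of the pairwise distance matrix. The function names `pairwise_distances`, `encode_image`, and `encode_sequence` are hypothetical, and the coactivity classifier itself (a trained model) is not shown.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # assumed 2D key point position (x, y)


def pairwise_distances(first: Sequence[Point],
                       second: Sequence[Point]) -> List[List[float]]:
    """Distance from every key point of the first body to every key
    point of the second body (Euclidean distance is an assumption)."""
    return [[math.dist(p, q) for q in second] for p in first]


def encode_image(first: Sequence[Point],
                 second: Sequence[Point]) -> List[float]:
    """Hypothetical encoding: flatten the distance matrix row by row."""
    matrix = pairwise_distances(first, second)
    return [d for row in matrix for d in row]


def encode_sequence(
    frames: Sequence[Tuple[Sequence[Point], Sequence[Point]]]
) -> List[List[float]]:
    """Encode each image (frame) of one or more images independently."""
    return [encode_image(a, b) for a, b in frames]


if __name__ == "__main__":
    # Two toy bodies with two key points each (illustrative values only).
    body_a = [(0.0, 0.0), (3.0, 0.0)]
    body_b = [(0.0, 4.0), (3.0, 4.0)]
    print(encode_image(body_a, body_b))  # [4.0, 5.0, 5.0, 4.0]
```

With N key points per body, each image yields N × N distances under this sketch; the resulting per-image vectors are what would be provided to the coactivity classifier.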