US 12,412,421 B2
Assignment of unique identifications to people in multi-camera field of view
Raghavendra Balavalikar Krishnamurthy, Austin, TX (US); Rajen Bhatt, Pittsburgh, PA (US); Kui Zhang, Austin, TX (US); and David A. Bryan, Austin, TX (US)
Assigned to Hewlett-Packard Development Company, L.P., Spring, TX (US)
Filed by Hewlett-Packard Development Company, L.P., Spring, TX (US)
Filed on Feb. 5, 2023, as Appl. No. 17/971,243.
Prior Publication US 2024/0135748 A1, Apr. 25, 2024
Int. Cl. G06K 9/00 (2022.01); G06T 7/70 (2017.01); G06V 40/16 (2022.01)
CPC G06V 40/172 (2022.01) [G06T 7/70 (2017.01); G06T 2207/20084 (2013.01); G06T 2207/30242 (2013.01)]
18 Claims
 
1. A method for identifying meeting participants in a multi-camera video conference room, comprising:
generating a plurality of input frame images taken from different perspectives of a video conference room by a corresponding plurality of cameras connected together;
detecting, from an input frame image associated with each camera, one or more human heads for any meeting participants captured in the input frame image by applying a machine learning human head detector model to said input frame image;
generating, from each detected human head, a head bounding box which surrounds the detected human head;
extracting, from each head bounding box, a body bounding box which surrounds the detected human head and at least an upper body portion of a meeting participant belonging to the detected human head, thereby generating a plurality of body bounding boxes from the plurality of input frame images;
generating, from each input frame image portion contained within the body bounding box, a participant identification feature embedding which uniquely identifies the meeting participant captured in the body bounding box, thereby generating a plurality of participant identification feature embeddings from the plurality of body bounding boxes; and
performing person re-identification processing on the plurality of participant identification feature embeddings to determine a count of the meeting participants in the video conference room,
wherein performing person re-identification processing comprises:
dividing the plurality of participant identification feature embeddings into a query set and a gallery set, and
comparing the query set to the gallery set to identify k top feature embedding matches so that matching feature embeddings are assigned to the same meeting participant,
wherein the query set contains participant identification feature embeddings extracted from body bounding boxes generated from a first input frame captured at a primary camera, and
wherein the gallery set contains participant identification feature embeddings extracted from body bounding boxes generated from one or more additional input frames captured at one or more secondary cameras.
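
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the box expansion factors, the use of cosine similarity, and the 0.8 match threshold are all assumptions made for illustration, and the head detector and embedding model are taken as given (the embeddings are plain lists of floats). The sketch also assumes that gallery embeddings left unmatched are distinct people, which may overcount when several secondary cameras see the same participant that the primary camera misses.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def body_box_from_head(head: Box, frame_w: int, frame_h: int,
                       width_scale: float = 3.0,
                       height_scale: float = 4.0) -> Box:
    """Expand a head box into a box covering the head and upper body.

    The scale factors are illustrative assumptions, not values from the claim.
    """
    x0, y0, x1, y1 = head
    w, h = x1 - x0, y1 - y0
    cx = (x0 + x1) / 2.0
    bx0 = max(0.0, cx - width_scale * w / 2.0)
    bx1 = min(float(frame_w), cx + width_scale * w / 2.0)
    by0 = max(0.0, y0)                       # keep the top of the head
    by1 = min(float(frame_h), y0 + height_scale * h)
    return (bx0, by0, bx1, by1)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two embeddings (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k_matches(query: Sequence[float],
                  gallery: List[Sequence[float]],
                  k: int) -> List[Tuple[int, float]]:
    """Return (gallery_index, score) for the k most similar gallery entries."""
    scored = sorted(((cosine_similarity(query, g), i)
                     for i, g in enumerate(gallery)), reverse=True)
    return [(i, s) for s, i in scored[:k]]


def count_participants(query_set: List[Sequence[float]],
                       gallery_set: List[Sequence[float]],
                       k: int = 1, threshold: float = 0.8) -> int:
    """Count participants across the primary and secondary cameras.

    Each query embedding (primary camera) is treated as one participant.
    A gallery embedding (secondary cameras) that is a top-k match above
    the threshold for some query is assigned to that same participant.
    Unmatched gallery embeddings are counted as additional participants
    (a simplifying assumption of this sketch).
    """
    matched_gallery = set()
    for q in query_set:
        for idx, score in top_k_matches(q, gallery_set, k):
            if score >= threshold:
                matched_gallery.add(idx)
    return len(query_set) + (len(gallery_set) - len(matched_gallery))
```

For example, with two query embeddings from the primary camera and two gallery embeddings from a secondary camera, where only one gallery embedding closely matches a query, the sketch counts three participants: two seen by the primary camera plus one seen only by the secondary camera.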