CPC G06V 20/52 (2022.01) [G06F 18/2155 (2023.01); G06N 3/045 (2023.01); G06T 7/20 (2013.01); G06V 10/25 (2022.01); G06V 10/44 (2022.01); G06T 2207/30241 (2013.01)] | 11 Claims |
1. A system for tracking a target object across a plurality of image frames, comprising:
a logic machine; and
a storage machine holding instructions executable by the logic machine to:
calculate a trajectory for the target object over one or more previous frames occurring before a target frame, wherein the target object is detected by tracking, in a similarity matrix, comparison values indicating similarity between object feature data for a first set of objects detected in a first previous frame, the first set of objects including the target object, and object feature data for a second set of objects in a second previous frame, wherein the similarity matrix includes:
a row for each object in a union of both of the first set of objects and the second set of objects; and
a column for each object in the union,
wherein each matrix element of the similarity matrix represents one comparison value between a pair of objects drawn from the union;
responsive to assessing no detection of the target object in the target frame:
upon determining that the target object is not detected in the target frame due to being occluded by a set of one or more other objects, predict an estimated region for the target object based on the trajectory;
predict an occlusion center based on a set of candidate occluding locations for the set of other objects within a threshold distance of the estimated region, each location of the set of candidate occluding locations overlapping with the estimated region; and
automatically estimate a bounding box for the target object in the target frame based on the occlusion center, wherein the bounding box is estimated via a trained machine learning system trained via supervised learning with image data and ground-truth bounding boxes.
|
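By way of non-limiting illustration only, the following Python sketch shows one way the similarity matrix, the estimated region, and the occlusion center recited in claim 1 could be computed. The claim does not fix any of these choices, so everything below is an assumption made for the example: cosine similarity as the comparison value, a constant-velocity extrapolation of the trajectory, axis-aligned boxes given as (x_min, y_min, x_max, y_max), the mean of the candidate box centers as the occlusion center, and all function names. Detections from the two previous frames are treated as distinct entries of the union. The trained machine learning system that estimates the final bounding box is not modeled.

```python
import math


def cosine(u, v):
    """Cosine similarity of two feature vectors (an assumed comparison value)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(first, second):
    """Square matrix with one row and one column per object in the union."""
    union = list(first) + list(second)
    return [[cosine(a, b) for b in union] for a in union]


def predict_region(trajectory, size):
    """Extrapolate the last displacement one frame ahead (assumed constant velocity)."""
    (x0, y0), (x1, y1) = trajectory[-2], trajectory[-1]
    cx, cy = 2 * x1 - x0, 2 * y1 - y0
    w, h = size
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def overlaps(a, b):
    """True when two axis-aligned boxes share a region of positive area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def center(box):
    """Center point of an axis-aligned box."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def occlusion_center(region, other_boxes, threshold):
    """Mean center of the other objects that overlap the estimated region
    and lie within the threshold distance of it (assumed aggregation rule)."""
    region_center = center(region)
    candidates = [
        center(b)
        for b in other_boxes
        if overlaps(b, region) and math.dist(center(b), region_center) <= threshold
    ]
    if not candidates:
        return None  # no candidate occluding location was found
    n = len(candidates)
    return (sum(x for x, _ in candidates) / n, sum(y for _, y in candidates) / n)


# Hypothetical data: two objects in the first previous frame, one in the second.
first_frame = [[1.0, 0.0], [0.0, 1.0]]
second_frame = [[1.0, 0.0]]
matrix = similarity_matrix(first_frame, second_frame)

# Target center moved from (10, 10) to (20, 10); its box is assumed to be 10 x 10.
region = predict_region([(10, 10), (20, 10)], (10, 10))
others = [(30, 0, 50, 20), (100, 100, 120, 120)]
occ = occlusion_center(region, others, threshold=15)
```

In this example the union holds three objects, so the matrix is 3 x 3, and only the first of the two other objects overlaps the estimated region. In the claimed system the resulting occlusion center would then be supplied to the trained machine learning system to estimate the bounding box.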