| CPC G06T 7/292 (2017.01) [G06T 7/215 (2017.01); G06T 2207/10016 (2013.01); G06T 2207/20081 (2013.01)] | 12 Claims |

|
1. A method for detecting and/or tracking moving objects within a certain zone, such as a ball and/or one or more players on a sports pitch, comprising:
providing multiple physical cameras around the zone which are synchronised for, successively at regular instances after one another, taking simultaneously at every such instance a set of video frame images of the zone which form input video streams when put in a video sequence after one another, wherein the totality of video frame images of a set made at such an instance jointly cover at least the total area of the zone;
composing a sequence of panoramic views of the zone so to form a panoramic video stream by cutting away overlapping parts of the video frame images of each set and stitching together remaining parts of the video frame images of each set;
defining one or more virtual camera view(s) by selecting for each virtual camera view a corresponding partial or entire view of the panoramic views and by de-warping the selected, partial or entire views into square views or views with another geometry, which form a projection of a corresponding part of the zone;
feeding each of the square, de-warped views or de-warped views with another geometry to a deep learning neural network or an AI-network so to form a corresponding virtual detector; and,
performing a detection with the virtual detectors so to determine the presence or absence of objects in the corresponding part of the zone and possibly their type or class and their location in that part,
wherein a total number (N) bigger than one, of virtual camera views is chosen for covering the complete zone, and
wherein detection of objects in the entire zone is realised by feeding at each of the instances a restricted number (M), including one or a number (M) smaller than the total number (N), of virtual camera views to the deep learning neural network or AI-network, hereby ensuring that after a certain number of instances the total number (N) of virtual camera views has been fed to the deep learning neural network or AI-network, so that detection is performed on the total zone, while the feed to the deep learning neural network or AI-network at every instance is reduced.
|
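The final two "wherein" clauses of claim 1 describe feeding only M of the N virtual camera views to the network at each instance, such that all N views have been fed after a certain number of instances. The claim does not prescribe the order in which views are selected; the sketch below assumes a simple round-robin order purely for illustration. The names `schedule_views` and `instances_to_cover` are hypothetical and do not come from the claim.

```python
import math
from itertools import islice
from typing import Iterator, List


def schedule_views(n_views: int, m_per_instance: int) -> Iterator[List[int]]:
    """Yield, for each successive instance, the indices of the views to feed.

    Assumption: round-robin selection. The claim only requires that all
    N views are eventually fed while at most M are fed per instance.
    """
    # The claim requires N > 1 and M to be one or a number smaller than N.
    if n_views <= 1 or not (1 <= m_per_instance < n_views):
        raise ValueError("require N > 1 and 1 <= M < N")
    start = 0
    while True:
        # Take M consecutive view indices, wrapping around at N.
        yield [(start + k) % n_views for k in range(m_per_instance)]
        start = (start + m_per_instance) % n_views


def instances_to_cover(n_views: int, m_per_instance: int) -> int:
    """Number of instances after which every view has been fed once
    under the round-robin assumption above."""
    return math.ceil(n_views / m_per_instance)


if __name__ == "__main__":
    n, m = 5, 2
    for instance, views in enumerate(islice(schedule_views(n, m), instances_to_cover(n, m))):
        print(f"instance {instance}: feed views {views}")
```

With N = 5 and M = 2, this schedule covers all five views within three instances, so the per-instance load on the network is two views rather than five.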