| CPC H04N 19/90 (2014.11) [H04N 19/172 (2014.11); H04N 19/186 (2014.11)] | 13 Claims |

|
1. A method for lossily compressing a sequence of video frames into a representation, wherein each video frame of the video frames includes pixels that carry color values, the method comprising the following steps:
segmenting each video frame into superpixels, wherein the superpixels are groups of pixels that share at least one predetermined common property;
assigning, to each superpixel in each video frame, at least one attribute derived from the pixels belonging to the respective superpixel;
combining superpixels as nodes in a graph representation, wherein:
superpixels in a same video frame are connected by spatial edges associated with at least one quantity that is a measure for a distance between the superpixels in the same video frame, and
in response to superpixels in adjacent video frames in the sequence meeting at least one predetermined relatedness criterion, the superpixels in the adjacent video frames are connected by temporal edges, wherein the relatedness criterion is a threshold value of a distance between the superpixels in adjacent video frames in the sequence, wherein the temporal edges connect the superpixels in the adjacent video frames when the relatedness criterion is met between the superpixels;
providing the graph representation to a graph neural network (GNN); and
obtaining, from the GNN, a processing result for the sequence of video frames.
|
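
The steps recited in claim 1 can be illustrated with a minimal sketch. Nothing below is taken from the patent itself. Every function and parameter name is hypothetical. Fixed grid cells stand in for a real superpixel segmentation, and the mean color is the assumed attribute. The centroid distance is the assumed quantity on spatial edges, and a threshold on the mean-color distance is the assumed relatedness criterion for temporal edges. A single untrained neighbor-averaging round stands in for the GNN.

```python
import math
from collections import defaultdict

def grid_superpixels(frame, cell):
    """Stand-in segmentation (assumption): group pixels by grid cell."""
    groups = defaultdict(list)
    for r, row in enumerate(frame):
        for c, color in enumerate(row):
            groups[(r // cell, c // cell)].append((r, c, color))
    return [groups[key] for key in sorted(groups)]

def attributes(pixels):
    """Attributes derived from the pixels: centroid and mean color."""
    n = len(pixels)
    centroid = (sum(p[0] for p in pixels) / n, sum(p[1] for p in pixels) / n)
    color = tuple(sum(p[2][k] for p in pixels) / n for k in range(3))
    return {"centroid": centroid, "color": color}

def build_graph(frames, cell=2, color_threshold=0.5):
    """Return superpixel nodes, weighted spatial edges, and temporal edges."""
    nodes = []  # one dict per superpixel, tagged with its frame index
    for t, frame in enumerate(frames):
        for pixels in grid_superpixels(frame, cell):
            nodes.append({"frame": t, **attributes(pixels)})
    spatial, temporal = [], []
    for i, a in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            b = nodes[j]
            if a["frame"] == b["frame"]:
                # Spatial edge carries the centroid distance as its quantity.
                dist = math.dist(a["centroid"], b["centroid"])
                spatial.append((i, j, dist))
            elif b["frame"] - a["frame"] == 1:
                # Temporal edge only if the color distance meets the threshold.
                if math.dist(a["color"], b["color"]) <= color_threshold:
                    temporal.append((i, j))
    return nodes, spatial, temporal

def mean_aggregate(nodes, edges):
    """Untrained placeholder for a GNN layer: average color over self and neighbors."""
    neighbors = {i: [i] for i in range(len(nodes))}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    return [
        tuple(sum(nodes[m]["color"][k] for m in ns) / len(ns) for k in range(3))
        for _, ns in sorted(neighbors.items())
    ]

# Two 4x4 frames: all black, then black with a white top-left 2x2 block.
black, white = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
frame0 = [[black] * 4 for _ in range(4)]
frame1 = [[white if r < 2 and c < 2 else black for c in range(4)] for r in range(4)]

nodes, spatial, temporal = build_graph([frame0, frame1])
edges = [(i, j) for i, j, _ in spatial] + temporal
result = mean_aggregate(nodes, edges)
```

In this toy input the white superpixel in the second frame receives no temporal edge, because its color distance to every superpixel in the first frame exceeds the assumed threshold. A trained GNN operating on the graph would replace `mean_aggregate` in any real use.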