| CPC H04N 19/42 (2014.11) [H04N 19/12 (2014.11); H04N 19/139 (2014.11); H04N 19/172 (2014.11); H04N 19/176 (2014.11); H04N 19/60 (2014.11); H04N 19/85 (2014.11)] | 18 Claims |

|
1. A computer-implemented method for learned video compression, comprising:
processing a current frame (xt) and a previously decoded frame (xt-1) of video data using a motion estimation model to estimate a motion vector (vt) for every pixel;
compressing the motion vectors (vt) and reconstructing the motion vectors (vt) to reconstructed motion vectors (vt);
applying an enhanced context mining (ECM) model to obtain enhanced context (CE) from the reconstructed motion vectors (vt) and previously decoded frame feature (x̆t-1); wherein applying the ECM comprises:
obtaining the motion vectors (vt) and decoded frame feature (x̆t-1) based on the current input frame (xt) and previously decoded frame (xt-1);
warping the motion vectors (vt) and decoded frame feature (x̆t-1) to obtain a warped feature (xt); and
processing the warped feature (xt) using a resblock and convolution layer to obtain a context (t);
compressing the current frame (xt) with the assistance of the enhanced context (CE) to obtain a reconstructed frame (x′t); and
providing the reconstructed frame (x′t) to a post-enhancement backend network to obtain a high-resolution frame (xt).
|
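The context-mining steps recited in claim 1 (warping the decoded frame feature with the per-pixel motion vectors, then passing the warped feature through a resblock and a convolution layer) can be illustrated with a minimal NumPy sketch. This is an illustration under stated assumptions, not the claimed implementation: the bilinear backward warp, the ReLU-conv-ReLU-conv layout of the resblock, the 3x3 kernel size, the random untrained weights, and every function and parameter name (`warp`, `conv3x3`, `resblock`, `mine_context`, `params`) are hypothetical choices made for the example.

```python
import numpy as np

def warp(feature, flow):
    """Backward-warp a (C, H, W) feature map with a (2, H, W) motion field.

    flow[0] is the horizontal and flow[1] the vertical displacement in pixels.
    Bilinear sampling with border clamping is an assumption of this sketch.
    """
    c, h, w = feature.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(xs + flow[0], 0, w - 1)
    src_y = np.clip(ys + flow[1], 0, h - 1)
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = src_x - x0  # horizontal interpolation weight
    wy = src_y - y0  # vertical interpolation weight
    top = feature[:, y0, x0] * (1 - wx) + feature[:, y0, x1] * wx
    bottom = feature[:, y1, x0] * (1 - wx) + feature[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def conv3x3(x, weight, bias):
    """Zero-padded 3x3 convolution; x is (Cin, H, W), weight is (Cout, Cin, 3, 3)."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + w]
            out += np.einsum("oc,chw->ohw", weight[:, :, dy, dx], patch)
    return out + bias[:, None, None]

def resblock(x, w1, b1, w2, b2):
    """Residual block (assumed layout): x + conv(relu(conv(relu(x))))."""
    y = conv3x3(np.maximum(x, 0), w1, b1)
    y = conv3x3(np.maximum(y, 0), w2, b2)
    return x + y

def mine_context(prev_feature, flow, params):
    """Warp the decoded frame feature, then apply a resblock and a convolution."""
    warped = warp(prev_feature, flow)
    refined = resblock(warped, *params["res"])
    return conv3x3(refined, *params["conv"])

# Toy inputs: random feature, constant half-pixel horizontal motion, untrained weights.
rng = np.random.default_rng(0)
C, H, W = 4, 6, 8
prev_feature = rng.standard_normal((C, H, W))
flow = np.zeros((2, H, W))
flow[0] = 0.5

def rand_conv():
    return 0.1 * rng.standard_normal((C, C, 3, 3)), np.zeros(C)

params = {"res": (*rand_conv(), *rand_conv()), "conv": rand_conv()}
context = mine_context(prev_feature, flow, params)
print(context.shape)
```

In this sketch `prev_feature` plays the role of the decoded frame feature (x̆t-1) and `flow` the role of the reconstructed motion vectors; in a trained system the weights would be learned jointly with the rest of the codec rather than drawn at random.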