CPC H04N 19/147 (2014.11) [G06N 3/08 (2013.01); G06T 3/4046 (2013.01); G06T 9/002 (2013.01); H04N 19/132 (2014.11); H04N 19/184 (2014.11)] | 18 Claims |
1. A video processing system comprising:
an upsampler;
a video codec;
a trained machine learning (ML) model-based video downsampler trained using a neural network-based (NN-based) proxy video codec; and
a processing hardware configured to:
receive an input video sequence having a first display resolution;
extract a content sample of the input video sequence;
map, using the trained ML model-based video downsampler, the content sample to a lower resolution sample;
transform, using one of the video codec or the NN-based proxy video codec, the lower resolution sample into a decoded sample bitstream;
predict, using the upsampler and the decoded sample bitstream, an output sample corresponding to the content sample; and
modify, based on the predicted output sample, one or more parameters of the trained ML model-based video downsampler;
wherein the ML model-based video downsampler is trained using the input video sequence, the output sample, and an objective function based on an estimated rate of the lower resolution sample and a plurality of perceptual loss functions.
|
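The data flow recited in claim 1 (downsample, code, upsample, score, update) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the "downsampler" is 2x2 average pooling with a single hypothetical parameter `gain`, the proxy codec is plain uniform quantization, the rate estimate is the empirical entropy of the quantized symbols, mean squared error and mean absolute error stand in for the plurality of perceptual loss functions, and the parameter update is a coarse candidate search rather than gradient descent through a neural network.

```python
import numpy as np

STEP = 0.1  # hypothetical quantization step of the stand-in proxy codec


def downsample(x, gain):
    """Stand-in for the ML downsampler: 2x2 average pooling scaled by `gain`."""
    h, w = x.shape
    return gain * x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def proxy_codec(y):
    """Stand-in for the proxy codec: uniform quantization, then reconstruction."""
    return np.round(y / STEP) * STEP


def estimated_rate(y):
    """Crude rate estimate: empirical entropy (bits/sample) of quantized symbols."""
    symbols = np.round(y / STEP).astype(int).ravel()
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def upsample(y):
    """Stand-in for the upsampler: nearest-neighbour 2x enlargement."""
    return np.repeat(np.repeat(y, 2, axis=0), 2, axis=1)


def objective(x, gain, lam=0.01, weights=(1.0, 0.5)):
    """Weighted rate term plus a weighted sum of several distortion losses."""
    low = downsample(x, gain)
    out = upsample(proxy_codec(low))
    # MSE and MAE are placeholders for real perceptual losses.
    losses = (np.mean((out - x) ** 2), np.mean(np.abs(out - x)))
    return lam * estimated_rate(low) + sum(w * l for w, l in zip(weights, losses))


def update_gain(x, gain, factors=(0.9, 1.0, 1.1)):
    """Modify the downsampler parameter: keep the candidate with the lowest objective."""
    return min((gain * f for f in factors), key=lambda g: objective(x, g))


# Usage: one content sample, a few update rounds.
rng = np.random.default_rng(0)
sample = rng.random((8, 8))
gain = 0.5
for _ in range(5):
    gain = update_gain(sample, gain)
```

Because the factor 1.0 is always among the candidates, each update round can only keep or lower the objective for the given sample. A trainable system would instead need a differentiable proxy codec so that gradients can reach the downsampler parameters.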