CPC H04N 19/149 (2014.11) [H04N 19/126 (2014.11); H04N 19/172 (2014.11)] | 20 Claims |
1. A method performed by one or more data processing apparatus for encoding a video comprising a sequence of video frames to generate a respective encoded representation of each video frame, the method comprising, for one or more of the video frames:
obtaining a feature embedding for the video frame;
processing an input comprising the feature embedding for the video frame using a rate control machine learning model to generate a respective score for each of a plurality of possible quantization parameter values;
selecting a quantization parameter value from the plurality of possible quantization parameter values using the scores;
determining a cumulative amount of data required to represent: (i) an encoded representation of the video frame that is generated in accordance with a quantization step size associated with the selected quantization parameter value and (ii) encoded representations of each video frame that precedes the video frame;
determining, based on the cumulative amount of data, that a feedback control criterion for the video frame is satisfied;
updating the selected quantization parameter value in response to determining that the feedback control criterion is satisfied; and
processing the video frame using an encoding model, in accordance with a quantization step size associated with the selected quantization parameter value, to generate the encoded representation of the video frame.
|
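The per-frame steps recited in claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation; it only traces the claimed control flow under stated assumptions. Everything named here is hypothetical: `toy_rate_model` stands in for the rate control machine learning model, `toy_frame_bits` stands in for measuring the size of an encoded frame, the 0 to 63 QP range and the exponential QP-to-step-size mapping are illustrative choices, and the feedback control criterion (cumulative bits exceeding a pro-rata share of a total budget) and the fixed QP increment are assumptions, since the claim does not specify them.

```python
# Assumed set of possible quantization parameter (QP) values.
QP_VALUES = list(range(64))


def qp_to_step(qp):
    """Illustrative QP-to-step-size mapping (assumption): doubles every 6 QP."""
    return 0.625 * 2.0 ** (qp / 6.0)


def toy_rate_model(embedding):
    """Hypothetical stand-in for the rate control model: one score per QP value."""
    preferred = min(max(int(sum(embedding)), 0), len(QP_VALUES) - 1)
    return [-abs(qp - preferred) for qp in QP_VALUES]


def toy_frame_bits(complexity, step):
    """Hypothetical size model: a coarser step size yields fewer bits."""
    return complexity / step


def encode_video(frames, total_budget_bits, tolerance=1.1, qp_bump=4):
    """Return a (qp, bits) pair per frame.

    frames: list of (feature_embedding, complexity) pairs.
    """
    results = []
    bits_so_far = 0.0  # data used by all preceding frames
    n = len(frames)
    for i, (embedding, complexity) in enumerate(frames):
        # Score every possible QP value, then select one using the scores
        # (argmax here; sampling from the scores would also fit the claim).
        scores = toy_rate_model(embedding)
        best = max(range(len(scores)), key=scores.__getitem__)
        qp = QP_VALUES[best]

        # Cumulative data: this frame at the selected QP plus preceding frames.
        frame_bits = toy_frame_bits(complexity, qp_to_step(qp))
        cumulative = bits_so_far + frame_bits

        # Assumed feedback control criterion: cumulative data exceeds a
        # tolerance-scaled pro-rata share of the total budget.
        allowed = tolerance * total_budget_bits * (i + 1) / n
        if cumulative > allowed:
            # Update the selected QP (assumed rule: fixed increment, clamped).
            qp = min(qp + qp_bump, QP_VALUES[-1])
            frame_bits = toy_frame_bits(complexity, qp_to_step(qp))

        # "Encode" the frame at the final QP and record its size.
        bits_so_far += frame_bits
        results.append((qp, frame_bits))
    return results
```

In this sketch the learned model proposes a QP for each frame, and the feedback rule overrides it only when the running total drifts past the budget. With a generous budget the model's choice is used unchanged; with a tight budget the QP is raised, which coarsens the step size and reduces the data spent on the frame.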