CPC H04N 19/154 (2014.11) [G06N 3/08 (2013.01); G06N 20/10 (2019.01); G06T 3/4046 (2013.01); G06T 5/00 (2013.01); G06T 7/0002 (2013.01); G06T 7/254 (2017.01); G06T 9/002 (2013.01); H04N 19/107 (2014.11); H04N 19/124 (2014.11); H04N 19/172 (2014.11); H04N 19/174 (2014.11); H04N 19/176 (2014.11); H04N 19/567 (2014.11); H04N 21/23418 (2013.01); H04N 21/2343 (2013.01); H04N 21/236 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/20224 (2013.01)] | 21 Claims |
1. A computer readable non-transitory medium storing instructions for determining an objective video quality measure of a video predictive of subjective human quality ratings, the instructions for:
applying, at a video processing server comprising one or more processors and memory, a machine learning system trained to predict a subjective human quality rating for a first video, by:
determining a feature vector for the first video, by:
extracting a plurality of features from the first video using a feature extraction machine learning process;
generating a plurality of spatial features of the first video, at least one of the plurality of spatial features selected from among the plurality of features extracted using the feature extraction machine learning process based on analyzing one or more frames of the first video;
generating at least one temporal feature selected from among the plurality of features extracted using the feature extraction machine learning process based upon analyzing at least a portion of two or more frames of the first video; and
combining the generated plurality of spatial features and at least one temporal feature to form the feature vector of the first video; and
processing the feature vector through the trained machine learning system to obtain an aggregate quality measure of the first video, wherein the trained machine learning system for producing an aggregate quality measure for a given video is trained on a set of training videos and human-rated quality scores for each of the set of training videos, and wherein the training consists of iteratively training the machine learning system to generate aggregate quality measures predictive of the supplied human-rated quality scores, responsive to feature vectors of the training videos input to the machine learning system.
|
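The steps recited in claim 1 can be illustrated with a minimal sketch, which is not the claimed implementation. It makes several assumptions that do not come from the claim: frames are small grayscale grids with values in [0, 1], hand-written statistics stand in for the feature extraction machine learning process, and a linear model fitted by gradient descent stands in for the trained machine learning system. All function names (`spatial_features`, `temporal_feature`, `feature_vector`, `train`, `predict`) are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: a frame is a grid of grayscale pixel values in [0, 1].
Frame = List[List[float]]


def _flatten(frame: Frame) -> List[float]:
    return [p for row in frame for p in row]


def spatial_features(frames: Sequence[Frame]) -> List[float]:
    """Per-frame statistics averaged over the video: mean brightness, pixel variance."""
    means, variances = [], []
    for frame in frames:
        pixels = _flatten(frame)
        mu = sum(pixels) / len(pixels)
        means.append(mu)
        variances.append(sum((p - mu) ** 2 for p in pixels) / len(pixels))
    return [sum(means) / len(means), sum(variances) / len(variances)]


def temporal_feature(frames: Sequence[Frame]) -> float:
    """Mean absolute difference between consecutive frames (needs two or more frames)."""
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        a, b = _flatten(prev), _flatten(cur)
        diffs.append(sum(abs(x - y) for x, y in zip(a, b)) / len(a))
    return sum(diffs) / len(diffs)


def feature_vector(frames: Sequence[Frame]) -> List[float]:
    """Combine the spatial features and the temporal feature into one vector."""
    return spatial_features(frames) + [temporal_feature(frames)]


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Aggregate quality measure from a feature vector (linear model, an assumption)."""
    return bias + sum(w * f for w, f in zip(weights, features))


def train(
    vectors: Sequence[Sequence[float]],
    scores: Sequence[float],
    epochs: int = 2000,
    lr: float = 0.1,
) -> Tuple[List[float], float]:
    """Iteratively fit the model to human-rated scores by gradient descent on squared error."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    n = len(vectors)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(vectors, scores):
            err = predict(weights, bias, x) - y
            grad_b += err / n
            for i, xi in enumerate(x):
                grad_w[i] += err * xi / n
        bias -= lr * grad_b
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return weights, bias
```

In use, `feature_vector` would be called once per training video, the resulting vectors and their human-rated scores passed to `train`, and `predict` applied to the feature vector of a new video. A production system would presumably replace both the hand-written statistics and the linear model with learned components.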