US 12,137,230 B2
Method and apparatus for applying deep learning techniques in video coding, restoration and video quality analysis (VQA)
Pankaj N. Topiwala, Cocoa Beach, FL (US); Madhu Peringassery Krishnan, Columbia, MD (US); and Wei Dai, Clarksville, MD (US)
Assigned to FASTVDO LLC, Melbourne, FL (US)
Filed by FastVDO LLC, Melbourne, FL (US)
Filed on Apr. 15, 2022, as Appl. No. 17/722,257.
Application 17/722,257 is a continuation of application No. 17/119,981, filed on Dec. 11, 2020, granted, now 11,310,509.
Application 17/119,981 is a continuation of application No. 16/508,198, filed on Jul. 10, 2019, granted, now 10,880,551.
Claims priority of provisional application 62/764,801, filed on Aug. 16, 2018.
Claims priority of provisional application 62/696,285, filed on Jul. 10, 2018.
Prior Publication US 2022/0239925 A1, Jul. 28, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. H04N 19/154 (2014.01); G06N 3/08 (2023.01); G06N 20/10 (2019.01); G06T 3/4046 (2024.01); G06T 5/00 (2024.01); G06T 7/00 (2017.01); G06T 7/254 (2017.01); G06T 9/00 (2006.01); H04N 19/107 (2014.01); H04N 19/124 (2014.01); H04N 19/172 (2014.01); H04N 19/174 (2014.01); H04N 19/176 (2014.01); H04N 19/567 (2014.01); H04N 21/234 (2011.01); H04N 21/2343 (2011.01); H04N 21/236 (2011.01)

CPC H04N 19/154 (2014.11) [G06N 3/08 (2013.01); G06N 20/10 (2019.01); G06T 3/4046 (2013.01); G06T 5/00 (2013.01); G06T 7/0002 (2013.01); G06T 7/254 (2017.01); G06T 9/002 (2013.01); H04N 19/107 (2014.11); H04N 19/124 (2014.11); H04N 19/172 (2014.11); H04N 19/174 (2014.11); H04N 19/176 (2014.11); H04N 19/567 (2014.11); H04N 21/23418 (2013.01); H04N 21/2343 (2013.01); H04N 21/236 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/20224 (2013.01)]

21 Claims

1. A computer readable non-transitory medium storing instructions for determining an objective video quality measure of a video predictive of subjective human quality ratings, the instructions for:

applying, at a video processing server comprising one or more processors and memory, a machine learning system trained to predict a subjective human quality rating for a first video, by:

determining a feature vector for the first video, by:

extracting a plurality of features from the first video using a feature extraction machine learning process;

generating a plurality of spatial features of the first video, at least one of the plurality of spatial features selected from among the plurality of features extracted using the feature extraction machine learning process based on analyzing one or more frames of the first video;

generating at least one temporal feature selected from among the plurality of features extracted using the feature extraction machine learning process based upon analyzing at least a portion of two or more frames of the first video; and

combining the generated plurality of spatial features and at least one temporal feature to form the feature vector of the first video; and

processing the feature vector through the trained machine learning system to obtain an aggregate quality measure of the first video, wherein the trained machine learning system for producing an aggregate quality measure for a given video is trained on a set of training videos and human-rated quality scores for each of the set of training videos, and wherein the training consists of iteratively training the machine learning system to generate aggregate quality measures predictive of the supplied human-rated quality scores, responsive to feature vectors of the training videos input to the machine learning system.
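
The steps recited in claim 1 can be illustrated with a minimal sketch; it is not the patented implementation. It rests on several assumptions that do not come from the claim: simple hand-crafted statistics stand in for the "feature extraction machine learning process", a linear regressor fitted by gradient descent stands in for the trained machine learning system, and the videos and "human-rated" scores are synthetic. All function names (`spatial_features`, `temporal_features`, `feature_vector`, `train_quality_model`, `predict_quality`, `make_video`) are hypothetical.

```python
import numpy as np

def spatial_features(video):
    """Per-frame statistics averaged over time.
    Assumption: a stand-in for learned spatial features."""
    gy, gx = np.gradient(video, axis=(1, 2))       # gradients within each frame
    grad_mag = np.sqrt(gx ** 2 + gy ** 2)
    return np.array([video.mean(),
                     video.std(axis=(1, 2)).mean(),
                     grad_mag.mean()])

def temporal_features(video):
    """Mean absolute difference between consecutive frames.
    Assumption: a stand-in for a learned temporal feature."""
    return np.array([np.abs(np.diff(video, axis=0)).mean()])

def feature_vector(video):
    """Combine spatial and temporal features into one vector."""
    return np.concatenate([spatial_features(video), temporal_features(video)])

def train_quality_model(features, scores, lr=0.1, epochs=2000):
    """Iteratively fit a linear model so its outputs track the supplied scores."""
    mu = features.mean(axis=0)
    sigma = features.std(axis=0) + 1e-8
    x = (features - mu) / sigma                    # standardize each feature
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        err = x @ w + b - scores                   # prediction error per video
        w -= lr * (x.T @ err) / len(scores)        # gradient step on squared error
        b -= lr * err.mean()
    return w, b, mu, sigma

def predict_quality(model, video):
    """Aggregate quality measure for one video from its feature vector."""
    w, b, mu, sigma = model
    return float(((feature_vector(video) - mu) / sigma) @ w + b)

def make_video(rng, noise, frames=8, size=16):
    """Synthetic grayscale clip: a horizontal ramp plus Gaussian noise (illustrative)."""
    ramp = np.linspace(0.0, 1.0, size)[None, None, :] * np.ones((frames, size, 1))
    return ramp + noise * rng.standard_normal((frames, size, size))

# Synthetic training set: noisier clips receive lower made-up "human" scores.
rng = np.random.default_rng(0)
noise_levels = np.linspace(0.0, 0.5, 20)
train_features = np.stack([feature_vector(make_video(rng, n)) for n in noise_levels])
train_scores = 5.0 - 8.0 * noise_levels

model = train_quality_model(train_features, train_scores)
clean_score = predict_quality(model, make_video(rng, 0.05))
noisy_score = predict_quality(model, make_video(rng, 0.45))
print(clean_score, noisy_score)
```

In this sketch the feature vector has three spatial entries and one temporal entry, and the model is fitted only to the synthetic scores above. A real system would substitute learned feature extractors and genuine subjective ratings.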