US 12,356,027 B2
Optimal format selection for video players based on predicted visual quality using machine learning
Yilin Wang, Mountain View, CA (US); Yue Guo, Chapel Hill, NC (US); and Balineedu Chowdary Adsumilli, Mountain View, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Appl. No. 17/790,102
Filed by GOOGLE LLC, Mountain View, CA (US)
PCT Filed Dec. 31, 2019, PCT No. PCT/US2019/069055
§ 371(c)(1), (2) Date Jun. 29, 2022,
PCT Pub. No. WO2021/137856, PCT Pub. Date Jul. 8, 2021.
Prior Publication US 2023/0054130 A1, Feb. 23, 2023
Int. Cl. H04N 21/23 (2011.01); H04N 19/154 (2014.01); H04N 19/40 (2014.01); H04N 21/2343 (2011.01)
CPC H04N 21/234309 (2013.01) [H04N 19/154 (2014.11); H04N 19/40 (2014.11)]
20 Claims
 
1. A method comprising:
generating training data for a machine learning model to be trained to identify quality scores for a set of transcoded versions of a new video at a set of display resolutions, wherein each display resolution is a resolution at which a transcoded version is displayed after rescaling, and wherein generating the training data comprises:
generating a plurality of reference transcoded versions of a reference video, wherein the plurality of reference transcoded versions comprises the reference video transcoded into a plurality of video resolutions and rescaled to a plurality of display resolutions;
obtaining quality scores for frames of the plurality of reference transcoded versions of the reference video;
generating a first training input comprising a set of color attributes, spatial attributes, and temporal attributes of the frames of the reference video; and
generating a first target output for the first training input, wherein the first target output comprises the quality scores for the frames of the plurality of reference transcoded versions of the reference video; and
providing the training data to train the machine learning model on (i) a set of training inputs comprising the first training input and (ii) a set of target outputs comprising the first target output.
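
The data-generation steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`frame_attributes`, `build_training_example`, `toy_transcode_and_rescale`, `toy_score_frame`, `TrainingExample`) is hypothetical, frames are modeled as small grayscale grids, and the transcoder and quality metric are toy stand-ins, since the claim names no particular codec, scaler, or metric. The three per-frame attributes are likewise simple placeholders for the color, spatial, and temporal attributes of the claim.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Iterable, List, Tuple

Frame = List[List[float]]  # assumption: one grayscale frame as rows of pixel values


@dataclass
class TrainingExample:
    # Per-frame [color, spatial, temporal] attributes of the reference video.
    training_input: List[List[float]]
    # (video_resolution, display_resolution) -> per-frame quality scores.
    target_output: Dict[Tuple[int, int], List[float]]


def mean(values: Iterable[float]) -> float:
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def frame_attributes(frames: List[Frame]) -> List[List[float]]:
    """Toy attributes: mean intensity, horizontal gradient, frame difference."""
    attrs, prev = [], None
    for frame in frames:
        pixels = [p for row in frame for p in row]
        color = mean(pixels)
        spatial = mean(abs(row[i + 1] - row[i])
                       for row in frame for i in range(len(row) - 1))
        temporal = 0.0 if prev is None else mean(
            abs(a - b) for a, b in zip(pixels, prev))
        attrs.append([color, spatial, temporal])
        prev = pixels
    return attrs


def build_training_example(
    reference_frames: List[Frame],
    video_resolutions: List[int],
    display_resolutions: List[int],
    transcode_and_rescale: Callable[[List[Frame], int, int], List[Frame]],
    score_frame: Callable[[Frame, Frame], float],
) -> TrainingExample:
    """Pair reference-video attributes with scores of every transcoded version."""
    target: Dict[Tuple[int, int], List[float]] = {}
    for v_res, d_res in product(video_resolutions, display_resolutions):
        version = transcode_and_rescale(reference_frames, v_res, d_res)
        target[(v_res, d_res)] = [
            score_frame(ref, out) for ref, out in zip(reference_frames, version)
        ]
    return TrainingExample(frame_attributes(reference_frames), target)


def toy_transcode_and_rescale(frames, video_res, display_res):
    # Stand-in only: quantizes pixels more coarsely when upscaling is needed.
    step = max(1.0, display_res / video_res)
    return [[[round(p / step) * step for p in row] for row in frame]
            for frame in frames]


def toy_score_frame(ref: Frame, out: Frame) -> float:
    # Stand-in metric in (0, 1]; 1.0 means the frames are identical.
    err = mean(abs(a - b) for ra, rb in zip(ref, out) for a, b in zip(ra, rb))
    return 1.0 / (1.0 + err)


frames = [[[0.0, 1.0], [2.0, 3.0]], [[1.0, 1.0], [3.0, 5.0]]]
example = build_training_example(
    frames, [360, 720], [480, 1080], toy_transcode_and_rescale, toy_score_frame
)
```

In this sketch the training input depends only on the reference video, while the target output holds one list of per-frame scores for each combination of video resolution and display resolution, mirroring the input/target pairing described in the claim.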