US 12,464,148 B2
Computer-implemented multi-scale machine learning model for the enhancement of compressed video
Kiran Mukesh Misra, Camas, WA (US); Christopher Andrew Segall, Camas, WA (US); and Byeongdoo Choi, Irvine, CA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 17, 2023, as Appl. No. 18/186,084.
Claims priority of provisional application 63/437,957, filed on Jan. 9, 2023.
Prior Publication US 2024/0236345 A1, Jul. 11, 2024
Int. Cl. G06T 3/4053 (2024.01); H04N 19/117 (2014.01); H04N 19/12 (2014.01); H04N 19/136 (2014.01); H04N 19/139 (2014.01); H04N 19/172 (2014.01); H04N 19/176 (2014.01); H04N 19/42 (2014.01); H04N 19/59 (2014.01); H04N 19/60 (2014.01); H04N 19/82 (2014.01); H04N 19/91 (2014.01)

CPC H04N 19/42 (2014.11) [G06T 3/4053 (2013.01); H04N 19/117 (2014.11); H04N 19/12 (2014.11); H04N 19/136 (2014.11); H04N 19/139 (2014.11); H04N 19/172 (2014.11); H04N 19/176 (2014.11); H04N 19/59 (2014.11); H04N 19/60 (2014.11); H04N 19/82 (2014.11); H04N 19/91 (2014.11)]

20 Claims

1. A computer-implemented method comprising:

receiving a video at a content delivery service;

performing an encode on a frame of the video by the content delivery service that converts the frame from a pixel domain to a transform domain and back to the pixel domain to generate first pixel values and a first residual for a block of the frame at a first resolution;

generating a first set of features, by a machine learning model of the content delivery service, for an input, at the first resolution, of the first pixel values and the first residual of the block;

generating a second set of features, by the machine learning model of the content delivery service in parallel with the generating the first set of features, for an input, at a second lower resolution, of second pixel values and a second residual of the block;

upsampling the second set of features to the first resolution to generate an upsampled second set of features;

generating a modified version of the frame based on the first set of features and the upsampled second set of features; and

transmitting the modified version of the frame to a frame buffer or from the content delivery service to a viewer device.
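
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented model. The function names, the fixed linear weights standing in for a learned model, the 2x averaging used to derive the lower-resolution inputs, the nearest-neighbor upsampling, and the additive fusion are all assumptions chosen for illustration. The claim itself does not specify any of them.

```python
from concurrent.futures import ThreadPoolExecutor

Block = list[list[float]]


def downsample2x(block: Block) -> Block:
    """Halve the resolution by averaging each 2x2 neighborhood (even dimensions assumed)."""
    return [
        [
            (block[y][x] + block[y][x + 1] + block[y + 1][x] + block[y + 1][x + 1]) / 4.0
            for x in range(0, len(block[0]), 2)
        ]
        for y in range(0, len(block), 2)
    ]


def upsample2x(block: Block) -> Block:
    """Double the resolution by nearest-neighbor repetition."""
    out: Block = []
    for row in block:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def extract_features(pixels: Block, residual: Block,
                     w_pix: float = 0.01, w_res: float = 0.25) -> Block:
    """Hypothetical stand-in for one model branch: a fixed linear map of each
    pixel value and residual sample. A real model would use learned layers."""
    return [
        [w_pix * p + w_res * r for p, r in zip(pixel_row, residual_row)]
        for pixel_row, residual_row in zip(pixels, residual)
    ]


def enhance_block(pixels: Block, residual: Block) -> Block:
    """Produce a modified block from first-resolution and lower-resolution features."""
    # Assumption: the second (lower-resolution) inputs are derived by downsampling.
    lo_pixels = downsample2x(pixels)
    lo_residual = downsample2x(residual)

    # The two branches do not depend on each other, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(extract_features, pixels, residual)
        second = pool.submit(extract_features, lo_pixels, lo_residual)
        first_features = first.result()
        second_features = second.result()

    # Bring the lower-resolution features back to the first resolution.
    upsampled = upsample2x(second_features)

    # Assumption: fusion is an additive correction to the decoded pixel values.
    return [
        [p + f + u for p, f, u in zip(p_row, f_row, u_row)]
        for p_row, f_row, u_row in zip(pixels, first_features, upsampled)
    ]
```

Calling `enhance_block(pixels, residual)` on a block with even dimensions returns a block of the same size. The thread pool only illustrates that the two feature branches are independent. A production system would more likely express that parallelism inside the model on an accelerator.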