US 12,206,851 B2
Implicit image and video compression using machine learning systems
Yunfan Zhang, Amsterdam (NL); Ties Jehan Van Rozendaal, Amsterdam (NL); Taco Sebastiaan Cohen, Amsterdam (NL); Markus Nagel, Amsterdam (NL); and Johann Hinrich Brehmer, Amsterdam (NL)
Assigned to QUALCOMM INCORPORATED, San Diego, CA (US)
Filed by QUALCOMM Incorporated, San Diego, CA (US)
Filed on Dec. 17, 2021, as Appl. No. 17/645,018.
Claims priority of provisional application 63/191,606, filed on May 21, 2021.
Prior Publication US 2022/0385907 A1, Dec. 1, 2022
Int. Cl. H04N 19/126 (2014.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01); G06T 9/00 (2006.01); H04N 19/147 (2014.01); H04N 19/91 (2014.01)
CPC H04N 19/126 (2014.11) [G06N 3/045 (2023.01); G06N 3/08 (2013.01); G06T 9/002 (2013.01); H04N 19/147 (2014.11); H04N 19/91 (2014.11)]
44 Claims
 
1. A method of processing media data, comprising:
receiving a plurality of images for compression by a neural network compression system;
processing, using initialized weight values for weights of a first model of the neural network compression system, a coordinate grid to generate reconstructed output values for a first image from the plurality of images, wherein the coordinate grid is separate from the first image;
comparing values of the first image to the reconstructed output values for the first image to determine a loss for the reconstructed output values;
tuning, using backpropagation to reduce the determined loss, the weights of the first model to generate a first plurality of tuned weight values for the weights of the first model for reconstructing the first image;
generating a first bitstream comprising a compressed version of the first plurality of tuned weight values; and
outputting the first bitstream for transmission to a receiver for reconstructing the first image.
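
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Everything specific in it is an assumption made for illustration: the tiny two-layer tanh network, the hidden width H, the mean-squared-error loss, plain gradient descent, 8-bit uniform quantization followed by zlib as the "compressed version" of the weights, and all function names (make_grid, init_weights, tune, encode, decode). The claim itself does not fix any of these choices.

```python
import math
import random
import struct
import zlib

H = 8  # hidden units; an arbitrary illustrative choice

def make_grid(n):
    """Coordinate grid in [-1, 1]^2, built without reference to any image."""
    return [(2 * c / (n - 1) - 1, 2 * r / (n - 1) - 1)
            for r in range(n) for c in range(n)]

def init_weights(seed=0):
    """Initialized weights as a flat list: x-weights, y-weights, b1, W2, b2."""
    rng = random.Random(seed)
    return [rng.uniform(-0.5, 0.5) for _ in range(4 * H + 1)]

def forward(w, xy):
    """Map one (x, y) coordinate to one reconstructed pixel value."""
    x, y = xy
    hid = [math.tanh(w[j] * x + w[H + j] * y + w[2 * H + j]) for j in range(H)]
    out = sum(w[3 * H + j] * hid[j] for j in range(H)) + w[4 * H]
    return out, hid

def loss_and_grad(w, grid, pixels):
    """Mean-squared error against the image and its gradient (backpropagation)."""
    grad = [0.0] * len(w)
    loss = 0.0
    n = len(grid)
    for (x, y), target in zip(grid, pixels):
        out, hid = forward(w, (x, y))
        err = out - target
        loss += err * err / n
        d_out = 2 * err / n
        grad[4 * H] += d_out
        for j in range(H):
            grad[3 * H + j] += d_out * hid[j]
            d_pre = d_out * w[3 * H + j] * (1 - hid[j] ** 2)
            grad[j] += d_pre * x
            grad[H + j] += d_pre * y
            grad[2 * H + j] += d_pre
    return loss, grad

def tune(w, grid, pixels, steps=1500, lr=0.02):
    """Tune the weights for one image by gradient descent on the loss."""
    w = list(w)
    for _ in range(steps):
        _, grad = loss_and_grad(w, grid, pixels)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

def encode(w):
    """Assumed codec: 8-bit uniform quantization, then zlib, with the scale as header."""
    scale = max(abs(v) for v in w) / 127 or 1.0
    quantized = bytes(round(v / scale) + 127 for v in w)
    return struct.pack("<d", scale) + zlib.compress(quantized)

def decode(bitstream, grid):
    """Receiver side: recover the weights and evaluate the model on the grid."""
    (scale,) = struct.unpack("<d", bitstream[:8])
    w = [(b - 127) * scale for b in zlib.decompress(bitstream[8:])]
    return w, [forward(w, xy)[0] for xy in grid]
```

In this sketch a sender would call tune(init_weights(), grid, pixels) for one image, pass the result to encode, and transmit the returned bytes; a receiver holding the same grid definition would call decode to obtain the reconstructed pixel values. The image never enters the bitstream directly; only the quantized weights do.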