US 12,081,759 B2
Image compression and decoding, video compression and decoding: methods and systems
Chri Besenbruch, London (GB); Ciro Cursio, London (GB); Christopher Finlay, London (GB); Vira Koshkina, London (GB); Alexander Lytchier, London (GB); Jan Xu, London (GB); and Arsalan Zafar, London (GB)
Assigned to DEEP RENDER LTD., London (GB)
Filed by DEEP RENDER LTD., London (GB)
Filed on Aug. 4, 2023, as Appl. No. 18/230,240.
Application 18/230,240 is a continuation of application No. 18/055,666, filed on Nov. 15, 2022.
Application 18/055,666 is a continuation of application No. 17/740,716, filed on May 10, 2022, granted, now U.S. Pat. No. 11,677,948.
Application 17/740,716 is a continuation of application No. PCT/GB2021/051041, filed on Apr. 29, 2021.
Claims priority of provisional application 63/053,807, filed on Jul. 20, 2020.
Claims priority of provisional application 63/017,295, filed on Apr. 29, 2020.
Claims priority of application No. 2006275.8 (GB), filed on Apr. 29, 2020; application No. 2008241.8 (GB), filed on Jun. 2, 2020; application No. 2011176.1 (GB), filed on Jul. 20, 2020; application No. 2012461.6 (GB), filed on Aug. 11, 2020; application No. 2012462.4 (GB), filed on Aug. 11, 2020; application No. 2012463.2 (GB), filed on Aug. 11, 2020; application No. 2012465.7 (GB), filed on Aug. 11, 2020; application No. 2012467.3 (GB), filed on Aug. 11, 2020; application No. 2012468.1 (GB), filed on Aug. 11, 2020; application No. 2012469.9 (GB), filed on Aug. 11, 2020; application No. 2016824.1 (GB), filed on Oct. 23, 2020; and application No. 2019531.9 (GB), filed on Dec. 10, 2020.
Prior Publication US 2023/0388499 A1, Nov. 30, 2023
Int. Cl. H04N 19/126 (2014.01); G06N 3/045 (2023.01); G06N 3/084 (2023.01); G06T 3/4046 (2024.01); G06T 9/00 (2006.01); G06V 10/774 (2022.01); H04N 19/13 (2014.01)

CPC H04N 19/126 (2014.11) [G06N 3/045 (2023.01); G06N 3/084 (2013.01); G06T 3/4046 (2013.01); G06T 9/002 (2013.01); G06V 10/774 (2022.01); H04N 19/13 (2014.11)]

16 Claims

1. A computer implemented method of training a first neural network and a second neural network, the neural networks being for use in lossy image or video compression, transmission and decoding, the method including the steps of:

(i) receiving an input training image;

(ii) encoding the input training image using the first neural network, to produce a latent representation;

(iii) quantizing the latent representation to produce a quantized latent;

(iv) using the second neural network to produce an output image from the quantized latent, wherein the output image is an approximation of the input training image;

(v) evaluating a loss function based on differences between the output image and the input training image;

(vi) evaluating a gradient of the loss function;

(vii) back-propagating the gradient of the loss function through the second neural network and through the first neural network, to update weights of the second neural network and of the first neural network; and

(viii) repeating steps (i) to (vii) using a set of training images, to produce a trained first neural network and a trained second neural network;

wherein the loss function comprises a rate loss term, a distortion loss term, and a mutual information loss term based on mutual information between the input training image and the output image, wherein the mutual information loss term models the input training image and noise as zero-mean independent Gaussians.
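
The loss recited in the wherein clause can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not give a formula, so the sketch assumes the output image equals the input plus additive noise, with both modelled as zero-mean independent Gaussians, in which case the mutual information has the closed form 0.5 * ln(1 + var_x / var_n). The function names, the weights `lam` and `gamma`, the use of mean squared error as the distortion term, and the choice to subtract the mutual information term are all hypothetical. The rate term is passed in as a number because the entropy model that would produce it is outside the claim text.

```python
import math


def gaussian_mutual_information(x, y, eps=1e-12):
    """Estimate I(x; y) in nats under an ASSUMED additive-noise model.

    Assumption (not from the claim text): y = x + n, where x and n are
    zero-mean, independent Gaussians, so I = 0.5 * ln(1 + var_x / var_n).
    Inputs are flat sequences of pixel values already centred on zero.
    """
    count = len(x)
    var_x = sum(v * v for v in x) / count                      # E[x^2], zero mean
    var_n = sum((b - a) ** 2 for a, b in zip(x, y)) / count    # noise = y - x
    return 0.5 * math.log(1.0 + var_x / (var_n + eps))


def mse(x, y):
    """Mean squared error, used here as a stand-in distortion term."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def total_loss(x, y, rate, lam=1.0, gamma=0.1):
    """Hypothetical combination: rate + lam * distortion - gamma * MI.

    The MI term is subtracted so that minimising the loss rewards a higher
    mutual information between input and output (an assumed sign convention).
    """
    return rate + lam * mse(x, y) - gamma * gaussian_mutual_information(x, y)


# Toy zero-centred "image" and two candidate reconstructions.
x = [1.0, -1.0, 2.0, -2.0]
y_far = [1.5, -1.5, 2.5, -2.5]
y_close = [1.1, -1.1, 2.1, -2.1]

loss_far = total_loss(x, y_far, rate=3.0)
loss_close = total_loss(x, y_close, rate=3.0)
```

At equal rate, the closer reconstruction has lower distortion and higher estimated mutual information, so `loss_close` is smaller than `loss_far`. In an actual training run the same scalar would be computed on tensors by an automatic differentiation framework so that its gradient can be back-propagated through both networks as in steps (vi) and (vii).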