US 12,327,385 B2
End-to-end deep generative network for low bitrate image coding
Yifei Pei, Santa Clara, CA (US); Ying Liu, Santa Clara, CA (US); Nam Ling, Santa Clara, CA (US); Yongxiong Ren, San Jose, CA (US); and Lingzhi Liu, San Jose, CA (US)
Assigned to SANTA CLARA UNIVERSITY, Santa Clara, CA (US); and KWAI INC., Palo Alto, CA (US)
Filed by SANTA CLARA UNIVERSITY, Santa Clara, CA (US); and KWAI INC., Palo Alto, CA (US)
Filed on Oct. 19, 2022, as Appl. No. 17/969,551.
Prior Publication US 2024/0185473 A1, Jun. 6, 2024
Int. Cl. G06T 9/00 (2006.01); G06N 3/0455 (2023.01); G06N 3/0475 (2023.01); G06N 3/094 (2023.01); H04N 19/124 (2014.01)
CPC G06T 9/002 (2013.01) [G06N 3/0455 (2023.01); G06N 3/0475 (2023.01); G06N 3/094 (2023.01); H04N 19/124 (2014.11)]
18 Claims
1. A neural network system implemented by one or more computers for compressing an image, comprising:
a generator comprising an encoder, an entropy estimator, and a decoder,
wherein the encoder receives an input image and generates an encoder output, a plurality of quantized feature entries are obtained based on the encoder output outputted at a last encoder block in the encoder, the entropy estimator receives the plurality of quantized feature entries and calculates an entropy loss based on the plurality of quantized feature entries, and the decoder receives the plurality of quantized feature entries and generates a reconstructed image; and
a discriminator that determines whether the reconstructed image is different from the input image based on a discriminator loss,
wherein a generator loss comprises the entropy loss and a combined content loss, and the combined content loss comprises a mean absolute error (MAE) and a multi-scale structural similarity index measure (MS-SSIM) between the reconstructed image and the input image, and
wherein the generator is configured to determine whether content of the reconstructed image matches content of the input image based on the generator loss.
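The loss composition recited in claim 1 can be illustrated with a minimal sketch, which is not the patented implementation. Everything in it beyond the claim's wording is an assumption made for illustration: the function names, the weights `lam` and `alpha`, the use of `1 - MS-SSIM` as the similarity penalty, the equal per-scale weights, and the restriction to 2-D grayscale arrays scaled to [0, 1]. The MS-SSIM here is simplified to whole-image statistics at each scale, whereas reference implementations use local Gaussian windows. The entropy term is an empirical histogram entropy standing in for the claimed entropy estimator, whose internals the claim does not specify. Any adversarial term tied to the discriminator is omitted.

```python
import numpy as np


def mae(x, y):
    """Mean absolute error between two images of equal shape."""
    return float(np.mean(np.abs(x - y)))


def downsample2(img):
    """2x2 average pooling on a 2-D image (an odd trailing row/column is cropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def ssim_terms(x, y, data_range=1.0):
    """Luminance and contrast-structure terms from whole-image statistics.

    Simplification: reference MS-SSIM computes these over local windows.
    """
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    lum = (2 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)
    cs = (2 * cov + c2) / (x.var() + y.var() + c2)
    return lum, cs


def ms_ssim_simplified(x, y, scales=3, data_range=1.0):
    """Product of contrast-structure terms over all scales, times the
    luminance term at the coarsest scale (equal weights are an assumption)."""
    weight = 1.0 / scales
    score = 1.0
    for s in range(scales):
        lum, cs = ssim_terms(x, y, data_range)
        score *= max(cs, 0.0) ** weight  # clamp so the fractional power is defined
        if s == scales - 1:
            score *= max(lum, 0.0) ** weight
        else:
            x, y = downsample2(x), downsample2(y)
    return float(score)


def combined_content_loss(x, y, alpha=0.5):
    """MAE plus an MS-SSIM penalty; alpha is a hypothetical mixing weight."""
    return alpha * mae(x, y) + (1.0 - alpha) * (1.0 - ms_ssim_simplified(x, y))


def empirical_entropy_bits(q):
    """Histogram entropy (bits per entry) of the quantized feature entries.

    Stand-in for the claimed entropy estimator, not its actual design.
    """
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def generator_loss(x, x_hat, q, lam=1.0, alpha=0.5):
    """Entropy loss plus combined content loss; lam is a hypothetical weight."""
    return lam * empirical_entropy_bits(q) + combined_content_loss(x, x_hat, alpha)
```

For a perfect reconstruction the content term vanishes and only the entropy term remains, which reflects the rate-distortion trade-off the claim describes: the entropy loss pushes toward fewer bits, while the MAE and MS-SSIM terms push the reconstructed image toward the input image.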