US 12,335,540 B2
Image processing, network training and encoding methods, apparatus, device, and storage medium
Lu Yu, Dongguan (CN); Bingjie Zhu, Dongguan (CN); and Zhenyu Dai, Dongguan (CN)
Assigned to ZHEJIANG UNIVERSITY, Hangzhou (CN); and GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD., Dongguan (CN)
Filed by ZHEJIANG UNIVERSITY, Hangzhou (CN); and GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD., Dongguan (CN)
Filed on Oct. 17, 2023, as Appl. No. 18/380,987.
Application 18/380,987 is a continuation of application No. PCT/CN2021/098413, filed on Jun. 4, 2021.
Claims priority of application No. 202110415028.1 (CN), filed on Apr. 17, 2021.
Prior Publication US 2024/0064338 A1, Feb. 22, 2024
Int. Cl. H04N 19/91 (2014.01); H04N 19/124 (2014.01); H04N 19/184 (2014.01)
CPC H04N 19/91 (2014.11) [H04N 19/124 (2014.11); H04N 19/184 (2014.11)]
20 Claims
 
1. An image processing method, comprising:
receiving an encoded bitstream from a trained encoder;
decoding, by a trained decoder, the encoded bitstream to obtain a decoded reconstructed image; and
processing, by a trained task execution network, the decoded reconstructed image or the decoded reconstructed image subjected to image post-processing, to perform a machine learning task,
wherein the trained encoder and the trained decoder belong to a trained codec network, and a training process of the trained codec network and the trained task execution network comprises:
performing, based on a joint loss function of the codec network and the task execution network and a preset sample training set, joint training of the codec network and the task execution network until a value of the joint loss function meets a convergence condition, to obtain the trained codec network and the trained task execution network,
wherein the task execution network performs the machine learning task based on the decoded reconstructed image output from the codec network, and the joint loss function comprises a loss function of the task execution network and a function representing a bitrate of a feature bitstream of an input image of the codec network.
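As an illustration only, the joint loss recited in claim 1 could be sketched as a task loss plus a weighted bitrate term. The claim states only that the joint loss comprises both terms, so everything else here is an assumption: the weighted-sum form, the cross-entropy task loss, the entropy-based bit estimate, the per-pixel normalization, the plateau-style convergence check, and all names (`estimated_bits`, `joint_loss`, `rate_weight`, `has_converged`) are hypothetical.

```python
import math
from typing import Sequence


def estimated_bits(symbol_probs: Sequence[float]) -> float:
    """Estimate the feature bitstream length in bits.

    Assumption: an entropy model assigns a probability to each coded
    symbol, and an ideal entropy coder spends -log2(p) bits on it.
    """
    return -sum(math.log2(p) for p in symbol_probs)


def task_cross_entropy(class_probs: Sequence[float], label: int) -> float:
    """Stand-in task loss (assumed to be classification cross-entropy)."""
    return -math.log(class_probs[label])


def joint_loss(
    class_probs: Sequence[float],
    label: int,
    symbol_probs: Sequence[float],
    num_pixels: int,
    rate_weight: float = 0.01,
) -> float:
    """Task loss plus a weighted bits-per-pixel term (assumed weighted sum)."""
    task_term = task_cross_entropy(class_probs, label)
    rate_term = estimated_bits(symbol_probs) / num_pixels
    return task_term + rate_weight * rate_term


def has_converged(loss_history: Sequence[float], tol: float = 1e-4) -> bool:
    """One possible convergence condition: the loss has stopped changing."""
    return len(loss_history) >= 2 and abs(loss_history[-2] - loss_history[-1]) < tol


if __name__ == "__main__":
    loss = joint_loss(
        class_probs=[0.7, 0.2, 0.1],   # task network output on the decoded image
        label=0,
        symbol_probs=[0.5, 0.25, 0.25],  # entropy-model probabilities of coded symbols
        num_pixels=4,
    )
    print(f"joint loss: {loss:.4f}")
```

In a sketch like this, a larger `rate_weight` would push joint training toward shorter bitstreams at the cost of task accuracy, and a smaller one would do the opposite. Because both terms sit in a single loss, the gradient of the task loss reaches the codec network through the decoded reconstructed image.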