US 11,790,486 B2
Image processing method and apparatus
Junyong Noh, Daejeon (KR); Sanghun Park, Daejeon (KR); and Kwanggyoon Seo, Daejeon (KR)
Assigned to KOREA ADVANCED INSTITUTE OF SCIENCE AND TECHNOLOGY, Daejeon (KR)
Filed by KOREA ADVANCED INSTITUTE OF SCIENCE AND TECHNOLOGY, Daejeon (KR)
Filed on Aug. 12, 2021, as Appl. No. 17/444,919.
Prior Publication US 2022/0164921 A1, May 26, 2022
Int. Cl. G06K 9/00 (2022.01); G06T 3/40 (2006.01); G06T 5/50 (2006.01); G06N 3/088 (2023.01); G06N 3/045 (2023.01)
CPC G06T 3/4007 (2013.01) [G06N 3/045 (2023.01); G06N 3/088 (2013.01); G06T 3/4046 (2013.01); G06T 5/50 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)] 15 Claims
1. An image processing method, comprising:
for each of a plurality of input images, extracting a content latent code of an input image based on a content encoder configured to extract a feature of a content of an image; and
extracting a style latent code of the input image based on a style encoder configured to extract a feature of a style of an image;
obtaining a content feature vector by calculating a weighted sum of content latent codes extracted from the input images based on a morphing control parameter;
obtaining a style feature vector by calculating a weighted sum of style latent codes extracted from the input images based on the morphing control parameter; and
generating a morphing image based on the content feature vector, the style feature vector, and a decoder configured to generate an image from an embedding vector;
wherein a morphing generator including the content encoder, the style encoder and the decoder is trained based on at least one of:
an adversarial loss associated with a discrimination between an image comprised in a training data and an output morphing image obtained from the morphing generator; or
a pixel-wise reconstruction loss associated with a difference between pixels of the image comprised in the training data and pixels of the output morphing image,
wherein, the training data includes a first image, a second image, and a ground truth morphing image,
wherein the ground truth morphing image is generated by linearly interpolating a first point corresponding to the first image sampled in a latent space by a pretrained model and a second point corresponding to the second image sampled in the latent space by the pretrained model.
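The interpolation steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the encoders, decoder, and pretrained model are omitted, latent codes are plain lists of floats, and the function names (weighted_sum, morph_features, lerp) are hypothetical. For two input images the sketch assumes the morphing control parameter alpha yields the weights (1 - alpha, alpha); the claim only states that the weighted sums are based on the parameter.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def weighted_sum(codes: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Return the element-wise weighted sum of equally sized latent codes."""
    if not codes or len(codes) != len(weights):
        raise ValueError("one weight per latent code is required")
    dim = len(codes[0])
    if any(len(code) != dim for code in codes):
        raise ValueError("all latent codes must have the same length")
    return [sum(w * code[i] for code, w in zip(codes, weights)) for i in range(dim)]


def morph_features(
    content_codes: Sequence[Vector],
    style_codes: Sequence[Vector],
    alpha: float,
) -> Tuple[Vector, Vector]:
    """Blend content and style codes of two images with one shared parameter.

    Assumption: the weights are (1 - alpha, alpha). The claim does not fix
    this mapping from the morphing control parameter to the weights.
    """
    weights = [1.0 - alpha, alpha]
    content_feature = weighted_sum(content_codes, weights)
    style_feature = weighted_sum(style_codes, weights)
    return content_feature, style_feature


def lerp(first_point: Vector, second_point: Vector, alpha: float) -> Vector:
    """Linearly interpolate two latent-space points.

    In the claim, decoding such a point with the pretrained model would give
    the ground truth morphing image; decoding is not shown here.
    """
    return weighted_sum([first_point, second_point], [1.0 - alpha, alpha])
```

In a full pipeline the two vectors returned by morph_features would be passed to the decoder to generate the morphing image, and the point returned by lerp would be decoded by the pretrained model to produce the training target.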