US 12,406,343 B1
Image inpainting method and system guided by image structure and texture information
Gang Yang, Nanchang (CN); Lizhen Dai, Nanchang (CN); Hailong Yang, Nanchang (CN); Jie Sheng, Nanchang (CN); Hui Yang, Nanchang (CN); Rongxiu Lu, Nanchang (CN); and Fangping Xu, Nanchang (CN)
Assigned to East China Jiaotong University, Nanchang (CN)
Filed by East China Jiaotong University, Nanchang (CN)
Filed on Jan. 17, 2025, as Appl. No. 19/027,120.
Claims priority of application No. 202411203742.4 (CN), filed on Aug. 30, 2024.
Int. Cl. G06T 5/77 (2024.01); G06N 3/0455 (2023.01); G06N 3/0475 (2023.01); G06T 5/60 (2024.01); G06T 7/13 (2017.01)
CPC G06T 5/77 (2024.01) [G06N 3/0455 (2023.01); G06N 3/0475 (2023.01); G06T 5/60 (2024.01); G06T 7/13 (2017.01); G06T 2207/20016 (2013.01); G06T 2207/20084 (2013.01)] 13 Claims
OG exemplary drawing
 
1. An image inpainting method guided by an image structure and texture information, comprising:
constructing an image inpainting model with an interaction ability between the image structure and the texture information, wherein a structure of the image inpainting model comprises a generator and a discriminator, the generator comprises a structure and texture interaction module (STIM), a gated interaction unit (GIU), and a multi-view local reconstruction network (MLRN), the STIM comprises a texture information encoder upper branch, a structure information encoder lower branch, and an intermediate encoder, output results of the texture information encoder upper branch and the structure information encoder lower branch each have a remapping relationship with an output result of the intermediate encoder, and an expression of remapping is as follows:

OG Complex Work Unit Math
wherein θ(·) is a mapping function, (MTn, MSn) is an input feature-value pair, lg(·) is a logarithmic function, γ is a learnable hyperparameter, μ is a preset minimal value, MTn is a first mask feature segment, MSn is a second mask feature segment, and X is a variable;
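The Gazette renders the remapping expression itself only as a complex-work-unit placeholder, so its exact form is not recoverable from this text. Purely to illustrate how the listed symbols could fit together (the log-based form below is an assumption, not the patented expression), a remapping θ(X) might scale a logarithm of a mask feature segment by the learnable γ, with μ guarding against lg(0):

```python
import numpy as np

def remap(mask_segment, gamma=1.0, mu=1e-6):
    """Hypothetical remapping theta(X): a logarithmic rescaling of a mask
    feature segment. gamma is the learnable hyperparameter and mu a small
    preset constant preventing lg(0). The true patented expression is not
    reproduced in the Gazette text; this form is illustrative only."""
    x = np.asarray(mask_segment, dtype=np.float64)
    return gamma * np.log10(np.maximum(x, 0.0) + mu)

# Sigmoid outputs lie in (0, 1), so the remapped values here are negative.
seg = np.array([0.1, 0.5, 0.9])
out = remap(seg)
```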
wherein the damaged image sample, the mask image sample, and the edge structure image sample are used as inputs to the STIM, and the STIM outputs a feature result, specifically comprising:
inputting the mask image sample into the intermediate encoder, and outputting, by the intermediate encoder, the first mask feature segment and the second mask feature segment after a convolution operation, a normalization operation, and a Sigmoid activation function operation;
jointly inputting a preprocessed image sample combination into the texture information encoder upper branch, outputting, by the texture information encoder upper branch, a texture feature corresponding to the image sample combination, adding the texture feature to a remapped first mask feature segment in a channel dimension after performing a convolution operation, a normalization operation, and a LeakyReLU activation function operation on the texture feature, and then multiplying a result thereof in the channel dimension with a structure feature output by the structure information encoder lower branch, to obtain a first feature result, wherein the image sample combination comprises the damaged image sample, the mask image sample, and the edge structure image sample; and
jointly inputting the preprocessed image sample combination into the structure information encoder lower branch, outputting, by the structure information encoder lower branch, the structure feature corresponding to the image sample combination, adding the structure feature to a remapped second mask feature segment in the channel dimension after performing a convolution operation, a normalization operation, and a LeakyReLU activation function operation on the structure feature, and then multiplying a result thereof in the channel dimension with the texture feature output by the texture information encoder upper branch, to obtain a second feature result;
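Read together, the three steps above describe one forward pass through the STIM. A minimal NumPy sketch, assuming (C, H, W) feature maps, a 1×1 convolution as a stand-in for the claimed convolution operations, and per-channel standardization as the normalization (none of these choices are fixed by the claim, and the remapping θ is elided because its exact form is not reproduced in the Gazette text):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def conv_norm(feat, weights, eps=1e-5):
    """1x1 convolution (channel-mixing matmul) followed by per-channel
    standardization; a stand-in for the claimed conv + normalization."""
    out = np.einsum('oc,chw->ohw', weights, feat)
    mean = out.mean(axis=(1, 2), keepdims=True)
    std = out.std(axis=(1, 2), keepdims=True)
    return (out - mean) / (std + eps)

def stim_forward(texture_in, structure_in, mask_in, w_t, w_s, w_m):
    """Sketch of the STIM forward pass on (C, H, W) inputs.

    Intermediate encoder: conv + norm + Sigmoid, then a channel split
    into the first (M_T) and second (M_S) mask feature segments, used
    here directly in place of their remapped versions."""
    m = sigmoid(conv_norm(mask_in, w_m))
    half = m.shape[0] // 2
    m_t, m_s = m[:half], m[half:]                 # first / second mask feature segments

    t = leaky_relu(conv_norm(texture_in, w_t))    # texture branch: conv + norm + LeakyReLU
    s = leaky_relu(conv_norm(structure_in, w_s))  # structure branch: same operations

    first = (t + m_t) * s    # add mask segment, multiply by structure feature
    second = (s + m_s) * t   # symmetric interaction for the structure branch
    return first, second

rng = np.random.default_rng(0)
tex = rng.standard_normal((3, 8, 8))
struc = rng.standard_normal((3, 8, 8))
mask = rng.standard_normal((3, 8, 8))
w_t = rng.standard_normal((4, 3))
w_s = rng.standard_normal((4, 3))
w_m = rng.standard_normal((8, 3))   # twice the branch width, so the split halves match
first, second = stim_forward(tex, struc, mask, w_t, w_s, w_m)
```

The add-then-multiply pattern is what gives each branch's output a dependence on both the mask segments and the opposite branch's features, which is the interaction the claim describes.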
preprocessing a damaged image sample, to obtain a mask image sample and an edge structure image sample;
inputting the damaged image sample, the mask image sample, and the edge structure image sample into the image inpainting model, training the image inpainting model, and generating weight information of the image inpainting model, to obtain a target image inpainting model; and
obtaining a damaged image to be restored, preprocessing the damaged image to be restored, to obtain a mask image and an edge structure image, separately inputting the damaged image to be restored, the mask image, and the edge structure image into the target image inpainting model, and outputting, by the target image inpainting model, a restored image.
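The inference-time preprocessing in the final step can be sketched as follows, assuming damaged pixels are encoded as NaN and the edge structure image comes from a simple gradient-magnitude threshold (the claim fixes neither a hole encoding nor a particular edge detector); the trained target image inpainting model itself is represented only by its three inputs:

```python
import numpy as np

def preprocess(damaged, edge_threshold=0.1):
    """Derive the mask image and edge structure image from a damaged
    image whose missing pixels are NaN. The thresholded finite-difference
    gradient is an illustrative stand-in for whichever edge detector the
    patented system actually uses."""
    mask = np.isnan(damaged).astype(np.float64)    # 1 where the image is damaged
    filled = np.nan_to_num(damaged, nan=0.0)       # fill holes before differencing
    gy, gx = np.gradient(filled)                   # finite-difference gradients
    edges = (np.hypot(gy, gx) > edge_threshold).astype(np.float64)
    return mask, edges

img = np.full((8, 8), 0.5)
img[2:4, 2:4] = np.nan                             # a 2x2 damaged region
mask, edges = preprocess(img)
# (img, mask, edges) would then be the three inputs to the target model.
```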