US 12,067,731 B2
Image foreground segmentation algorithm based on edge knowledge transformation
Zunlei Feng, Hangzhou (CN); Lechao Cheng, Hangzhou (CN); Jie Song, Hangzhou (CN); Li Sun, Hangzhou (CN); and Mingli Song, Hangzhou (CN)
Assigned to ZHEJIANG UNIVERSITY, Hangzhou (CN)
Filed by ZHEJIANG UNIVERSITY, Zhejiang (CN)
Filed on Jan. 28, 2022, as Appl. No. 17/586,806.
Application 17/586,806 is a continuation of application No. PCT/CN2021/101127, filed on Jun. 21, 2021.
Claims priority of application No. 202010794931.9 (CN), filed on Aug. 10, 2020.
Prior Publication US 2022/0148194 A1, May 12, 2022
Int. Cl. G06T 7/194 (2017.01); G06T 3/02 (2024.01); G06T 7/12 (2017.01); G06T 7/174 (2017.01)
CPC G06T 7/194 (2017.01) [G06T 3/02 (2024.01); G06T 7/12 (2017.01); G06T 7/174 (2017.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)] 3 Claims
OG exemplary drawing
 
1. An image foreground segmentation algorithm based on edge knowledge transformation, comprising the following steps:
1) construction of an image segmentation framework with edge self-supervised mechanism;
based on selection of a DeepLabV3+ network as a main segmentation network, applying an affine transformation A to an original image I of a target category to obtain a transformed image A*I; inputting both the original image I and the transformed image A*I into the main segmentation network to obtain corresponding predicted segmentation results F(I) and F(A*I), and transforming the segmentation result F(I) corresponding to the original image by the same affine transformation A into A*F(I); obtaining corresponding edge masks m and m′ by subtracting an eroded predicted segmentation result from a dilated predicted segmentation result, for the transformed predicted segmentation result A*F(I) corresponding to the original image and the segmentation result F(A*I) corresponding to the transformed image, respectively; constraining the edge segmentation result m*A*F(I) corresponding to the original image to be consistent with the edge segmentation result m′*F(A*I) corresponding to the affine-transformed image by using an L2 norm, |m*A*F(I)−m′*F(A*I)|2, so that self-supervised information is formed to strengthen the segmentation consistency of the main segmentation network;
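The edge self-supervision in step 1) can be illustrated with a toy numpy sketch. This is not the patented implementation (which uses a DeepLabV3+ network); here segmentation results are plain binary arrays, morphological dilation/erosion are implemented with a square max filter, and a horizontal flip stands in for a general affine transformation A.

```python
import numpy as np

def dilate(mask, r=1):
    """Binary dilation with a (2r+1)x(2r+1) square kernel (max filter)."""
    padded = np.pad(mask, r, mode='edge')
    h, w = mask.shape
    out = np.zeros_like(mask)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w])
    return out

def erode(mask, r=1):
    """Binary erosion, expressed as the complement of dilating the complement."""
    return 1 - dilate(1 - mask, r)

def edge_mask(seg, r=1):
    """Edge band of a segmentation: dilated result minus eroded result."""
    return dilate(seg, r) - erode(seg, r)

def edge_consistency_loss(seg_orig, seg_aug, affine):
    """L2 consistency between the edge regions of A*F(I) and F(A*I).

    `affine` is any invertible spatial transform; a flip stands in for
    a general affine transformation A in this sketch.
    """
    warped = affine(seg_orig)       # A * F(I)
    m = edge_mask(warped)           # edge mask m of A*F(I)
    mp = edge_mask(seg_aug)         # edge mask m' of F(A*I)
    diff = m * warped - mp * seg_aug
    return float(np.sum(diff ** 2))

# toy example: a symmetric square foreground, horizontal flip as A
seg = np.zeros((8, 8)); seg[2:6, 2:6] = 1.0
flip = lambda x: x[:, ::-1]
loss = edge_consistency_loss(seg, flip(seg), flip)  # consistent -> 0.0
```

When the network is perfectly equivariant under A, the loss vanishes; any disagreement between the two edge bands produces a positive penalty that serves as the self-supervised signal.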
2) construction of an inner edge and outer edge discriminator;
in order to realize the transformation of edge knowledge, firstly constructing a binary outer edge discriminator Dout, which is a general binary convolutional neural network; obtaining a foreground object by using a corresponding label mo for an input image I′ of a non-target category; using the outer edge discriminator Dout to judge whether an edge of the foreground object contains background features, the outer edge discriminator Dout judging a formed triplet {I′, mo, mo*I′} to be true; then constructing a binary inner edge discriminator Din, and obtaining a background part (1−mo)*I′ by using an inverted label 1−mo of the corresponding foreground object for the input image I′ of the non-target category; using the inner edge discriminator Din to judge whether an edge of the background part contains foreground object features, the inner edge discriminator Din judging a formed triplet {I′, 1−mo, (1−mo)*I′} to be true;
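The triplets fed to the two discriminators in step 2) can be assembled as stacked multi-channel arrays. A numpy sketch of the triplet construction only; the discriminators themselves are binary convolutional networks in the patent and are not reproduced here.

```python
import numpy as np

def outer_edge_triplet(img, mask):
    """Real triplet {I', m_o, m_o * I'} presented to D_out as one 3-channel tensor."""
    return np.stack([img, mask, mask * img], axis=0)

def inner_edge_triplet(img, mask):
    """Real triplet {I', 1 - m_o, (1 - m_o) * I'} presented to D_in."""
    inv = 1.0 - mask
    return np.stack([img, inv, inv * img], axis=0)

# toy grayscale image of a non-target category with its foreground label
rng = np.random.default_rng(0)
img = rng.random((16, 16))
mask = np.zeros((16, 16)); mask[4:12, 4:12] = 1.0

t_out = outer_edge_triplet(img, mask)  # judged true by D_out
t_in = inner_edge_triplet(img, mask)   # judged true by D_in
```

Stacking image, mask, and masked crop into one tensor lets each discriminator see the edge of the crop in the context of the full image, which is what allows it to detect leaked background (D_out) or leaked foreground (D_in) features.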
3) generation of pseudo-segmented triplet data;
in order to strengthen the ability of the inner and outer edge discriminators Din and Dout to identify whether a triplet contains features outside the edge of the object or features inside the edge of the object, applying an expansion operation Γ with a kernel radius r to the label mo and the inverted label 1−mo corresponding to the image I′ of the non-target segmentation category in step 2) to obtain processed masks Γ(mo) and Γ(1−mo), forming an outer edge pseudo-segmented triplet {I′, Γ(mo), Γ(mo)*I′} and an inner edge pseudo-segmented triplet {I′, Γ(1−mo), Γ(1−mo)*I′}, and constraining the outer and inner edge discriminators Dout and Din to discriminate the outer edge pseudo-triplet and the inner edge pseudo-triplet, respectively, to be false, so that the identification ability of the inner and outer edge discriminators is effectively strengthened;
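Step 3) can be sketched in numpy as follows. The expansion operation Γ is implemented as a square max filter of kernel radius r; dilating mo leaks background pixels across the object edge into the outer pseudo-triplet, and dilating 1−mo leaks foreground pixels into the inner pseudo-triplet, which is exactly why the discriminators are constrained to judge both false. A toy sketch, not the patented implementation.

```python
import numpy as np

def gamma_dilate(mask, r):
    """Expansion operation Γ with a (2r+1)x(2r+1) square kernel."""
    padded = np.pad(mask, r, mode='constant')
    h, w = mask.shape
    out = np.zeros_like(mask)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w])
    return out

def pseudo_triplets(img, mask, r=2):
    """Pseudo-segmented triplets built from dilated masks.

    The enlarged foreground mask Γ(m_o) crosses the true edge and picks
    up background; the enlarged background mask Γ(1 - m_o) picks up
    foreground. Both triplets are training negatives for D_out / D_in.
    """
    m_out = gamma_dilate(mask, r)          # Γ(m_o)
    m_in = gamma_dilate(1.0 - mask, r)     # Γ(1 - m_o)
    outer = np.stack([img, m_out, m_out * img], axis=0)
    inner = np.stack([img, m_in, m_in * img], axis=0)
    return outer, inner

rng = np.random.default_rng(0)
img = rng.random((16, 16))
mask = np.zeros((16, 16)); mask[4:12, 4:12] = 1.0
outer, inner = pseudo_triplets(img, mask, r=2)
```

The kernel radius r controls how far the pseudo mask overshoots the true edge, i.e. how "hard" the negative examples are.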
4) inner and outer edge adversarial foreground segmentation guided based on labeled samples of the target category;
employing the existing edge segmentation knowledge of open-source labeled image data to realize foreground segmentation of a target image, in order to realize foreground segmentation guided by labeled samples of the target category; realizing the training of a target category segmentation network by a supervised loss for the labeled images of the target category; obtaining real inner and outer edge triplets and pseudo-segmented triplets through step 2) and step 3) for open-source images of a non-target category; obtaining predicted segmentation results of the target image through the segmentation network to form a triplet of the predicted segmentation results; transforming the edge knowledge of the open-source labeled images into the target category image segmentation through adversarial training between the segmentation network and the inner and outer edge discriminators, and finally realizing the image foreground segmentation guided by target samples.
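The overall training objective of step 4) can be sketched schematically. Plain numpy functions stand in for the DeepLabV3+ segmentation network and the binary CNN discriminators of the patent; the loss names and the toy threshold parameter `theta` are illustrative assumptions, but the structure (supervised term on labeled target images, plus generator/discriminator adversarial terms over edge triplets) follows the claim.

```python
import numpy as np

def seg_net(img, theta):
    """Toy stand-in for the segmentation network: threshold at theta."""
    return (img > theta).astype(float)

def disc(triplet):
    """Toy stand-in for an edge discriminator: sigmoid of the triplet mean."""
    return 1.0 / (1.0 + np.exp(-triplet.mean()))

def generator_loss(img_t, label_t, theta):
    """Supervised loss on a labeled target-category image plus adversarial
    terms pushing the predicted triplets to look 'real' to D_out and D_in."""
    pred = seg_net(img_t, theta)
    l_sup = np.mean((pred - label_t) ** 2)                     # supervised term
    trip_out = np.stack([img_t, pred, pred * img_t])           # predicted outer triplet
    trip_in = np.stack([img_t, 1 - pred, (1 - pred) * img_t])  # predicted inner triplet
    l_adv = -np.log(disc(trip_out) + 1e-8) - np.log(disc(trip_in) + 1e-8)
    return l_sup + l_adv

def discriminator_loss(img_o, mask_o, img_t, theta):
    """Real triplets from an open-source labeled non-target image vs.
    predicted triplets from the target image."""
    real = np.stack([img_o, mask_o, mask_o * img_o])
    pred = seg_net(img_t, theta)
    fake = np.stack([img_t, pred, pred * img_t])
    return -np.log(disc(real) + 1e-8) - np.log(1 - disc(fake) + 1e-8)

rng = np.random.default_rng(0)
img_t = rng.random((8, 8)); label_t = (rng.random((8, 8)) > 0.5).astype(float)
img_o = rng.random((8, 8)); mask_o = (rng.random((8, 8)) > 0.5).astype(float)
g_loss = generator_loss(img_t, label_t, theta=0.5)
d_loss = discriminator_loss(img_o, mask_o, img_t, theta=0.5)
```

Alternating minimization of these two losses is what transfers the edge knowledge of the open-source labels into the target-category segmentation network.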