US 12,299,974 B2
	Transmission line defect identification method based on saliency map and semantic-embedded feature pyramid
Qiang Yang, Hangzhou (CN); Chao Su, Hangzhou (CN); Yuan Cao, Hangzhou (CN); Di Jiang, Hangzhou (CN); Hao Xu, Hangzhou (CN); and Kaidi Qiu, Hangzhou (CN)
Assigned to Zhejiang University, Hangzhou (CN)
Filed by ZHEJIANG UNIVERSITY, Hangzhou (CN)
Filed on Dec. 14, 2022, as Appl. No. 18/081,368.
Prior Publication US 2023/0360390 A1, Nov. 9, 2023
Int. Cl. G06V 20/10 (2022.01); G06T 3/4053 (2024.01); G06T 5/20 (2006.01); G06V 10/46 (2022.01); G06V 10/764 (2022.01); G06V 10/77 (2022.01); G06V 10/774 (2022.01)

CPC G06V 20/176 (2022.01) [G06T 3/4053 (2013.01); G06T 5/20 (2013.01); G06V 10/464 (2022.01); G06V 10/764 (2022.01); G06V 10/7715 (2022.01); G06V 10/774 (2022.01); G06T 2207/20016 (2013.01); G06T 2207/20081 (2013.01)]

4 Claims

1. A transmission line defect identification method based on a saliency map and a semantic-embedded feature pyramid, the method comprising the following steps:

1) taking a target image of a transmission line as a dataset, labeling, based on whether the transmission line has a defect, the dataset as a normal set or a defect set, and classifying the dataset as a small target set or a non-small target set based on a size of the target image and a given threshold;

2) performing image super-resolution expansion on the small target set by using an Electric Line-Enhanced Super-Resolution Generative Adversarial Network (EL-ESRGAN) algorithm, combining the non-small target set and the small target set obtained after image super-resolution expansion, compressing a combined set based on a size of the small target set, and dividing the combined set into a training set and a test set;

3) generating the saliency map of an image in the training set by using a nested saliency detection network (U²-Net), ensuring integrity of a key region of a detection target by using a morphological expansion algorithm, generating a cutout region randomly for a part whose saliency score is less than a threshold, and padding a pixel randomly to form a data-augmented image set;

4) inputting a data-augmented image and its label into a deep semantic embedding (DSE)-based feature pyramid classification network to perform training to obtain a trained classifier; and

5) obtaining image data of an inspected target of the transmission line in real time, and taking the image data as an input of the trained classifier to output an identification result,

wherein performing the image super-resolution expansion on the small target further comprises:

defining loss functions of a generator and a discriminator of an EL-ESRGAN model, wherein formulas of the loss functions are as follows:

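$$L_G^{Ra} = -\mathbb{E}_{x_r}\left[\log\left(1 - D_{Ra}(x_r, x_f)\right)\right] - \mathbb{E}_{x_f}\left[\log\left(D_{Ra}(x_f, x_r)\right)\right]$$

$$L_D^{Ra} = -\mathbb{E}_{x_r}\left[\log\left(D_{Ra}(x_r, x_f)\right)\right] - \mathbb{E}_{x_f}\left[\log\left(1 - D_{Ra}(x_f, x_r)\right)\right]$$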

wherein $L_G^{Ra}$ represents a GAN loss function of the generator, $L_D^{Ra}$ represents a GAN loss function of the discriminator, $D_{Ra}(x_r, x_f)$ represents a probability that an authenticated image is more real than a false image, $D_{Ra}(x_f, x_r)$ represents a probability that the authenticated image is falser than a real image, $\mathbb{E}_{x_f}[\cdot]$ represents an averaging operation performed on all false data in a processing batch, $x_i$ represents a low-resolution image input into a GAN, $x_f$ represents an authenticated image that is generated by the GAN and determined to be false, and $x_r$ represents an authenticated image that is generated by the GAN and determined to be real;

training the generator of the EL-ESRGAN model by using the non-small target set of the transmission line to obtain a second-order degradation model, and using an L1 loss function, a perceptual loss function, and the GAN loss functions represented by $L_G^{Ra}$ and $L_D^{Ra}$ together to construct an overall loss function of the EL-ESRGAN, and performing training to obtain the EL-ESRGAN model; and

performing image super-resolution augmentation on the small target set of the transmission line by using the EL-ESRGAN model.
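
Illustrative note on step 1): the claim states only that the dataset is classified as a small target set or a non-small target set based on the image size and a given threshold. The sketch below is one possible reading, assuming that "size" means pixel area (width times height); the function name `split_by_size`, the tuple layout, and the example threshold are hypothetical and do not come from the claim.

```python
def split_by_size(samples, area_threshold):
    """Partition (name, width, height) records by pixel area.

    Assumption: an image is a "small target" when width * height is
    strictly below area_threshold; the claim does not define the measure.
    """
    small, non_small = [], []
    for name, width, height in samples:
        if width * height < area_threshold:
            small.append(name)
        else:
            non_small.append(name)
    return small, non_small


# Hypothetical usage: a 32x32 crop and a 512x512 crop, threshold of 96x96 pixels.
samples = [("insulator_a.jpg", 32, 32), ("insulator_b.jpg", 512, 512)]
small_set, non_small_set = split_by_size(samples, area_threshold=96 * 96)
```

Only the small set would then be passed to the super-resolution stage of step 2).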
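
Illustrative note on the two GAN loss functions: the claim gives the losses in terms of D_Ra but does not spell out how D_Ra is computed. The sketch below assumes the usual relativistic average form, D_Ra(x_r, x_f) = sigmoid(C(x_r) − mean of C(x_f)), where C is the raw discriminator output; that form, the function names, and the `eps` stabilizer are assumptions for illustration, not part of the claim.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mean(values):
    return sum(values) / len(values)


def relativistic_losses(real_logits, fake_logits, eps=1e-12):
    """Return (L_G^Ra, L_D^Ra) for one batch of raw discriminator outputs.

    Assumption: D_Ra(a, b) = sigmoid(C(a) - batch mean of C(b)).
    """
    mean_real = mean(real_logits)
    mean_fake = mean(fake_logits)
    d_rf = [sigmoid(c - mean_fake) for c in real_logits]  # D_Ra(x_r, x_f)
    d_fr = [sigmoid(c - mean_real) for c in fake_logits]  # D_Ra(x_f, x_r)

    # Generator loss: -E_xr[log(1 - D_Ra(x_r, x_f))] - E_xf[log(D_Ra(x_f, x_r))]
    loss_g = (-mean([math.log(1.0 - p + eps) for p in d_rf])
              - mean([math.log(p + eps) for p in d_fr]))
    # Discriminator loss: -E_xr[log(D_Ra(x_r, x_f))] - E_xf[log(1 - D_Ra(x_f, x_r))]
    loss_d = (-mean([math.log(p + eps) for p in d_rf])
              - mean([math.log(1.0 - p + eps) for p in d_fr]))
    return loss_g, loss_d
```

The two losses are mirror images: each term of the generator loss swaps D_Ra for 1 − D_Ra relative to the discriminator loss, so a discriminator that separates real from generated images well has a small L_D^Ra and leaves the generator with a large L_G^Ra.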
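
Illustrative note on step 3): the claim describes protecting the key region with a morphological expansion, generating a random cutout where the saliency score is below a threshold, and padding pixels randomly. The sketch below is a minimal pure-Python reading of that sequence on a grayscale image stored as nested lists. The square structuring element, the rejection-sampling loop, the 0 to 255 fill range, and all names and default values are assumptions; the saliency map is taken as given here rather than produced by U²-Net.

```python
import random


def dilate(mask, radius=1):
    """Binary dilation with a square (2 * radius + 1) structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = True
    return out


def saliency_cutout(image, saliency, threshold=0.5, size=2, radius=1,
                    tries=50, rng=None):
    """Fill one random size x size window with random pixel values.

    The window is accepted only if it avoids every pixel of the dilated
    salient region, so the key region of the target stays intact.
    Returns (augmented_image, (top, left)) or (copy_of_image, None).
    """
    if rng is None:
        rng = random.Random()
    h, w = len(image), len(image[0])
    salient = [[s >= threshold for s in row] for row in saliency]
    protected = dilate(salient, radius)
    out = [row[:] for row in image]  # leave the input untouched
    for _ in range(tries):
        top = rng.randrange(0, h - size + 1)
        left = rng.randrange(0, w - size + 1)
        window = [(y, x) for y in range(top, top + size)
                  for x in range(left, left + size)]
        if any(protected[y][x] for y, x in window):
            continue  # overlaps the protected region, sample again
        for y, x in window:
            out[y][x] = rng.randrange(256)  # random padding value
        return out, (top, left)
    return out, None
```

Rejection sampling keeps the sketch short; when the salient region covers most of the image, no window may be found within `tries` attempts and the image is returned unchanged.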