CPC G06V 10/26 (2022.01) [G06T 7/0004 (2013.01); G06V 10/141 (2022.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01); G06V 20/70 (2022.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30148 (2013.01)] | 5 Claims |
1. A multi-angle image semantic segmentation method for cadmium zinc telluride chips, comprising:
step 1) construction of an n+1 dataset:
acquiring single-camera multi-angle Cadmium Zinc Telluride (CZT) images I1, I2, . . . , In by using an acquisition system, manually selecting, from the n images, the image in which the monocrystal and heterocrystal defect boundaries are most clearly recognizable as an image to be labeled, and marking a monocrystal area, a heterocrystal area and a background area of the CZT image by using labeling software to generate pixel-level semantic labels;
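As a minimal sketch of what one sample of the n+1 dataset in step 1) might look like, the Python below pairs n multi-angle images with one pixel-level label map. The class name `CZTSample`, the label ids 0, 1 and 2, and the nested-list image representation are illustrative assumptions and are not taken from the claim.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical label ids; the claim names the three areas but not their encoding.
BACKGROUND, MONOCRYSTAL, HETEROCRYSTAL = 0, 1, 2

Image = List[List[float]]   # H x W grayscale image as nested lists (assumption)
LabelMap = List[List[int]]  # H x W map of class ids


@dataclass
class CZTSample:
    """One 'n+1' sample: n multi-angle images plus 1 label map."""
    images: List[Image]   # I1 ... In from the single camera
    labeled_index: int    # which of the n images was manually chosen for labeling
    labels: LabelMap      # pixel-level semantic labels for that image

    def __post_init__(self) -> None:
        if not 0 <= self.labeled_index < len(self.images):
            raise ValueError("labeled_index must point at one of the n images")
        h, w = len(self.labels), len(self.labels[0])
        for img in self.images:
            if len(img) != h or any(len(row) != w for row in img):
                raise ValueError("all images must match the label map size")
        allowed = {BACKGROUND, MONOCRYSTAL, HETEROCRYSTAL}
        if any(v not in allowed for row in self.labels for v in row):
            raise ValueError("labels must be one of the three class ids")


# Tiny 2x2 example with n = 3 angles.
sample = CZTSample(
    images=[[[0.1, 0.2], [0.3, 0.4]] for _ in range(3)],
    labeled_index=1,
    labels=[[BACKGROUND, MONOCRYSTAL], [MONOCRYSTAL, HETEROCRYSTAL]],
)
```

The validation in `__post_init__` only checks shapes and class ids; how the image to be labeled is chosen remains a manual step, as the claim states.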
step 2) construction and training of a Progressive Complementary Knowledge Aggregation network (PCKA):
network structure
the PCKA comprising a Pixel Aggregation Network (PAN) and a Latent Aggregation Network (LAN), feeding the multi-angle images I1, I2, . . . , In in step 1) into the PAN to obtain pixel-level aggregated images Apixel, then feeding the pixel-level aggregated images Apixel and the multi-angle images I1, I2, . . . , In into the LAN respectively through a Feature Embedding Module (FEM) to obtain a defect semantic graph based on a general latent expression Alatent;
a specific data processing process of the PAN comprising:
feeding the multi-angle images I1, I2, . . . , In into a U-shaped weight calibrator to obtain corresponding weights w1, w2, . . . , wn, and expressing an encoder and a decoder in the calibrator respectively as PE and PDE, this process being expressed as:
$$O_1 = PE_1(\mathrm{Cat}(I_1, I_2, \ldots, I_n))$$
$$O_i = PE_i(O_{i-1}), \quad i = 2, \ldots, 5 \tag{2}$$
where Cat(⋅) represents a splicing operation, PEi represents an ith encoding layer in the calibrator, and Oi represents an output of the ith encoding layer in the calibrator;
$$O'_1 = PDE_1(O_5)$$
$$O'_i = PDE_i(\mathrm{Cat}(O'_{i-1}, O_{5-(i-1)})), \quad i = 2, \ldots, 5 \tag{3}$$
where PDEi represents an ith decoding layer in the PAN and O′i represents an output of the ith decoding layer in the PAN;
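The wiring given by equations (2) and (3) can be traced with a short Python sketch. The function `run_calibrator` and the string-based stand-in layers are hypothetical helpers written only to show which tensors feed each layer; the internals of the encoding and decoding layers, and how the weights w1, . . . , wn are read out of the final decoder output, are not specified in the claim and are not modeled here.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def run_calibrator(
    images: Sequence[T],
    encoders: Sequence[Callable[[T], T]],
    decoders: Sequence[Callable[[T], T]],
    cat: Callable[..., T],
) -> List[T]:
    """Apply equations (2) and (3); return the decoder outputs O'_1..O'_5."""
    depth = len(encoders)
    # Equation (2): O_1 = PE_1(Cat(I_1..I_n)), O_i = PE_i(O_{i-1}).
    enc_out = [encoders[0](cat(*images))]
    for i in range(1, depth):
        enc_out.append(encoders[i](enc_out[i - 1]))
    # Equation (3): O'_1 = PDE_1(O_5), O'_i = PDE_i(Cat(O'_{i-1}, O_{5-(i-1)})).
    dec_out = [decoders[0](enc_out[depth - 1])]
    for i in range(2, depth + 1):            # 1-based layer index i
        skip = enc_out[depth - (i - 1) - 1]  # O_{depth-(i-1)}, shifted to 0-based
        dec_out.append(decoders[i - 1](cat(dec_out[i - 2], skip)))
    return dec_out


# Stand-in layers: each one logs its input and returns a symbolic name.
log: List[str] = []


def layer(name: str, out: str) -> Callable[[str], str]:
    def apply(x: str) -> str:
        log.append(f"{out} = {name}({x})")
        return out
    return apply


def cat(*parts: str) -> str:
    """Stand-in for the splicing operation Cat(.)."""
    return "Cat(" + ", ".join(parts) + ")"


encoders = [layer(f"PE{i}", f"O{i}") for i in range(1, 6)]
decoders = [layer(f"PDE{i}", f"O'{i}") for i in range(1, 6)]
outputs = run_calibrator(["I1", "I2", "I3"], encoders, decoders, cat)

for line in log:
    print(line)
```

Each decoding layer after the first receives the previous decoder output spliced with the encoder output of matching depth, so the second decoding layer consumes O4 and the fifth consumes O1.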
finally obtaining n weights w1, w2, . . . , wn corresponding to I1, I2, . . . , In, then obtaining calibrated images I′i (i=1, . . . , n) according to Ii*wi, and finally obtaining pixel-level aggregated images Apixel through
[equation image missing in source]
the LAN comprising Forward Extraction Modules (FEMs), Augmentation Guide Modules (AGMs) and a Deep Projection Module (DPM), the FEMs being configured to obtain feature expressions of I1, I2, . . . , In, Apixel, the AGMs being configured to adaptively extract latent clues under the guide of Apixel, and the DPM being configured to project aggregated features to a deeper feature space to obtain the general latent expression Alatent to finally obtain a semantic graph based on Alatent;
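For the calibration step of the PAN recited above, in which the calibrated images I′i are obtained according to Ii*wi, a minimal Python sketch follows. It assumes that each weight wi is a per-pixel map with the same size as Ii and that the product is element-wise; the claim does not state either point. The function name `calibrate` is hypothetical, and the subsequent aggregation into Apixel is deliberately not implemented because its formula is not reproduced in this text.

```python
from typing import List

Image = List[List[float]]  # H x W image or weight map as nested lists (assumption)


def calibrate(images: List[Image], weights: List[Image]) -> List[Image]:
    """Return I'_i = I_i * w_i, multiplying element-wise per pixel."""
    if len(images) != len(weights):
        raise ValueError("need exactly one weight map per image")
    return [
        [[p * w for p, w in zip(img_row, w_row)]
         for img_row, w_row in zip(img, wmap)]
        for img, wmap in zip(images, weights)
    ]


# Two 2x2 images (n = 2) with illustrative weight maps.
images = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
weights = [[[0.5, 0.5], [1.0, 0.0]], [[0.5, 0.5], [0.0, 1.0]]]
calibrated = calibrate(images, weights)
```

If the weights were instead one scalar per image, the same function would apply after broadcasting each scalar to a constant map.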
network training
inputting the acquired single-camera multi-angle CZT images into the PCKA for training, the n multi-angle CZT images being used as an input of a training dataset of a system, and the one labeled image and its label data being used as ground-truth values of the network output.