US 12,175,721 B1
Multi-angle image semantic segmentation method for cadmium zinc telluride chips
Peihao Li, Beijing (CN); Huihui Bai, Beijing (CN); Yunchao Wei, Beijing (CN); Yao Zhao, Beijing (CN); Anhong Wang, Beijing (CN); and Jiapeng Jia, Beijing (CN)
Assigned to Beijing Jiaotong University, Beijing (CN); Taiyuan University of Science and Technology, Taiyuan (CN); and Shanxi Zhishi Haotai Technology Co., LTD, Taiyuan (CN)
Filed by Beijing Jiaotong University, Beijing (CN); Taiyuan University of Science and Technology, Taiyuan (CN); and Shanxi Zhishi Haotai Technology Co., LTD, Taiyuan (CN)
Filed on Aug. 27, 2024, as Appl. No. 18/816,849.
Claims priority of application No. 202311112314.6 (CN), filed on Aug. 31, 2023.
Int. Cl. G06V 10/00 (2022.01); G06T 7/00 (2017.01); G06V 10/141 (2022.01); G06V 10/26 (2022.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01); G06V 20/70 (2022.01)
CPC G06V 10/26 (2022.01) [G06T 7/0004 (2013.01); G06V 10/141 (2022.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01); G06V 20/70 (2022.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30148 (2013.01)] 5 Claims
1. A multi-angle image semantic segmentation method for cadmium zinc telluride chips, comprising:
step 1) construction of an n+1 dataset:
acquiring single-camera multi-angle Cadmium Zinc Telluride (CZT) images I1, I2, . . . , In by using an acquisition system, manually selecting, from the n images, the image in which the monocrystal and heterocrystal defect boundaries are most clearly recognizable as the image to be labeled, and marking a monocrystal area, a heterocrystal area and a background area of that CZT image by using labeling software to generate pixel-level semantic labels;
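As an illustration of the n+1 dataset in step 1, one sample can be held as n views plus a single label mask. The following is a minimal sketch, not the claimed implementation: the class ids, the grayscale (n, H, W) layout, and the names `NPlusOneSample` and `make_sample` are assumptions introduced here, since the claim names the three regions but not how they are encoded.

```python
from dataclasses import dataclass

import numpy as np

# Hypothetical class ids; the claim names the regions but not their encoding.
BACKGROUND, MONOCRYSTAL, HETEROCRYSTAL = 0, 1, 2


@dataclass
class NPlusOneSample:
    images: np.ndarray    # (n, H, W): the n multi-angle views I1..In
    label: np.ndarray     # (H, W): pixel-level class ids of the one labeled view
    labeled_index: int    # which of the n views was manually chosen for labeling


def make_sample(images, label, labeled_index):
    """Validate and bundle n views with the single pixel-level label mask."""
    images = np.asarray(images)
    label = np.asarray(label)
    if images.ndim != 3:
        raise ValueError("images must have shape (n, H, W)")
    n, h, w = images.shape
    if label.shape != (h, w):
        raise ValueError("label must match the spatial size of the views")
    if not 0 <= labeled_index < n:
        raise ValueError("labeled_index must refer to one of the n views")
    if not np.isin(label, (BACKGROUND, MONOCRYSTAL, HETEROCRYSTAL)).all():
        raise ValueError("label contains an unknown class id")
    return NPlusOneSample(images, label, labeled_index)
```

A sample built this way carries n inputs and exactly one set of true values, which is the "n+1" pairing the step describes.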
step 2) construction and training of a Progressive Complementary Knowledge Aggregation network (PCKA):
network structure
the PCKA comprising a Pixel Aggregation Network (PAN) and a Latent Aggregation Network (LAN), feeding the multi-angle images I1, I2, . . . , In in step 1) into the PAN to obtain pixel-level aggregated images Apixel, then feeding the pixel-level aggregated images Apixel and the multi-angle images I1, I2, . . . , In into the LAN respectively through a Feature Embedding Module (FEM) to obtain a defect semantic graph based on a general latent expression Alatent;
a specific data processing process of the PAN comprising:
feeding the multi-angle images I1, I2, . . . , In into a U-shaped weight calibrator to obtain corresponding weights w1, w2, . . . , wn, and expressing an encoder and a decoder in the calibrator respectively as PE and PDE, this process being expressed as:
$O_1 = \mathrm{PE}_1(\mathrm{Cat}(I_1, I_2, \ldots, I_n))$
$O_i = \mathrm{PE}_i(O_{i-1}),\quad i = 2, \ldots, 5$  (2)
where Cat(⋅) represents a splicing operation, PEi represents an ith encoding layer in the calibrator, and Oi represents an output of the ith encoding layer in the calibrator;
$O'_1 = \mathrm{PDE}_1(O_5)$
$O'_i = \mathrm{PDE}_i(\mathrm{Cat}(O'_{i-1}, O_{5-(i-1)})),\quad i = 2, \ldots, 5$  (3)
where PDEi represents an ith decoding layer in the PAN and O′i represents an output of the ith decoding layer in the PAN;
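The wiring of equations (2) and (3) can be sketched as follows. This is only an illustration of the data flow under stated assumptions: each PE/PDE layer is replaced by a hypothetical 1x1 channel-mixing layer with ReLU (`pointwise_layer`), no down- or up-sampling is performed so that concatenated tensors share a spatial size, the channel `width` is arbitrary, and the last decoding layer is assumed to emit n channels, one per view. None of these details are stated in the claim.

```python
import numpy as np


def pointwise_layer(c_in, c_out, rng):
    """Stand-in for one PE_i / PDE_i layer: 1x1 channel mixing followed by ReLU."""
    w = rng.standard_normal((c_out, c_in)) / np.sqrt(c_in)
    return lambda x: np.maximum(np.tensordot(w, x, axes=([1], [0])), 0.0)


def skip_index(i, depth=5):
    """1-based index of the encoder output fed to decoding layer i (i >= 2)."""
    return depth - (i - 1)


def build_calibrator(n, width=8, depth=5, seed=0):
    rng = np.random.default_rng(seed)
    enc = [pointwise_layer(n if i == 0 else width, width, rng)
           for i in range(depth)]
    # PDE_1 sees O_5 alone; PDE_i (i >= 2) sees Cat(O'_{i-1}, O_{5-(i-1)}).
    dec = [pointwise_layer(width if i == 0 else 2 * width,
                           n if i == depth - 1 else width, rng)
           for i in range(depth)]
    return enc, dec


def run_calibrator(images, enc, dec):
    depth = len(enc)
    x = np.stack(images, axis=0)          # Cat(I1, ..., In) along channels
    outs = []
    for pe in enc:                        # O_1 = PE_1(Cat(...)), O_i = PE_i(O_{i-1})
        x = pe(x)
        outs.append(x)                    # outs[i - 1] holds O_i
    y = dec[0](outs[-1])                  # O'_1 = PDE_1(O_5)
    for i in range(2, depth + 1):
        skip = outs[skip_index(i, depth) - 1]
        y = dec[i - 1](np.concatenate([y, skip], axis=0))
    return y                              # (n, H, W) under the assumptions above
```

With five layers, decoding layer 2 receives O4 and decoding layer 5 receives O1, so the skip connections pair the deepest decoder stage with the shallowest encoder stage, as in a U-shaped network.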
finally obtaining n weights w1, w2, . . . , wn corresponding to I1, I2, . . . , In, then obtaining calibrated images I′i (i=1, . . . , n) according to Ii*wi, and finally obtaining pixel-level aggregated images Apixel through

[equation not reproduced in the source text]
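The calibration step Ii*wi can be sketched as below. The aggregation formula that produces Apixel is not reproduced in this text, so the sketch stops at the calibrated images and does not guess at it. Whether each wi is a scalar or a per-pixel map is not stated in the claim; the hypothetical helper `calibrate` accepts either form as an assumption.

```python
import numpy as np


def calibrate(images, weights):
    """Return the calibrated images I'_i = I_i * w_i for i = 1..n.

    images:  (n, H, W) array of views.
    weights: (n,) scalars or (n, H, W) per-pixel maps (both forms are assumed).
    """
    images = np.asarray(images, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.ndim == 1:
        # Broadcast one scalar weight over every pixel of its view.
        weights = weights[:, None, None]
    return images * weights
```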
the LAN comprising Forward Extraction Modules (FEMs), Augmentation Guide Modules (AGMs) and a Deep Projection Module (DPM), the FEMs being configured to obtain feature expressions of I1, I2, . . . , In and Apixel, the AGMs being configured to adaptively extract latent clues under the guidance of Apixel, and the DPM being configured to project the aggregated features into a deeper feature space to obtain the general latent expression Alatent, from which the semantic graph is finally obtained;
network training
inputting the acquired single-camera multi-angle CZT images into the PCKA for training, the n multi-angle CZT images being used as the input of the training dataset, and the one labeled image together with its label data being used as the true values of the network output.
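To illustrate how a predicted semantic map might be compared with the true values during training, the sketch below computes a pixel-wise cross-entropy. The claim does not name a loss function, so the choice of cross-entropy, the (C, H, W) score layout and the function name `pixelwise_cross_entropy` are assumptions made purely for illustration.

```python
import numpy as np


def pixelwise_cross_entropy(logits, label):
    """Mean cross-entropy over pixels (an assumed loss, not taken from the claim).

    logits: (C, H, W) class scores predicted for the labeled view.
    label:  (H, W) integer class ids used as the true values.
    """
    logits = np.asarray(logits, dtype=float)
    label = np.asarray(label, dtype=np.int64)
    # Subtract the per-pixel maximum before exponentiating for stability.
    shifted = logits - logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    # Pick the log-probability of the true class at every pixel.
    picked = np.take_along_axis(log_probs, label[None, ...], axis=0)[0]
    return float(-picked.mean())
```

With three classes (monocrystal, heterocrystal, background), uniform scores give a loss of ln 3, and the loss approaches zero as the scores concentrate on the labeled class.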