| CPC G06V 10/26 (2022.01) [G06V 10/40 (2022.01); G06V 10/776 (2022.01); G06V 10/806 (2022.01); G06V 10/82 (2022.01)] | 7 Claims |

|
1. A computer-implemented method for performing image semantic segmentation in complex real-world environments based on multi-channel deep weighted aggregation, comprising:
executing, by one or more processors, instructions stored on a non-transitory computer-readable storage medium, wherein the instructions cause the one or more processors to perform the following operations:
S1, semantic features with definite class information in an image, transition semantic features between low-level semantics and high-level semantics, and semantic features of context logic relationships in the image are extracted by a low-level semantic channel, an auxiliary semantic channel and a high-level semantic channel, respectively;
S2, the three different semantic features obtained in S1 are fused by weighted aggregation to obtain global semantic information of the image;
S3, the semantic features output from the respective semantic channels in S1 and the global semantic information in S2 are used to compute a loss function for training, wherein, in S1:
a shallow convolution structure network is used to construct the low-level semantic channel for extracting low-level semantic information, a depthwise separable convolution structure network is used to construct the auxiliary semantic channel, and transition semantic information obtained from the auxiliary semantic channel is fed back to the high-level semantic channel;
a deep convolution structure network is used to construct the high-level semantic channel for extracting high-level semantic information; and a process of extracting the low-level semantic information by the shallow convolution structure network includes:
LS(I_{H*W}) = S3(S2(I_{H*W}));
wherein LS(I_{H*W}) is an extraction process of the low-level semantic information, I_{H*W} is an input image array, and S is a convolution stride.
|
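
The fusion in S2 and the loss in S3 can be illustrated with a minimal sketch. The claim does not state how the aggregation weights are obtained or how the loss terms are combined, so everything here is an assumption made for illustration: the helper names `softmax`, `weighted_aggregate` and `total_loss` are hypothetical, the weights are softmax-normalized scalar scores, the three channel outputs are flat vectors of equal length, and the training loss is a plain sum of the per-channel losses and the loss on the fused output.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_aggregate(features, scores):
    """Fuse equally sized feature vectors as a weighted element-wise sum.

    Assumption: one scalar score per channel; the claim does not say
    how the aggregation weights are defined.
    """
    if len(features) != len(scores):
        raise ValueError("need one score per feature vector")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("feature vectors must have the same length")
    weights = softmax(scores)
    return [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(length)
    ]


def total_loss(channel_losses, global_loss):
    """Assumed combination for S3: per-channel losses plus the fused-output loss."""
    return sum(channel_losses) + global_loss


# Toy outputs standing in for the low-level, auxiliary and high-level channels.
low = [1.0, 2.0]
aux = [3.0, 4.0]
high = [5.0, 6.0]

# Equal scores give equal weights, so the fusion reduces to an element-wise mean.
fused = weighted_aggregate([low, aux, high], [0.0, 0.0, 0.0])
loss = total_loss([0.5, 0.25, 0.25], 1.0)
```

In a real network the scores would more plausibly be learned parameters and the features would be multi-dimensional tensors; the sketch only shows the arithmetic of a normalized weighted sum followed by a summed loss.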