| CPC G06V 10/7715 (2022.01) [G06V 10/255 (2022.01); G06V 10/30 (2022.01); G06V 10/776 (2022.01); G06V 10/82 (2022.01); G06V 20/17 (2022.01); G06V 20/40 (2022.01)] | 9 Claims |

|
1. A nighttime unmanned aerial vehicle (UAV) object tracking method fusing a hybrid attention mechanism, comprising: acquiring a night vision video sequence from a UAV, inputting a nighttime image frame of the night vision video sequence into a pre-trained night vision image enhancement model, to obtain a corresponding enhanced nighttime image frame, and performing image object tracking and recognition on the enhanced nighttime image frame, to obtain an object tracking and recognition result;
the night vision image enhancement model comprising an encoder module, a spatial hybrid attention module, a channel hybrid attention module, a decoder module, a curve projection module, and a denoising processing module;
the encoder module being configured to extract an initial convolutional feature map of the nighttime image frame;
the spatial hybrid attention module being configured to enhance the attention of a feature space dimension of the initial convolutional feature map, to form a spatial attention feature map of the nighttime image frame;
the channel hybrid attention module being configured to enhance the attention of a feature channel dimension of the spatial attention feature map, to form a hybrid attention feature map of the nighttime image frame;
the decoder module being configured to convert the hybrid attention feature map into a curve estimation parameter map;
the curve projection module being configured to map the curve estimation parameter map onto the nighttime image frame in a curve projection manner, to form an intermediate feature image of the nighttime image frame; and
the denoising processing module being configured to perform denoising processing on the intermediate feature image, to obtain the corresponding enhanced nighttime image frame;
the decoder module comprising four convolutional layers and four upsampling layers connected in series, for performing convolutional processing and upsampling deconvolution processing on the inputted hybrid attention feature map in sequence, followed by hyperbolic tangent conversion processing, to obtain the curve estimation parameter map; a processing procedure of the decoder module being expressed as:
$F_{de1} = \mathrm{Up}(\mathrm{Conv}(F_{CHA})_{de1})_{de1}$;
$F_{de2} = \mathrm{Up}(\mathrm{Conv}(F_{de1})_{de2})_{de2}$;
$F_{de3} = \mathrm{Up}(\mathrm{Conv}(F_{de2})_{de3})_{de3}$;
$F_{de4} = \mathrm{Up}(\mathrm{Conv}(F_{de3})_{de4})_{de4}$;
$F_{de} = \tanh(\mathrm{Conv}(F_{de4})_{de})$;
where $F_{CHA}$ represents the hybrid attention feature map inputted into the decoder module, $\mathrm{Conv}(\cdot)_{dei}$ represents an operator of an $i$th convolutional layer in the decoder module, $\mathrm{Up}(\cdot)_{dei}$ represents an operator of an $i$th upsampling deconvolution layer in the decoder module, $i = 1, 2, 3, 4$, $F_{de1}$, $F_{de2}$, $F_{de3}$, and $F_{de4}$ represent intermediate operational outputs in the decoder module, $\mathrm{Conv}(\cdot)_{de}$ represents a convolutional operator during conversion processing in the decoder module, $\tanh(\cdot)$ represents the hyperbolic tangent function, and $F_{de}$ represents the curve estimation parameter map outputted by the decoder module.
|
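The decoder recited in claim 1 (four convolution and upsampling stages in series, followed by a convolution and a hyperbolic tangent) can be illustrated with a minimal pure-Python sketch. The claim does not specify kernel sizes, strides, channel counts, or weights, so everything below is an assumption made for illustration only: the convolution is taken to be pointwise (1x1), the upsampling deconvolution layer is replaced by a nearest-neighbour x2 upsampling stand-in, and the function names `conv1x1`, `upsample2x`, and `decoder` are hypothetical.

```python
import math


def conv1x1(fmap, weights):
    """Pointwise (1x1) convolution, an assumed stand-in for Conv(.)_dei.

    fmap: list of C_in channels, each an H x W nested list of floats.
    weights: C_out rows, each holding C_in mixing coefficients.
    """
    h, w = len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(wk * fmap[k][y][x] for k, wk in enumerate(row)) for x in range(w)]
         for y in range(h)]
        for row in weights
    ]


def upsample2x(fmap):
    """Nearest-neighbour x2 upsampling, an assumed stand-in for Up(.)_dei."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # duplicate each column
            rows.append(wide)
            rows.append(list(wide))                    # duplicate each row
        out.append(rows)
    return out


def decoder(f_cha, stage_weights, final_weights):
    """F_dei = Up(Conv(F_de(i-1))) for i = 1..4, then F_de = tanh(Conv(F_de4))."""
    if len(stage_weights) != 4:
        raise ValueError("claim 1 recites four convolution/upsampling stages")
    f = f_cha
    for w in stage_weights:
        f = upsample2x(conv1x1(f, w))
    f = conv1x1(f, final_weights)
    # tanh bounds every curve estimation parameter to the open interval (-1, 1).
    return [[[math.tanh(v) for v in row] for row in ch] for ch in f]


# Usage with illustrative values: a 2-channel, 2x2 hybrid attention feature map.
f_cha = [[[0.5, -1.0], [2.0, 0.0]],
         [[1.0, 1.0], [1.0, 1.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]            # hypothetical stage weights
f_de = decoder(f_cha, [identity] * 4, [[1.0, 1.0]])
print(len(f_de), len(f_de[0]), len(f_de[0][0]))  # channels, height, width
```

Under these assumptions each stage doubles the spatial size, so a 2x2 input yields a 32x32 parameter map. A learned implementation would replace the stand-ins with trained convolution and transposed-convolution layers.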