US 11,989,935 B1
Activity recognition method of LRF large-kernel attention convolution network based on large receptive field
Guangwei Hu, Jiangsu (CN); Qi Teng, Jiangsu (CN); Lei Pei, Jiangsu (CN); Wenwen Pan, Jiangsu (CN); Qi Huang, Jiangsu (CN); Qianyou Zhang, Jiangsu (CN); Cheng Luo, Jiangsu (CN); and Yun Liu, Jiangsu (CN)
Assigned to NANJING UNIVERSITY, Nanjing (CN); and NANJING MORENEW DATA CO. LTD., Nanjing (CN)
Filed by NANJING UNIVERSITY, Jiangsu (CN); and NANJING MORENEW DATA Co. Ltd., Jiangsu (CN)
Filed on Dec. 21, 2023, as Appl. No. 18/392,484.
Claims priority of application No. 202211695992.5 (CN), filed on Dec. 28, 2022.
Int. Cl. G06V 10/82 (2022.01); G06V 10/764 (2022.01)
CPC G06V 10/82 (2022.01) [G06V 10/764 (2022.01)] 6 Claims
OG exemplary drawing
 
1. An activity recognition method of an LRF (large receptive field) large-kernel attention convolution network based on a large receptive field, comprising:
collecting an action signal, carrying out a preprocessing and a data partition on the action signal to obtain a data set; and
training an LRF large-kernel attention convolution network model based on the data set, and introducing a trained LRF large-kernel attention convolution network model into a mobile wearable recognition device for a human posture recognition;
wherein the LRF large-kernel attention convolution network model comprises:
an LRF large-kernel attention convolution network with three layers and a fully connected classification output layer, wherein the LRF large-kernel attention convolution network comprises a local depth convolution layer, a long-distance depth expansion convolution layer and a 1×1 ordinary convolution layer for a feature extraction; and the fully connected classification output layer is used for an action classification;
a calculation method of the LRF large-kernel attention convolution network model comprises:

X ∈ ℝ^(t×s)  (1)
wherein X represents an input matrix, t represents a time step of the input matrix, and s represents a sensor mode of the input matrix;
compressing the input matrix X into one-dimensional data and introducing the one-dimensional data into a self-attention module, outputting a weighted sum of all value vectors, and using a Softmax function for a normalization;

Attention(Q,K,V)=Softmax(QKᵀ/√dk)V  (2)
wherein Q, K, and V represent a query value, a key value and a vector value respectively; and dk represents a scaling factor;
proposing an LRF attention mechanism to capture time information and modal information in sensor activity images:
X′=ReLU(BN(Conv2d(X)))  (3)
wherein X′ represents a node output matrix in four dimensions, ReLU represents an activation function, BN represents a batch normalization, and Conv2d represents a two-dimensional convolution operation;
obtaining a normalized output result of the node output matrix X′ by a layer normalization function, and further strengthening a network anti-degradation ability by a shortcut link:
X″=X′+LRF(LN(X′))  (4)
wherein a symbol LRF and a symbol LN represent a large receptive field attention mechanism and the layer normalization function respectively;
feeding an output of formula (4) into a feedforward network comprising a multilayer perceptron and a normalization layer:
X‴=X″+MLP(LN(X″))  (5)
wherein a symbol MLP and a symbol LN represent the multilayer perceptron and the layer normalization function respectively.
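
The following is a minimal PyTorch sketch, not the claimed implementation, of the large-receptive-field attention decomposition recited in the claim: a local depthwise convolution, a long-distance depthwise dilated convolution, and a 1×1 ordinary convolution whose output weights the input element-wise. The kernel sizes, dilation, and channel count are illustrative assumptions the claim does not specify.

import torch
import torch.nn as nn


class LRFAttention(nn.Module):
    """Large-receptive-field attention: a local depthwise convolution,
    a long-distance depthwise dilated convolution, and a 1x1 ordinary
    convolution; the result weights the input element-wise."""

    def __init__(self, channels: int, local_k: int = 5, dilated_k: int = 7, dilation: int = 3):
        super().__init__()
        # Local depthwise convolution captures nearby time/sensor context.
        self.local_dw = nn.Conv2d(channels, channels, kernel_size=local_k,
                                  padding=local_k // 2, groups=channels)
        # Long-distance depthwise dilated convolution enlarges the receptive field.
        self.dilated_dw = nn.Conv2d(channels, channels, kernel_size=dilated_k,
                                    padding=(dilated_k // 2) * dilation,
                                    dilation=dilation, groups=channels)
        # 1x1 ordinary convolution mixes channels.
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.pointwise(self.dilated_dw(self.local_dw(x)))
        return attn * x  # element-wise attention weighting


# Example: 8 sensor "images" with 16 channels, t = 128 time steps, s = 9 sensor modes.
x = torch.randn(8, 16, 128, 9)
print(LRFAttention(16)(x).shape)  # torch.Size([8, 16, 128, 9])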
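Formula (2) is the standard scaled dot-product self-attention. A compact PyTorch sketch is given below; the linear projections and the width d_k = 64 are illustrative assumptions, not values from the claim.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAttention(nn.Module):
    """Attention(Q, K, V) = Softmax(Q Kᵀ / sqrt(dk)) V, as in formula (2)."""

    def __init__(self, in_dim: int, d_k: int = 64):
        super().__init__()
        self.d_k = d_k
        self.q_proj = nn.Linear(in_dim, d_k)
        self.k_proj = nn.Linear(in_dim, d_k)
        self.v_proj = nn.Linear(in_dim, d_k)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, sequence_length, in_dim), i.e. the input compressed to a token sequence.
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)  # scaled dot products
        weights = F.softmax(scores, dim=-1)                     # Softmax normalization
        return weights @ v                                      # weighted sum of all value vectors


# Example: 8 samples, t = 128 time steps, s = 9 sensor modes per step.
tokens = torch.randn(8, 128, 9)
print(SelfAttention(in_dim=9)(tokens).shape)  # torch.Size([8, 128, 64])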
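Assembling formulas (3) through (5), one possible arrangement of a single layer, and of the three-layer network with its fully connected classification output layer, is sketched below. The Conv2d-BN-ReLU embedding realizes formula (3), the layer-normalized LRF branch with a shortcut link realizes formula (4), and the layer-normalized multilayer perceptron branch realizes formula (5). The GroupNorm stand-in for layer normalization on four-dimensional feature maps, the GELU activation, the channel widths, and the six-class output are implementation assumptions, not values from the claim.

import torch
import torch.nn as nn


class LRFAttention(nn.Module):
    """Abbreviated LRF attention; see the fuller sketch above."""

    def __init__(self, c: int):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(c, c, 5, padding=2, groups=c),              # local depthwise convolution
            nn.Conv2d(c, c, 7, padding=9, dilation=3, groups=c),  # long-distance dilated depthwise convolution
            nn.Conv2d(c, c, 1),                                   # 1x1 ordinary convolution
        )

    def forward(self, x):
        return self.attn(x) * x


class LRFBlock(nn.Module):
    """One layer of the network, realizing formulas (3), (4) and (5)."""

    def __init__(self, in_ch: int, out_ch: int, mlp_ratio: int = 4):
        super().__init__()
        # Formula (3): X' = ReLU(BN(Conv2d(X))).
        self.embed = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.lrf = LRFAttention(out_ch)
        # GroupNorm with one group stands in for layer normalization on 4-D maps (an assumption).
        self.norm1 = nn.GroupNorm(1, out_ch)
        self.norm2 = nn.GroupNorm(1, out_ch)
        # Multilayer perceptron of the feedforward branch, realized with 1x1 convolutions.
        hidden = out_ch * mlp_ratio
        self.mlp = nn.Sequential(
            nn.Conv2d(out_ch, hidden, 1),
            nn.GELU(),
            nn.Conv2d(hidden, out_ch, 1),
        )

    def forward(self, x):
        x = self.embed(x)                # X'
        x = x + self.lrf(self.norm1(x))  # X''  = X'  + LRF(LN(X')), formula (4), shortcut link
        x = x + self.mlp(self.norm2(x))  # X''' = X'' + MLP(LN(X'')), formula (5)
        return x


class LRFNet(nn.Module):
    """Three LRF layers followed by a fully connected classification output layer."""

    def __init__(self, in_ch: int = 1, num_classes: int = 6, widths=(32, 64, 128)):
        super().__init__()
        blocks, c = [], in_ch
        for w in widths:
            blocks.append(LRFBlock(c, w))
            c = w
        self.blocks = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(c, num_classes)

    def forward(self, x):
        x = self.pool(self.blocks(x)).flatten(1)
        return self.fc(x)


# Example: one-channel sensor image, t = 128 time steps, s = 9 sensor modes, 6 activity classes.
logits = LRFNet()(torch.randn(8, 1, 128, 9))
print(logits.shape)  # torch.Size([8, 6])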