CPC G06V 10/82 (2022.01) [G06V 10/764 (2022.01)]; 6 Claims
1. An activity recognition method of an LRF (large receptive field) large-kernel attention convolution network based on a large receptive field, comprising:
collecting an action signal, carrying out a preprocessing and a data partition on the action signal to obtain a data set; and
training an LRF large-kernel attention convolution network model based on the data set, and introducing a trained LRF large-kernel attention convolution network model into a mobile wearable recognition device for a human posture recognition;
wherein the LRF large-kernel attention convolution network model comprises:
an LRF large-kernel attention convolution network with three layers and a fully connected classification output layer, wherein the LRF large-kernel attention convolution network comprises a local depth convolution layer, a long-distance depth expansion convolution layer and a 1×1 ordinary convolution layer for a feature extraction; and the fully connected classification output layer is used for an action classification;
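The three-layer decomposition above trades one very large kernel for a small local depth-wise convolution, a dilated depth-wise convolution, and a 1×1 channel-mixing convolution. A quick sketch of the resulting receptive field arithmetic (the kernel sizes 5, 7 and dilation 3 are illustrative assumptions, not sizes stated in the claim):

```python
def effective_receptive_field(k_local: int, k_dilated: int, dilation: int) -> int:
    # Stacked stride-1 convolutions grow the receptive field by
    # (kernel - 1) * dilation per layer; the 1x1 conv only mixes
    # channels and adds nothing spatially.
    rf = 1
    rf += (k_local - 1) * 1           # local depth-wise conv, dilation 1
    rf += (k_dilated - 1) * dilation  # long-distance dilated depth-wise conv
    return rf

# Example with assumed sizes: 5x5 local + 7x7 dilated (d=3) + 1x1
print(effective_receptive_field(5, 7, 3))  # -> 23
```

So two cheap depth-wise layers cover a 23×23 field that a single dense convolution would need a 23×23 kernel to match.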
a calculation method of the LRF large-kernel attention convolution network model comprises:
X∈R^(t×s) (1)

wherein X represents an input matrix, t represents a time step of the input matrix, and s represents a sensor mode of the input matrix;
compressing the input matrix X into one-dimensional data, introducing the one-dimensional data into a self-attention module, outputting a weighted sum of all value vectors, and using a Softmax function for a normalization:

Attention(Q,K,V)=Softmax(QKT/√dk)V (2)

wherein Q, K, and V represent a query value, a key value and a vector value respectively; and dk represents a scaling factor;
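The self-attention step described above (a Softmax-normalized weighted sum of value vectors, scaled by dk) can be sketched in plain NumPy; the sequence length and feature width below are arbitrary toy sizes:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: Softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted sum of all value vectors

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Each output row is a convex combination of the rows of V, with mixing weights set by query-key similarity.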
proposing an LRF attention mechanism to capture time information and modal information in sensor activity images:
X′=ReLU(BN(Conv2d(X))) (3)
wherein X′ represents a node output matrix in four dimensions, ReLU represents an activation function, and Conv2d represents a two-dimensional convolution operation;
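Formula (3) is a standard convolution → batch normalization → ReLU node. A minimal single-channel sketch (the 6×6 input, 3×3 averaging kernel, and inference-style normalization without learned scale/shift are illustrative assumptions):

```python
import numpy as np

def conv2d(x, w):
    # naive "valid" 2-D cross-correlation on a single channel
    h, wd = x.shape
    k = w.shape[0]
    out = np.zeros((h - k + 1, wd - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

def batch_norm(x, eps=1e-5):
    # normalize to zero mean / unit variance (no learned gamma/beta)
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def relu(x):
    return np.maximum(x, 0.0)

x = np.arange(36, dtype=float).reshape(6, 6)  # toy sensor "image"
w = np.full((3, 3), 1.0 / 9.0)                # toy 3x3 averaging kernel
x_prime = relu(batch_norm(conv2d(x, w)))      # formula (3)
print(x_prime.shape)  # (4, 4)
```

The ReLU after normalization guarantees a non-negative node output, matching the X′ fed into the attention branch.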
obtaining a normalized output result of the node output matrix X′ by a layer normalization function, and further strengthening a network anti-degradation ability by a shortcut link:
X″=X′+LRF(LN(X′)) (4)
wherein a symbol LRF and a symbol LN represent a large-kernel receptive field attention mechanism and the layer normalization function respectively;
feeding an output of formula (4) into a feedforward network comprising a multilayer perceptron and a normalization layer:
X″′=X″+MLP(LN(X″)) (5)
wherein a symbol MLP and a symbol LN represent the multilayer perceptron and the layer normalization function respectively.
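Formulas (4)-(5) form a pre-norm residual block: layer-normalize, apply a branch, and add the shortcut link. A NumPy sketch of that control flow, with the LRF attention branch replaced by an identity placeholder and arbitrary toy weights (both assumptions, since the claim does not fix these parameters):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # LN over the feature axis
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def mlp(x, w1, w2):
    # two-layer perceptron with ReLU, as the feed-forward branch
    return np.maximum(x @ w1, 0.0) @ w2

def lrf_block(x, lrf_attn, w1, w2):
    # formula (4): shortcut link around the LRF attention branch
    x2 = x + lrf_attn(layer_norm(x))
    # formula (5): shortcut link around the feed-forward branch
    return x2 + mlp(layer_norm(x2), w1, w2)

rng = np.random.default_rng(1)
d = 8
x = rng.standard_normal((16, d))        # 16 time steps, d features
w1 = rng.standard_normal((d, 2 * d)) * 0.1
w2 = rng.standard_normal((2 * d, d)) * 0.1
identity_lrf = lambda z: z              # placeholder for the LRF branch
y = lrf_block(x, identity_lrf, w1, w2)
print(y.shape)  # (16, 8)
```

The two shortcut additions leave the feature shape unchanged, which is what lets the network stack such blocks without degradation.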