| CPC G06N 3/08 (2013.01) [G06N 3/04 (2013.01)] | 15 Claims |

|
1. An operation method for a network layer in a Deep Neural Network, wherein the method is applied to target detection and segmentation, behavior detection and recognition or voice recognition and performed by an electronic device, wherein the electronic device includes a processor and a machine-readable storage medium, the processor of the electronic device, by reading machine-executable instructions stored in the machine-readable medium and executing the machine-executable instructions, implements the following operations comprising:
acquiring a weighted tensor of the network layer in the Deep Neural Network, wherein the weighted tensor comprises a plurality of filters;
for each of the filters of the network layer, converting the filter into a linear combination of a plurality of fixed-point convolution kernels by splitting the filter, wherein a weight value of each of the fixed-point convolution kernels is a fixed-point quantized value having a specified bit-width;
for each of the filters of the network layer, performing a convolution operation on input data of the network layer and each of the fixed-point convolution kernels of the filter, respectively, to obtain a plurality of convolution results, and calculating a weighted sum of the obtained convolution results based on the linear combination of the plurality of fixed-point convolution kernels of the filter to obtain an operation result of the filter; and
determining output data of the network layer, which is composed of the obtained operation results of the filters;
wherein, the network layer comprises a convolution layer, and the size of the weighted tensor of the convolution layer is S×S×I×O, wherein O represents the number of filters included in the weighted tensor of the convolution layer;
for each of the filters of the network layer, converting the filter into the linear combination of the plurality of fixed-point convolution kernels by splitting the filter comprises:
for each filter of the convolution layer, converting the filter into the linear combination of the plurality of fixed-point convolution kernels by splitting the filter based on a preset splitting formula; wherein, the preset splitting formula is:
$w_i = \sum_{j=1}^{p} \alpha_j t_j$, s.t. $B = \sum_{j=1}^{p} b_j$, wherein, $w_i$ is an i-th filter of the convolution layer, i∈[1,O], p is the number of the fixed-point convolution kernels obtained by splitting the filter $w_i$, $\alpha_j$ is a preset linear weighting coefficient of a j-th fixed-point convolution kernel, $t_j$ is the j-th fixed-point convolution kernel and the size of $t_j$ is S×S×I, B is the preset quantized number of bits, and $b_j$ is a specified bit-width corresponding to the j-th fixed-point convolution kernel.
|
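The splitting and weighted-sum steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes one hypothetical splitting scheme, in which each 8-bit integer weight is divided into a signed 4-bit high part and an unsigned 4-bit low part, so that the filter equals 16 times the first kernel plus 1 times the second. The function names `conv2d_valid`, `split_filter` and `filter_output`, the 4-bit/4-bit split, and the sample data are all assumptions made for illustration.

```python
def conv2d_valid(x, k):
    """Valid cross-correlation of input x (H x W x I) with kernel k (S x S x I)."""
    H, W, S, I = len(x), len(x[0]), len(k), len(k[0][0])
    out = []
    for r in range(H - S + 1):
        row = []
        for c in range(W - S + 1):
            acc = 0
            for dr in range(S):
                for dc in range(S):
                    for ch in range(I):
                        acc += x[r + dr][c + dc][ch] * k[dr][dc][ch]
            row.append(acc)
        out.append(row)
    return out


def split_filter(w, low_bits=4):
    """Split an integer filter into (coefficient, kernel) pairs.

    Assumed scheme: w = 2**low_bits * t_high + 1 * t_low, using floor
    division so the identity also holds for negative weights.
    """
    base = 1 << low_bits
    t_high = [[[v // base for v in px] for px in row] for row in w]
    t_low = [[[v % base for v in px] for px in row] for row in w]
    return [(base, t_high), (1, t_low)]


def filter_output(x, parts):
    """Convolve x with each low-bit kernel, then take the weighted sum."""
    results = [(alpha, conv2d_valid(x, t)) for alpha, t in parts]
    rows, cols = len(results[0][1]), len(results[0][1][0])
    return [
        [sum(alpha * res[r][c] for alpha, res in results) for c in range(cols)]
        for r in range(rows)
    ]


# Sample data (hypothetical): one 2x2x1 filter with signed 8-bit weights
# and a 3x3x1 integer input.
w = [[[100], [-37]], [[5], [-128]]]
x = [[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]]

parts = split_filter(w)
split_result = filter_output(x, parts)
direct_result = conv2d_valid(x, w)
print(split_result == direct_result)
```

Because convolution is linear in the kernel, the weighted sum of the per-kernel results matches the result of convolving with the original filter directly, while each individual convolution only involves 4-bit weights.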