US 11,948,070 B2
Hardware implementation of a convolutional neural network
Clifford Gibson, St. Albans (GB); and James Imber, Hemel Hempstead (GB)
Assigned to Imagination Technologies Limited, Kings Langley (GB)
Filed by Imagination Technologies Limited, Kings Langley (GB)
Filed on Apr. 10, 2023, as Appl. No. 18/132,929.
Application 18/132,929 is a continuation of application No. 15/585,571, filed on May 3, 2017, granted, now 11,625,581, issued on Apr. 11, 2023.
Claims priority of application No. 1607713 (GB), filed on May 3, 2016.
Prior Publication US 2023/0306248 A1, Sep. 28, 2023
Int. Cl. G06N 3/063 (2023.01); G06F 7/00 (2006.01); G06F 7/544 (2006.01); G06N 3/04 (2023.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01)
CPC G06N 3/063 (2013.01) [G06F 7/00 (2013.01); G06F 7/5443 (2013.01); G06N 3/04 (2013.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01)]
20 Claims
1. Hardware to implement a convolutional neural network (CNN), comprising:
a memory interface configured to receive, from external memory, weights and input data to be used in calculations within the CNN, as well as command information to control operation of the hardware;
a coefficient buffer controller configured to receive the weights and pass the weights to a coefficient buffer, the coefficient buffer configured to store the weights received from the coefficient buffer controller;
an input buffer controller configured to receive the input data and pass the input data to a plurality of input buffers, the plurality of input buffers configured to store the input data received from the input buffer controller;
a command decoder configured to decode the command information and subsequently issue control information to the coefficient buffer controller and the input buffer controller to control a manner in which the weights and input data are stored in the coefficient buffer and the plurality of input buffers respectively;
a plurality of convolution engines configured to perform one or more convolution operations on the input data in the plurality of input buffers using the weights in the coefficient buffer;
a plurality of accumulators configured to receive results of the plurality of convolution engines and add the results of the convolution engines to values stored in an accumulation buffer, the accumulation buffer configured to store accumulated results from the plurality of accumulators;
a shared buffer;
an activation module configured to perform at least one of a number of different activation functions on data in the accumulation buffer and store the results in the shared buffer;
a normalize module configured to perform one of a number of different normalizing functions on data in the shared buffer and store the results in the shared buffer; and
a pool module configured to perform a pooling operation on data in the shared buffer and store the results in the shared buffer.
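
The data path recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the function names (`conv_pass`, `run_layer`), the 1-D data layout, the one-pass-per-channel schedule, and the specific choices of ReLU activation, peak-scaling normalization, and max pooling are all hypothetical. The claim only requires "at least one of a number of different" activation functions and "one of a number of different" normalizing functions, and it does not fix the order of the normalize and pool stages. The memory interface, buffer controllers, and command decoder are not modeled.

```python
def conv_pass(input_buffer, coefficients):
    """Model the convolution engines for one pass: one multiply-accumulate
    result per window position over the data held in an input buffer."""
    k = len(coefficients)
    return [
        sum(input_buffer[i + j] * coefficients[j] for j in range(k))
        for i in range(len(input_buffer) - k + 1)
    ]


def run_layer(input_buffers, coefficient_buffer, pool_width=2):
    """Hypothetical model of the claimed data path for a single layer.

    input_buffers      -- list of 1-D lists (assumed: one per input channel)
    coefficient_buffer -- list of 1-D weight lists, one per input channel
    """
    n_out = len(input_buffers[0]) - len(coefficient_buffer[0]) + 1

    # Accumulators: add each pass's convolution results to the values
    # already stored in the accumulation buffer.
    accumulation_buffer = [0.0] * n_out
    for data, weights in zip(input_buffers, coefficient_buffer):
        partial = conv_pass(data, weights)
        accumulation_buffer = [a + p for a, p in zip(accumulation_buffer, partial)]

    # Activation module: reads the accumulation buffer, writes the shared
    # buffer. ReLU is an assumed choice of activation function.
    shared_buffer = [max(0.0, v) for v in accumulation_buffer]

    # Normalize module: reads and writes the shared buffer. Scaling by the
    # peak value is an assumed, illustrative normalizing function.
    peak = max(shared_buffer) or 1.0
    shared_buffer = [v / peak for v in shared_buffer]

    # Pool module: reads and writes the shared buffer. Non-overlapping max
    # pooling is an assumed choice of pooling operation.
    shared_buffer = [
        max(shared_buffer[i:i + pool_width])
        for i in range(0, len(shared_buffer), pool_width)
    ]
    return shared_buffer


if __name__ == "__main__":
    channels = [[1, 2, 3, 4, 5], [0, 1, 0, 2, 0]]
    weights = [[1, -1], [2, 0]]
    print(run_layer(channels, weights))
```

In this model the accumulation buffer is what lets a result be built up over several passes of the convolution engines, while the activation, normalize, and pool stages all hand data to one another through the single shared buffer.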