US 11,868,873 B2
Convolution operator system to perform concurrent convolution operations
Prasanna Venkatesh Balasubramaniyan, Chennai (IN); Sainarayanan Gopalakrishnan, Chennai (IN); and Gunamani Rajagopal, Chennai (IN)
Assigned to HCL Technologies Limited, New Delhi (IN)
Filed by HCL TECHNOLOGIES LIMITED, New Delhi (IN)
Filed on Dec. 29, 2020, as Appl. No. 17/136,547.
Prior Publication US 2021/0365703 A1, Nov. 25, 2021
Int. Cl. G06N 3/063 (2023.01); G06N 3/04 (2023.01); G06V 10/94 (2022.01); G06F 18/213 (2023.01)

CPC G06N 3/063 (2013.01) [G06F 18/213 (2023.01); G06N 3/04 (2013.01); G06V 10/95 (2022.01)]

13 Claims

1. A convolution operator system for performing convolution operation concurrently on an image, the convolution operator system comprising:

a Convolution Neural Network (CNN) reconfigurable engine including a plurality of Mini Parallel Rolling Engines (MPREs), wherein each MPRE includes:

an input router configured to receive image data comprising a kernel value and a set of input feature matrices, wherein each input feature matrix from the set of input feature matrices comprises a set of rows, and wherein each row from the set of rows comprises a set of input features;

a set of data flow control blocks configured to provide at least a portion of the input features and the kernel value to a set of computing blocks;

the set of computing blocks configured to perform a convolution operation concurrently on the set of input features in order to generate a convolution output corresponding to each row of each input feature matrix, wherein each computing block of the set of computing blocks performs the convolution operation based on the kernel value;

a controller configured to allocate a plurality of groups in order to generate a set of convolution output corresponding to the set of rows, wherein each group from the plurality of groups comprises one or more computing blocks of the set of computing blocks, wherein each group performs convolution operation concurrently on one of (a) each row of each input feature matrix or (b) the set of rows of the input feature matrix, and wherein the plurality of groups is allocated based on the kernel value and the set of computing blocks available for the convolution operation to be performed;

a pipeline adder configured to generate an aggregated convolution output based on the set of convolution output when the plurality of groups is formed; and

an output router configured to receive either the aggregated convolution output or the convolution output, wherein the output router is further configured to transmit either the aggregated convolution output or the convolution output to the input router for subsequent convolution operation in order to generate a convolution result for the image data.
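
The data flow recited in claim 1 can be illustrated with a small software model. The following Python sketch is not the patented hardware and is not taken from the specification; it only mirrors the claimed roles under stated assumptions. It assumes that the kernel value is a square K x K matrix, that each computing block handles one kernel row against one input row, that the controller forms groups of K computing blocks each, and that the pipeline adder sums the K partial row outputs of a group. The names `computing_block`, `allocate_groups`, `pipeline_adder` and `mpre_convolve` are hypothetical, and the groups that would operate concurrently in hardware are simulated here with an ordinary loop. As is common for CNNs, the kernel is applied without flipping (cross-correlation).

```python
from typing import List

Row = List[float]
Matrix = List[Row]


def computing_block(input_row: Row, kernel_row: Row) -> Row:
    """One computing block (assumed): slide one kernel row over one input row."""
    k = len(kernel_row)
    return [
        sum(input_row[i + j] * kernel_row[j] for j in range(k))
        for i in range(len(input_row) - k + 1)
    ]


def allocate_groups(num_blocks: int, kernel_size: int) -> int:
    """Controller (assumed policy): each group needs kernel_size computing blocks."""
    if kernel_size <= 0:
        raise ValueError("kernel_size must be positive")
    groups = num_blocks // kernel_size
    if groups == 0:
        raise ValueError("not enough computing blocks for one group")
    return groups


def pipeline_adder(partials: List[Row]) -> Row:
    """Pipeline adder (assumed): element-wise sum of a group's partial row outputs."""
    return [sum(values) for values in zip(*partials)]


def mpre_convolve(feature: Matrix, kernel: Matrix, num_blocks: int) -> Matrix:
    """Model one MPRE pass over a single input feature matrix (valid convolution)."""
    k = len(kernel)
    groups = allocate_groups(num_blocks, k)
    out_rows = len(feature) - k + 1
    output: Matrix = []
    # Each batch covers up to `groups` output rows; in hardware the groups
    # would work concurrently, here they are visited one after another.
    for start in range(0, out_rows, groups):
        for r in range(start, min(start + groups, out_rows)):
            # One group: K computing blocks, one per kernel row.
            partials = [computing_block(feature[r + i], kernel[i]) for i in range(k)]
            output.append(pipeline_adder(partials))
    return output


if __name__ == "__main__":
    feature = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, 1]]
    print(mpre_convolve(feature, kernel, num_blocks=4))  # [[6, 8], [12, 14]]
```

In this model, the feedback path from the output router to the input router corresponds to calling `mpre_convolve` again on the returned matrix with the kernel of the next layer.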