US 11,915,117 B2
Reduced complexity convolution for convolutional neural networks
Manu Mathew, Bangalore (IN); Kumar Desappan, Bangalore (IN); and Pramod Kumar Swami, Bangalore (IN)
Assigned to Texas Instruments Incorporated, Dallas, TX (US)
Filed by TEXAS INSTRUMENTS INCORPORATED, Dallas, TX (US)
Filed on May 24, 2021, as Appl. No. 17/327,988.
Application 17/327,988 is a continuation of application No. 15/800,294, filed on Nov. 1, 2017, granted, now Pat. No. 11,048,997.
Claims priority of application No. 201641044431 (IN), filed on Dec. 27, 2016.
Prior Publication US 2021/0279550 A1, Sep. 9, 2021
Int. Cl. G06N 3/04 (2023.01); G06N 3/08 (2023.01); G06N 3/082 (2023.01); G06N 3/084 (2023.01); G06F 17/15 (2006.01); G06F 7/544 (2006.01); G06N 3/045 (2023.01); G06N 3/10 (2006.01)
CPC G06N 3/04 (2013.01) [G06F 7/5443 (2013.01); G06F 17/15 (2013.01); G06F 17/153 (2013.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01); G06N 3/082 (2013.01); G06N 3/084 (2013.01); G06N 3/10 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method comprising:
performing, using an electronic device, a training of a convolutional neural network (CNN) to generate coefficients associated with a layer of the CNN;
tuning, using the electronic device, the layer of the CNN by at least:
selecting some of the coefficients based on a sparsity target representative of a ratio of a number of zero coefficients to a total number of the coefficients of the layer; and
setting the selected coefficients to zero; and
for each respective nonzero coefficient of the coefficients, performing, using the electronic device, a block multiply accumulation (BMA) operation, wherein the BMA operation multiplies the respective nonzero coefficient with an input data block corresponding to the respective nonzero coefficient.
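The claimed method can be illustrated with a short sketch: a tuning step that zeroes coefficients of a layer to meet a sparsity target (the ratio of zero coefficients to total coefficients), and a convolution computed as block multiply-accumulate (BMA) operations in which each nonzero coefficient multiplies a shifted block of the input, while zero coefficients are skipped. This is a hypothetical NumPy illustration, not the patented implementation; the magnitude-based selection criterion in `tune_layer` is an assumption, since the claim only requires selecting coefficients based on the sparsity target.

```python
import numpy as np

def tune_layer(coeffs, sparsity_target):
    """Set coefficients to zero until the fraction of zeros meets the
    sparsity target. Selecting the smallest-magnitude coefficients is a
    hypothetical criterion chosen for this sketch."""
    flat = coeffs.flatten()                      # copy of the layer's coefficients
    n_zero = int(round(sparsity_target * flat.size))
    idx = np.argsort(np.abs(flat))[:n_zero]      # smallest-magnitude coefficients
    flat[idx] = 0.0
    return flat.reshape(coeffs.shape)

def sparse_conv2d(image, coeffs):
    """Valid 2-D convolution (correlation form) computed as block
    multiply-accumulate operations: each nonzero coefficient multiplies
    an input data block; zero coefficients incur no BMA at all."""
    kh, kw = coeffs.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for r in range(kh):
        for c in range(kw):
            if coeffs[r, c] == 0.0:
                continue                         # zero coefficient: BMA skipped
            out += coeffs[r, c] * image[r:r + oh, c:c + ow]
    return out
```

With a sparsity target of 4/9 on a 3x3 kernel, four coefficients are zeroed, so only five block multiply-accumulate passes over the input are performed instead of nine.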