CPC G06N 3/084 (2013.01) [G06N 3/048 (2023.01); G06N 3/063 (2013.01); G06N 5/046 (2013.01)] | 19 Claims |
1. A method for training a plurality of parameters of a machine-trained (MT) network comprising a plurality of layers of processing nodes, the plurality of parameters subject to a sparsity constraint that requires a threshold portion of the parameters to be equal to zero, wherein a first set of the parameters subject to the sparsity constraint are grouped into groups of parameters, the method comprising:
adjusting values of the plurality of parameters by (i) propagating a plurality of training inputs through the plurality of layers of the MT network to generate network outputs, (ii) calculating a value of a loss function based at least in part on differences between the generated network outputs and expected network outputs, and (iii) using the calculated loss function value to modify the values of the first set of parameters and a second set of the parameters subject to the sparsity constraint;
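The "adjusting values" element above describes a standard gradient-based training round. The following minimal sketch (an illustration, not the patented procedure; all names and the one-parameter "network" are hypothetical) shows the three sub-steps: propagate training inputs to generate outputs, compute a loss from the differences against expected outputs, and use that loss to modify the parameter values.

```python
# Illustrative sketch of one training round: forward-propagate inputs,
# compute a loss against expected outputs, and modify the parameters
# using the loss gradient. A single-parameter "layer" keeps the
# arithmetic visible; a real MT network repeats this per layer via
# backpropagation.

w = 0.2                                   # a trainable parameter
inputs = [1.0, 2.0, 3.0]                  # training inputs
expected = [0.5, 1.0, 1.5]                # expected network outputs

def forward(w, x):
    return w * x                          # trivial one-node "layer"

def loss(w):
    # Mean squared difference between generated and expected outputs.
    return sum((forward(w, x) - y) ** 2
               for x, y in zip(inputs, expected)) / len(inputs)

l0 = loss(w)
# Analytic gradient of the loss with respect to w.
grad = sum(2 * (forward(w, x) - y) * x
           for x, y in zip(inputs, expected)) / len(inputs)
w -= 0.05 * grad                          # gradient-descent update
l1 = loss(w)
```

One update moves `w` toward the value that reproduces the expected outputs, so the loss after the step is smaller than before it.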
for each parameter of the second set of the parameters subject to the sparsity constraint, determining an accuracy penalty associated with the parameter being set to zero from the adjusted value for the parameter, wherein (i) the parameters are organized by the layers of the MT network, (ii) each parameter has a set of allowed values to which the parameter can be set, (iii) for all parameters, the set of allowed values includes the value zero, and (iv) the sets of allowed values are the same for all parameters within a respective layer of the MT network;
for each group of parameters in the first set of parameters, determining a minimum accuracy penalty for each possible number of parameters in the group being set to zero from the adjusted values for the parameters; and
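The two "determining" elements above can be sketched concretely. The claim does not specify how the per-parameter accuracy penalty is computed, so the squared adjusted value is used here as a common magnitude-based proxy (an assumption, not the claimed method); under any per-parameter penalty, the minimum group penalty for zeroing n parameters is the sum of the n smallest penalties in the group.

```python
# Hypothetical proxy for the accuracy penalty of zeroing a parameter:
# the squared adjusted value (small-magnitude parameters are cheapest
# to set to zero). The patent leaves the penalty definition open.
def accuracy_penalty(value):
    return value * value

def group_min_penalties(group_values):
    """For each possible number n of parameters in the group being set
    to zero, return the minimum total accuracy penalty: the running sum
    of the n smallest per-parameter penalties."""
    penalties = sorted(accuracy_penalty(v) for v in group_values)
    mins = [0.0]                       # zeroing 0 parameters costs nothing
    for p in penalties:
        mins.append(mins[-1] + p)
    return mins                        # mins[n] = cost of zeroing n params

# Example: a group of four adjusted parameter values.
mins = group_min_penalties([0.5, -0.1, 0.3, -0.2])
```

Sorting once gives all group minima in O(k log k) per group of size k, one entry for each possible count of zeroed parameters (0 through k).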
using the determined accuracy penalties to set to the value zero at least the threshold portion of the plurality of adjusted parameters such that the MT network fulfills the sparsity constraint to be executable on a particular type of limited-memory neural network inference circuit having sparsity requirements based on the neural network inference circuit using less memory for parameters set to the value zero than for parameters set to non-zero values.
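The final element, zeroing at least the threshold portion of parameters using the determined penalties, can be sketched as a greedy selection (again an illustrative assumption: the claimed method need not be greedy, and the squared-value penalty is a stand-in):

```python
import math

# Hedged sketch of fulfilling the sparsity constraint: set to zero at
# least the threshold fraction of parameters, choosing those whose
# zeroing carries the smallest accuracy penalty (squared adjusted
# value used as a hypothetical penalty).
def apply_sparsity(values, threshold_fraction):
    n_zero = math.ceil(threshold_fraction * len(values))
    # Parameter indices ordered by ascending penalty.
    order = sorted(range(len(values)), key=lambda i: values[i] ** 2)
    to_zero = set(order[:n_zero])
    return [0.0 if i in to_zero else v for i, v in enumerate(values)]

sparse = apply_sparsity([0.5, -0.1, 0.3, -0.2, 0.7, 0.05], 0.5)
```

A network sparsified this way suits an inference circuit that stores zero-valued parameters more compactly than non-zero ones, since at least the threshold fraction of parameter storage is eliminated.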