US 12,260,336 B2
Training a neural network
Muhammad Asad, Hertfordshire (GB); Elia Condorelli, Hertfordshire (GB); and Cagatay Dikici, Hertfordshire (GB)
Assigned to Imagination Technologies Limited, Kings Langley (GB)
Filed by Imagination Technologies Limited, Kings Langley (GB)
Filed on Dec. 22, 2021, as Appl. No. 17/559,331.
Claims priority of application No. 2020386 (GB), filed on Dec. 22, 2020.
Prior Publication US 2022/0261652 A1, Aug. 18, 2022
Int. Cl. G06N 3/084 (2023.01); G06F 18/2136 (2023.01); G06N 3/082 (2023.01)

CPC G06N 3/084 (2013.01) [G06F 18/2136 (2023.01); G06N 3/082 (2013.01)]

20 Claims

1. A computer-implemented method of training a neural network configured to combine a set of coefficients with respective input data values, the method comprising:

so as to train a test implementation of the neural network:

applying sparsity,

wherein applying sparsity comprises:

dividing the set of coefficients into multiple groups of coefficients, wherein each group comprises a plurality of coefficients of the set of coefficients;

representing each group of coefficients of the multiple groups of coefficients by a respective single value;

modelling those values using a differentiable function representative of an extreme value distribution so as to identify a threshold value in dependence on a sparsity parameter, wherein the sparsity parameter indicates a level of sparsity to be applied to the set of coefficients; and

applying sparsity to a plurality of groups of coefficients of the multiple groups of coefficients in dependence on that threshold value, wherein applying sparsity to each group of coefficients of the plurality of groups of coefficients comprises setting each of the coefficients in that group to zero;

operating the test implementation of the neural network on training input data using the coefficients so as to form training output data;

in dependence on the training output data, assessing the accuracy of the neural network; and

updating the sparsity parameter in dependence on the accuracy of the neural network by modifying the threshold value by backpropagating one or more gradient vectors using the differentiable function; and

configuring a runtime implementation of the neural network in dependence on the updated sparsity parameter that, when implemented at a data processing system, executes the runtime implementation of the neural network in dependence on the updated sparsity parameter.
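
The sparsity-applying steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on assumptions the claim does not state: each group is represented by its maximum absolute coefficient, the extreme value distribution is taken to be a Gumbel distribution fitted by the method of moments, and the threshold is the Gumbel quantile at the sparsity parameter. The function names (group_values, gumbel_threshold, apply_group_sparsity) are hypothetical and chosen for illustration only.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def group_values(coeffs, group_size):
    """Split coefficients into groups; represent each by max |coefficient| (assumed choice)."""
    groups = [coeffs[i:i + group_size] for i in range(0, len(coeffs), group_size)]
    values = [max(abs(c) for c in g) for g in groups]
    return groups, values


def gumbel_threshold(values, sparsity):
    """Fit a Gumbel distribution by moments and return (threshold, d threshold / d sparsity).

    The threshold is the Gumbel quantile at `sparsity` (0 < sparsity < 1). The
    quantile function is differentiable in `sparsity`, which is what would let a
    training loop pass a gradient back to the sparsity parameter.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    beta = math.sqrt(6.0 * var) / math.pi      # scale estimate
    mu = mean - EULER_GAMMA * beta             # location estimate
    threshold = mu - beta * math.log(-math.log(sparsity))
    grad = -beta / (sparsity * math.log(sparsity))
    return threshold, grad


def apply_group_sparsity(coeffs, group_size, sparsity):
    """Zero every coefficient of each group whose representative value is below the threshold."""
    groups, values = group_values(coeffs, group_size)
    threshold, _ = gumbel_threshold(values, sparsity)
    pruned = []
    for group, value in zip(groups, values):
        pruned.extend([0.0] * len(group) if value < threshold else group)
    return pruned, threshold


coeffs = [0.1, -0.2, 0.9, 0.8, 0.05, 0.02, -1.5, 1.1]
pruned, threshold = apply_group_sparsity(coeffs, group_size=2, sparsity=0.5)
print(pruned, threshold)
```

In this sketch a larger sparsity parameter yields a larger threshold, so more groups are zeroed. In a training loop of the kind the claim recites, the derivative returned by gumbel_threshold would be chained with the gradient of the accuracy measure to update the sparsity parameter; that loop is omitted here.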