| CPC G06N 3/082 (2013.01) [G06F 18/217 (2023.01); G06V 10/751 (2022.01)] | 17 Claims |

|
1. A neural network pruning method, the method comprising:
acquiring a first task accuracy of an inference task processed by a pretrained neural network;
pruning, based on a channel unit, the neural network by adjusting weights between nodes of channels based on a preset learning weight and based on a channel-by-channel pruning parameter corresponding to a channel of each of a plurality of layers of the pretrained neural network;
updating the learning weight based on the first task accuracy and a task accuracy of the pruned neural network;
updating the channel-by-channel pruning parameter based on the updated learning weight and the task accuracy of the pruned neural network;
repruning, based on the channel unit, the pruned neural network based on the updated learning weight and based on the updated channel-by-channel pruning parameter;
repeatedly performing a pruning-evaluation operation of determining a task accuracy of the repruned neural network and a learning weight of the repruned neural network;
comparing the determined learning weight to a lower limit threshold of the learning weight, in response to repeatedly performing the pruning-evaluation operation; and
determining whether to terminate a current pruning session and to initiate a subsequent pruning session in which the learning weight is set as an initial reference value, based on a result of the comparing.
|
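The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and everything specific in it is an assumption made for illustration: a "network" is a list of NumPy weight matrices whose rows stand for channels; `task_accuracy` is a hypothetical proxy (the fraction of pretrained weight energy that survives) rather than a real inference task; a channel is zeroed when its L2 norm falls below the learning weight times its pruning parameter; the halving and `(1 + learning weight)` update rules, the `tolerance`, and the fixed `sessions` count are likewise invented here. The sketch also commits a pruning step only when accuracy stays within tolerance, which the claim does not require.

```python
import numpy as np

def task_accuracy(weights, reference):
    """Hypothetical accuracy proxy: share of pretrained weight energy still present."""
    kept = sum(float(np.sum(w ** 2)) for w in weights)
    total = sum(float(np.sum(w ** 2)) for w in reference)
    return kept / total

def prune_channels(weights, params, learning_weight):
    """Zero each channel (row) whose L2 norm is below learning_weight * its parameter."""
    pruned = []
    for w, p in zip(weights, params):
        keep = np.linalg.norm(w, axis=1) >= learning_weight * p
        pruned.append(w * keep[:, None])
    return pruned

def run_session(weights, reference, first_acc, params,
                initial_lw, lower_limit, tolerance, max_steps=100):
    """One pruning session: repeat prune/evaluate until the learning weight is too small."""
    lw = initial_lw  # learning weight starts at the initial reference value
    for _ in range(max_steps):
        candidate = prune_channels(weights, params, lw)
        acc = task_accuracy(candidate, reference)
        if first_acc - acc > tolerance:
            # Accuracy dropped too far: shrink the learning weight (assumed rule)
            # and discard this candidate.
            lw *= 0.5
        else:
            # Accuracy acceptable: keep the pruned network and raise the
            # per-channel pruning parameters using the learning weight.
            weights = candidate
            params = [p * (1.0 + lw) for p in params]
        if lw < lower_limit:  # compare against the lower limit threshold
            break             # terminate the current session
    return weights, params

def prune(pretrained, initial_lw=1.0, lower_limit=0.1, tolerance=0.05, sessions=2):
    first_acc = task_accuracy(pretrained, pretrained)  # first task accuracy
    weights = [w.copy() for w in pretrained]
    params = [np.ones(w.shape[0]) for w in pretrained]  # one parameter per channel
    for _ in range(sessions):
        # Each subsequent session restarts with the learning weight reset.
        weights, params = run_session(weights, pretrained, first_acc, params,
                                      initial_lw, lower_limit, tolerance)
    return weights

# Toy "pretrained" network: two layers with 4 and 2 channels.
pretrained = [
    np.array([[0.1, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]),
    np.array([[0.2, 0.0], [0.0, 4.0]]),
]
pruned = prune(pretrained)
surviving = [int(np.count_nonzero(np.linalg.norm(w, axis=1))) for w in pruned]
print(surviving)  # [2, 1]
```

In this toy run the three weakest channels are removed and the stronger ones survive, because removing any of the remaining channels would push the proxy accuracy below the assumed tolerance. The second session resets the learning weight but finds nothing further to prune.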