US 12,327,191 B2
Method and apparatus with neural network pruning
Won-Jo Lee, Suwon-si (KR); Youngmin Oh, Suwon-si (KR); and Minkyoung Cho, Incheon (KR)
Assigned to Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed by SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed on Apr. 21, 2021, as Appl. No. 17/236,503.
Claims priority of application No. 10-2020-0132151 (KR), filed on Oct. 13, 2020.
Prior Publication US 2022/0114453 A1, Apr. 14, 2022
Int. Cl. G06N 3/08 (2023.01); G06F 18/21 (2023.01); G06N 3/082 (2023.01); G06V 10/75 (2022.01)
CPC G06N 3/082 (2013.01) [G06F 18/217 (2023.01); G06V 10/751 (2022.01)]
17 Claims
 
1. A neural network pruning method, the method comprising:
acquiring a first task accuracy of an inference task processed by a pretrained neural network;
pruning, based on a channel unit, the neural network by adjusting weights between nodes of channels based on a preset learning weight and based on a channel-by-channel pruning parameter corresponding to a channel of each of a plurality of layers of the pretrained neural network;
updating the learning weight based on the first task accuracy and a task accuracy of the pruned neural network;
updating the channel-by-channel pruning parameter based on the updated learning weight and the task accuracy of the pruned neural network;
repruning, based on the channel unit, the pruned neural network based on the updated learning weight and based on the updated channel-by-channel pruning parameter;
repeatedly performing a pruning-evaluation operation of determining a task accuracy of the repruned neural network and a learning weight of the repruned neural network;
comparing the determined learning weight to a lower limit threshold of the learning weight, in response to repeatedly performing the pruning-evaluation operation; and
determining whether to terminate a current pruning session and to initiate a subsequent pruning session in which the learning weight is set as an initial reference value, based on a result of the comparing.
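
The control flow recited in claim 1 can be illustrated with a small, self-contained sketch. It is not the patented implementation: the claim does not give update formulas, so everything concrete here is an assumption made for illustration. In particular, the pruning parameter is simplified to one L1-norm threshold per layer applied to each channel, the learning weight is used as the step added to that threshold, the learning weight is halved (`decay`) whenever accuracy falls more than `tolerance` below the first task accuracy, and the outer loop stops when a session makes no further progress. The names `prune`, `run_session`, `run_sessions`, and `make_toy_evaluator` are hypothetical, and the "accuracy" is a toy proxy (fraction of weight magnitude retained), not a real inference task.

```python
from typing import Callable, List, Tuple

Layer = List[List[float]]  # a layer is a list of channels; a channel is a list of weights


def l1_norm(channel: List[float]) -> float:
    return sum(abs(w) for w in channel)


def prune(layers: List[Layer], thresholds: List[float]) -> List[Layer]:
    """Channel-unit pruning: zero every channel whose L1 norm is below its layer's threshold."""
    return [
        [list(ch) if l1_norm(ch) >= t else [0.0] * len(ch) for ch in layer]
        for layer, t in zip(layers, thresholds)
    ]


def make_toy_evaluator(reference: List[Layer]) -> Callable[[List[Layer]], float]:
    """Toy stand-in for task accuracy: retained weight magnitude / original magnitude."""
    total = sum(l1_norm(ch) for layer in reference for ch in layer)

    def evaluate(layers: List[Layer]) -> float:
        return sum(l1_norm(ch) for layer in layers for ch in layer) / total

    return evaluate


def run_session(layers, evaluate, start, initial_lw=0.5, lower_limit=0.1,
                tolerance=0.1, decay=0.5, max_steps=100) -> Tuple[List[float], float, int]:
    """One pruning session. Returns (accepted thresholds, final learning weight, steps)."""
    baseline = evaluate(layers)              # first task accuracy (pretrained network)
    lw = initial_lw                          # learning weight starts at its reference value
    accepted = list(start)                   # last thresholds that kept accuracy in tolerance
    thresholds = [t + lw for t in accepted]  # initial pruning parameter
    for step in range(1, max_steps + 1):
        accuracy = evaluate(prune(layers, thresholds))  # prune, then evaluate
        if baseline - accuracy <= tolerance:
            accepted = list(thresholds)      # accuracy held: keep this pruning level
        else:
            lw *= decay                      # accuracy dropped: shrink the learning weight
        if lw < lower_limit:                 # compare with the lower limit threshold
            return accepted, lw, step        # terminate the current session
        thresholds = [t + lw for t in accepted]  # update the parameter before repruning
    return accepted, lw, max_steps


def run_sessions(layers, evaluate, max_sessions=5, **kwargs) -> Tuple[List[float], int]:
    """Start a new session (learning weight reset) until a session makes no progress."""
    thresholds = [0.0] * len(layers)
    session = 0
    for session in range(1, max_sessions + 1):
        new_thresholds, _, _ = run_session(layers, evaluate, thresholds, **kwargs)
        if new_thresholds == thresholds:     # assumed stop rule: nothing more was pruned
            break
        thresholds = new_thresholds
    return thresholds, session


# Toy "pretrained network": two layers with three and two channels.
layers = [
    [[0.125, -0.125], [1.0, 1.0], [0.25, 0.25]],
    [[0.0625, 0.0625], [1.0, -0.5]],
]
evaluate = make_toy_evaluator(layers)
final_thresholds, sessions_run = run_sessions(layers, evaluate)
pruned = prune(layers, final_thresholds)
```

In this sketch each session behaves like a shrinking-step search: the threshold keeps growing while accuracy stays within tolerance, and the step size (the learning weight) shrinks each time pruning goes too far. Once the learning weight falls below `lower_limit`, the session ends and the next one restarts from the accepted thresholds with the learning weight reset to `initial_lw`.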