US 12,481,871 B2
Incremental learning system with selective weight updates
Donghyuk Kwon, Seoul (KR); Leesup Kim, Daejeon (KR); Jaekang Shin, Hwaseong-si (KR); and Seungkyu Choi, Daegu (KR)
Assigned to Samsung Electronics Co., Ltd., Suwon-si (KR); and Korea Advanced Institute of Science and Technology, Daejeon (KR)
Filed by SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR); and Korea Advanced Institute of Science and Technology, Daejeon (KR)
Filed on Nov. 5, 2020, as Appl. No. 17/089,764.
Claims priority of application No. 10-2020-0041638 (KR), filed on Apr. 6, 2020; and application No. 10-2020-0090452 (KR), filed on Jul. 21, 2020.
Prior Publication US 2021/0312278 A1, Oct. 7, 2021
Int. Cl. G06N 3/08 (2023.01); G06N 3/0895 (2023.01); G06N 3/09 (2023.01); G06N 3/098 (2023.01); G06N 3/0985 (2023.01)
CPC G06N 3/08 (2013.01) [G06N 3/0895 (2023.01); G06N 3/09 (2023.01); G06N 3/098 (2023.01); G06N 3/0985 (2023.01)] 19 Claims
 
1. A processor-implemented neural network method, comprising:
setting a searching range of mask weights based on both of a distribution of the mask weights of a binary mask corresponding to a filter of a pretrained model and a learning rate-related parameter set in an incremental learning model;
identifying a targeted mask weight in the searching range of the mask weights;
calculating a weight gradient corresponding to the targeted mask weight based on an input activation of an input channel of a masked filter obtained from forward propagation of a training epoch process and a loss gradient obtained from back propagation of the training epoch process, wherein the masked filter is obtained by activating or deactivating each weight included in the filter based on the binary mask;
updating the targeted mask weight based on the weight gradient;
updating a portion of the binary mask corresponding to the updated targeted mask weight based on the updated targeted mask weight and a preset reference value; and
updating the masked filter by applying the updated binary mask to the masked filter for a next training epoch process of the pretrained model,
wherein the setting of the searching range comprises:
when the distribution of the mask weights is obtained, setting the searching range based on a mean of the mask weights from the distribution of the mask weights; and
when it is determined that the learning rate-related parameter that determines a level of learning rate decay has changed according to predefined criteria, setting the searching range based on the level of learning rate decay,
wherein the training epoch process is included in a process for training the pretrained model, which is configured to perform a first task, to perform a second task,
wherein a non-targeted mask weight not in the searching range is not updated during the training epoch process and is fixed as a value determined in a previous training epoch, and
wherein a value of the binary mask corresponding to the non-targeted mask weight is fixed as a value determined in the previous training epoch.
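
For illustration only, the steps recited in claim 1 can be sketched for a single flattened filter as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names `set_search_range` and `train_epoch_step`, the `base_width` parameter, the rule that halves the range width per learning rate decay level, the use of the product of input activation and loss gradient as the weight gradient, and the plain gradient-descent update are all hypothetical choices made here. The claim specifies only which quantities each step is based on.

```python
from statistics import mean


def set_search_range(mask_weights, lr_decay_level, base_width=0.5):
    """Return (low, high) centered on the mean of the mask weights.

    Assumption: the half-width is halved for each learning rate decay
    level. The claim only says the range is set based on the mean and
    on the level of learning rate decay.
    """
    center = mean(mask_weights)
    half_width = base_width / (2 ** lr_decay_level)
    return center - half_width, center + half_width


def train_epoch_step(mask_weights, binary_mask, filter_weights,
                     input_act, loss_grad, search_range, lr, reference):
    """Update only the targeted mask weights and their binary mask entries."""
    low, high = search_range
    new_mask_weights = list(mask_weights)
    new_binary_mask = list(binary_mask)
    for i, m in enumerate(mask_weights):
        if not (low <= m <= high):
            # Non-targeted: mask weight and mask bit keep the values
            # determined in the previous training epoch.
            continue
        # Assumption: weight gradient = input activation (forward pass)
        # times loss gradient (backward pass).
        grad = input_act[i] * loss_grad
        new_mask_weights[i] = m - lr * grad
        # Binarize against the preset reference value.
        new_binary_mask[i] = 1 if new_mask_weights[i] >= reference else 0
    # Apply the updated binary mask to obtain the masked filter
    # used in the next training epoch.
    masked_filter = [w * b for w, b in zip(filter_weights, new_binary_mask)]
    return new_mask_weights, new_binary_mask, masked_filter


# Toy usage with made-up values.
mask_weights = [0.02, -0.01, 0.5, -0.4]
binary_mask = [1, 0, 1, 0]
filter_weights = [1.0, 2.0, 3.0, 4.0]
search_range = set_search_range(mask_weights, lr_decay_level=1)
new_mw, new_mask, masked_filter = train_epoch_step(
    mask_weights, binary_mask, filter_weights,
    input_act=[1.0, -1.0, 1.0, 1.0], loss_grad=0.5,
    search_range=search_range, lr=0.1, reference=0.0)
print(new_mask, masked_filter)
```

In this toy run only the two mask weights near the mean fall inside the searching range, so only their gradients are computed. The two outlying mask weights and their mask bits are carried over unchanged, which reflects the selective update recited in the final two wherein clauses.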