US 12,236,665 B2
Method and apparatus with neural network training
Hee Min Choi, Seoul (KR); and Hyoa Kang, Seoul (KR)
Assigned to Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed on Dec. 14, 2021, as Appl. No. 17/550,184.
Claims priority of application No. 10-2021-0009670 (KR), filed on Jan. 22, 2021; and application No. 10-2021-0061877 (KR), filed on May 13, 2021.
Prior Publication US 2022/0237890 A1, Jul. 28, 2022
Int. Cl. G06V 10/762 (2022.01); G06N 3/045 (2023.01); G06V 20/58 (2022.01)

CPC G06V 10/762 (2022.01) [G06N 3/045 (2023.01); G06V 20/58 (2022.01)]

25 Claims

1. A processor-implemented method with neural network training, comprising:

determining first backbone feature data corresponding to each input data by applying, to a first neural network model, two or more sets of the input data of the same scene, respectively;

determining second backbone feature data corresponding to each input data by applying, to a second neural network model, the two or more sets of the input data, respectively;

diversifying, using plural projection models and plural drop models, a view of the first backbone feature data output from the first neural network model and a view of the second backbone feature data output from the second neural network model, including:

determining, from the first backbone feature data, projection-based first embedded data using a first projection model and dropout-based first view data using a first drop model;

determining, from the second backbone feature data, projection-based second embedded data using a second projection model and dropout-based second view data using a second drop model; and

training either one or both of the first neural network model and the second neural network model based on a loss determined based on a combination of any two or more of the first embedded data, the first view data, the second embedded data, the second view data, and an embedded data clustering result.
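
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (linear, dropout, branch, combined_loss, the layer sizes, the dropout rate, and the cosine-distance loss) is a hypothetical choice made for illustration. The sketch assumes that each drop model is a dropout step followed by the same projection, that the two sets of input data are noisy copies of one scene vector, and that the loss combines four of the recited terms; the embedded data clustering result is omitted. Only the loss is computed; the parameter update would be left to an automatic-differentiation framework.

```python
import math
import random

def init_weights(rng, out_dim, in_dim):
    """Random dense-layer weights, stored as a list of rows."""
    return [[rng.gauss(0.0, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]

def linear(weights, x):
    """Apply a dense layer (no bias) to vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def dropout(rng, x, rate):
    """Inverted dropout: zero each feature with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]

def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def branch(rng, backbone, projection, x, drop_rate):
    """Return (embedded data, view data) for one input passed through one model."""
    feat = [math.tanh(v) for v in linear(backbone, x)]        # backbone feature data
    embedded = linear(projection, feat)                       # projection-based embedded data
    view = linear(projection, dropout(rng, feat, drop_rate))  # dropout-based view data (assumed form)
    return embedded, view

def combined_loss(rng, params, views, drop_rate=0.2):
    """Average cosine distance between outputs for input A and a target from input B."""
    x_a, x_b = views  # two sets of input data of the same scene
    z1_a, v1_a = branch(rng, params["backbone1"], params["proj1"], x_a, drop_rate)
    z1_b, _ = branch(rng, params["backbone1"], params["proj1"], x_b, drop_rate)
    z2_a, v2_a = branch(rng, params["backbone2"], params["proj2"], x_a, drop_rate)
    terms = [
        cosine_distance(z1_a, z1_b),  # first embedded data
        cosine_distance(v1_a, z1_b),  # first view data
        cosine_distance(z2_a, z1_b),  # second embedded data
        cosine_distance(v2_a, z1_b),  # second view data
    ]
    return sum(terms) / len(terms)

rng = random.Random(0)
params = {
    "backbone1": init_weights(rng, 6, 8), "proj1": init_weights(rng, 4, 6),
    "backbone2": init_weights(rng, 6, 8), "proj2": init_weights(rng, 4, 6),
}
scene = [rng.gauss(0.0, 1.0) for _ in range(8)]
views = (
    [v + rng.gauss(0.0, 0.1) for v in scene],  # first set of input data
    [v + rng.gauss(0.0, 0.1) for v in scene],  # second set of input data
)
loss = combined_loss(rng, params, views)
print(f"loss = {loss:.4f}")
```

Using the first model's embedding of the second input as the shared target is one arbitrary way to combine the terms; the claim only requires that the loss draw on any two or more of the recited quantities.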