US 12,217,092 B2
Electronic device and controlling method of electronic device
Hyukjin Jeong, Suwon-si (KR); and Jihwan Yeo, Suwon-si (KR)
Assigned to SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed by SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed on Apr. 6, 2023, as Appl. No. 18/131,629.
Application 18/131,629 is a continuation of application No. PCT/KR2022/010428, filed on Jul. 18, 2022.
Claims priority of application No. 10-2021-0098630 (KR), filed on Jul. 27, 2021; and application No. 10-2021-0117221 (KR), filed on Sep. 2, 2021.
Prior Publication US 2023/0244534 A1, Aug. 3, 2023
Int. Cl. G06F 9/44 (2018.01); G06F 8/41 (2018.01); G06F 9/50 (2006.01); G06F 11/34 (2006.01)

CPC G06F 9/5027 (2013.01) [G06F 8/4441 (2013.01); G06F 8/453 (2013.01); G06F 11/3423 (2013.01)]

18 Claims

1. An electronic apparatus comprising:

memory configured to store data corresponding to a neural network model;

a neural network accelerator comprising:

a buffer configured to temporarily store the data corresponding to the neural network model, and

a core configured to perform a computation on the neural network model based on the data stored in the buffer; and

at least one processor configured to:

determine a plurality of combinations comprising fused layers and non-fused layers based on a method of selecting and fusing adjacent layers of the neural network model,

based on a capacity of the buffer, determine a size of a tile capable of being processed in one computation in the core to acquire feature values output by the fused layers and the non-fused layers,

based on a first memory usage and computation time for storing the feature values in the buffer, determine whether to store the feature values in the memory,

based on determining the size of the tile and determining to store the feature values in the memory, identify a first combination among the plurality of combinations to be used in a computation of the neural network model by:

calculating a data transmission time between the buffer and the memory, and

calculating a computation time of the core,

convert the data corresponding to the neural network model into a first graph of a predetermined form,

convert the first graph into a second graph corresponding to the first combination, and

based on the second graph, generate code in which the data corresponding to the neural network model can be processed in the neural network accelerator.
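
The selection steps recited in claim 1 (enumerating combinations of fused and non-fused adjacent layers, sizing a tile to the buffer capacity, and comparing data transmission time plus core computation time) can be illustrated with a minimal sketch. It is not the patented implementation. The names `Layer`, `Accelerator`, `tile_rows`, `group_cost`, and `select_combination` are hypothetical, and so is the cost model: it assumes row-wise tiling, feature values written to memory only at group boundaries, weights loaded once per group, a fixed per-tile transfer latency, and transfer and compute times that add rather than overlap. Graph conversion and code generation are not modeled.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Layer:
    in_row_bytes: int    # input feature bytes needed per output row (assumed)
    out_row_bytes: int   # output feature bytes produced per row (assumed)
    weight_bytes: int    # parameter bytes that must sit in the buffer
    ops_per_row: int     # arithmetic operations per output row


@dataclass(frozen=True)
class Accelerator:
    buffer_bytes: int      # capacity of the on-chip buffer
    bytes_per_time: float  # memory <-> buffer bandwidth, bytes per time unit
    ops_per_time: float    # core throughput, operations per time unit
    dma_latency: float     # fixed cost per tile transfer (assumed)


def enumerate_combinations(n_layers):
    """Every split of n adjacent layers into contiguous groups.

    A group (a, b) covers layers a..b-1; a group longer than one layer is fused.
    """
    combos = []
    for cuts in product([False, True], repeat=n_layers - 1):
        groups, start = [], 0
        for boundary, cut in enumerate(cuts, start=1):
            if cut:
                groups.append((start, boundary))
                start = boundary
        groups.append((start, n_layers))
        combos.append(groups)
    return combos


def tile_rows(group, buffer_bytes):
    """Largest number of rows per tile that fits in the buffer, or 0 if none."""
    weights = sum(layer.weight_bytes for layer in group)
    # Input of the first layer plus every layer's output stays in the buffer.
    per_row = group[0].in_row_bytes + sum(l.out_row_bytes for l in group)
    return max(0, (buffer_bytes - weights) // per_row)


def group_cost(group, rows, hw):
    """Estimated transfer time plus compute time for one group."""
    t = tile_rows(group, hw.buffer_bytes)
    if t == 0:
        return math.inf  # group does not fit in the buffer at any tile size
    n_tiles = math.ceil(rows / t)
    weights = sum(layer.weight_bytes for layer in group)
    # Only the group's input and final output cross the memory boundary.
    moved = rows * (group[0].in_row_bytes + group[-1].out_row_bytes) + weights
    transfer = moved / hw.bytes_per_time + n_tiles * hw.dma_latency
    compute = rows * sum(l.ops_per_row for l in group) / hw.ops_per_time
    return transfer + compute


def select_combination(layers, rows, hw):
    """Return the cheapest feasible combination and its estimated cost."""
    best, best_cost = None, math.inf
    for combo in enumerate_combinations(len(layers)):
        cost = sum(group_cost(layers[a:b], rows, hw) for a, b in combo)
        if cost < best_cost:
            best, best_cost = combo, cost
    return best, best_cost


# Illustrative numbers only; they do not come from the patent.
layers = [Layer(100, 100, 1000, 1000)] * 3
hw = Accelerator(buffer_bytes=4000, bytes_per_time=1000.0,
                 ops_per_time=1000.0, dma_latency=1.0)
best, cost = select_combination(layers, rows=100, hw=hw)
print(best, cost)
```

In this toy setting, fusing more layers removes intermediate traffic to memory but shrinks the tile, so the number of tile transfers grows. The sketch picks whichever combination minimizes the summed estimate.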