US 11,907,826 B2
Electronic apparatus for operating machine learning and method for operating machine learning
Kyoung-Hoon Kim, Gyeonggi-do (KR); Young-hwan Park, Gyeonggi-do (KR); Ki-seok Kwon, Seoul (KR); Suk-jin Kim, Seoul (KR); Chae-seok Im, Gyeonggi-do (KR); Han-su Cho, Gyeonggi-do (KR); Sang-bok Han, Gyeonggi-do (KR); Seung-won Lee, Gyeonggi-do (KR); and Kang-jin Yoon, Seoul (KR)
Assigned to Samsung Electronics Co., Ltd.
Filed by Samsung Electronics Co., Ltd., Gyeonggi-do (KR)
Filed on Mar. 23, 2018, as Appl. No. 15/934,341.
Claims priority of application No. 10-2017-0036715 (KR), filed on Mar. 23, 2017.
Prior Publication US 2018/0276532 A1, Sep. 27, 2018
Int. Cl. G06N 3/045 (2023.01); G10L 15/16 (2006.01); G06N 3/063 (2023.01); G06N 20/00 (2019.01); G06N 3/044 (2023.01)
CPC G06N 3/045 (2023.01) [G06N 3/044 (2023.01); G06N 3/063 (2013.01); G06N 20/00 (2019.01); G10L 15/16 (2013.01)]
9 Claims
OG exemplary drawing
 
1. An electronic apparatus for performing machine learning, the electronic apparatus comprising:
an integrated circuit configured to include a plurality of processing elements arranged in a predetermined pattern and share data between the plurality of processing elements which are adjacent to each other to perform an operation; and
a processor configured to:
control the integrated circuit to perform a convolution operation by applying a filter to input data,
wherein the processor is configured to divide a two-dimensional filter into a plurality of elements which are one-dimensional data by arranging the plurality of elements included in the two-dimensional filter in a predetermined order,
identify a plurality of elements, except for at least one element having a zero value among the plurality of elements,
control the integrated circuit to perform the convolution operation by inputting each of the identified plurality of elements to the plurality of processing elements in the predetermined order and sequentially applying the identified plurality of elements to the input data,
perform an operation of multiplying a first element of the identified plurality of elements with each of a plurality of first data values belonging to a first row of the input data, perform an operation of multiplying the first element of the identified plurality of elements with each of a plurality of second data values belonging to a second row of the input data, and an operation of multiplying a second element of the identified plurality of elements with each of the plurality of first data values,
perform an operation of multiplying the second element with each of the plurality of second data values,
when the operation for the first element is completed and the operation for the second element starts in the first row, shift a plurality of operation values for the first element in a predetermined first direction by a predetermined first interval,
perform an accumulation for operation values for the first element and operation values for the second element and obtain first accumulation values, and
shift the first accumulation values in a predetermined second direction by a predetermined second interval in the integrated circuit,
wherein the predetermined first direction is a direction for a position in which the second element is disposed based on a position in which the first element is disposed in the two-dimensional filter and the predetermined first interval is determined according to a number of the zero values existing between the position of the first element and the position of the second element in the plurality of elements arranged in a predetermined order,
wherein the predetermined second direction is a direction in which a third element is disposed based on the second element in the two-dimensional filter and the predetermined second interval is determined based on a number of the zero values existing between the second element and the third element,
wherein the plurality of processing elements form a network having a structure in which a tree topology network is coupled to a mesh topology network,
wherein the processor is further configured to control the plurality of processing elements to perform operations according to a convolutional neural network (CNN) algorithm and a recurrent neural network (RNN) algorithm using the network having the coupled structure, and
wherein the processor is further configured to control the plurality of processing elements to perform operations according to the mesh topology network in a convolution layer and a pooling layer of the CNN algorithm and perform an operation according to the tree topology network in a fully connected layer of the CNN algorithm and each layer of the RNN algorithm.
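
The multiply, shift, and accumulate steps recited in claim 1 can be illustrated with a small software model. The following is a minimal sketch, not the claimed hardware: it assumes a row-major ordering as the "predetermined order", a stride of 1, no padding, and no filter flip, and it models the array of processing elements as a plain Python grid in which each cell holds one input data value. The function names (nonzero_elements, shift, shift_accumulate_conv) are hypothetical and do not appear in the patent.

```python
def nonzero_elements(kernel):
    """Flatten a 2D filter in row-major order and drop zero-valued elements.

    Each entry is (row, col, value), so the original position of every
    remaining element is still known after the zeros are skipped.
    """
    return [(r, c, v)
            for r, row in enumerate(kernel)
            for c, v in enumerate(row)
            if v != 0]


def shift(grid, dr, dc):
    """Move every value dr rows down and dc columns right; gaps become 0."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            si, sj = i - dr, j - dc
            if 0 <= si < h and 0 <= sj < w:
                out[i][j] = grid[si][sj]
    return out


def shift_accumulate_conv(data, kernel):
    """Apply the nonzero filter elements one at a time to the whole input.

    Between two consecutive nonzero elements the running sums are shifted
    by the difference of their filter positions, so skipped zeros simply
    lengthen the shift instead of costing a multiply step.
    """
    h, w = len(data), len(data[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = h - kh + 1, w - kw + 1
    acc = [[0] * w for _ in range(h)]  # one accumulator per modelled PE
    prev = None
    for r, c, v in nonzero_elements(kernel):
        if prev is not None:
            # Direction and interval follow from the two filter positions.
            acc = shift(acc, r - prev[0], c - prev[1])
        for i in range(h):
            for j in range(w):
                acc[i][j] += v * data[i][j]  # multiply and accumulate
        prev = (r, c)
    if prev is None:  # all-zero filter: nothing to apply
        return [[0] * ow for _ in range(oh)]
    lr, lc = prev
    # The sum for output (i, j) ends up at the last element's offset.
    return [[acc[i + lr][j + lc] for j in range(ow)] for i in range(oh)]
```

In this model, a filter row such as 1, 0, 2 produces a single shift of two columns between the two multiply steps, which corresponds to the claim's statement that the shift interval depends on the number of zero values between consecutive elements. How the shifts are actually routed between adjacent processing elements in the integrated circuit is not modelled here.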