CPC G06N 3/084 (2013.01) [G06F 9/5027 (2013.01); G06F 18/2148 (2023.01); G06N 3/063 (2013.01)] | 15 Claims |
1. A deep neural network training accelerator implemented in hardware and configured to increase learning speed and reduce learning energy, the deep neural network training accelerator comprising:
an operational unit, executed by the deep neural network training accelerator, to sequentially perform first and second operations on a plurality of input data of a subset according to mini-batch gradient descent;
a determination unit, executed by the deep neural network training accelerator, to determine each of the plurality of input data as one of skip data and training data based on a confidence matrix obtained by the first operation; and
a control unit, executed by the deep neural network training accelerator, to control the operational unit to skip the second operation with respect to the skip data,
wherein the determination unit comprises a comparator, the comparator executed by the deep neural network training accelerator to:
compare a largest element among elements of the confidence matrix with a predetermined threshold,
output a low signal corresponding to the skip data to the control unit when a value of the largest element is equal to or greater than the predetermined threshold, and output a high signal corresponding to the training data to the control unit when the value of the largest element is smaller than the predetermined threshold, and
wherein the control unit is executed by the deep neural network training accelerator to output a parallelization control signal to the operational unit in response to the low signal being received, and to control the operational unit to skip the second operation on the skip data and perform the second operation on the training data based on the parallelization control signal,
wherein the performing of the second operation comprises:
initializing, by the operational unit, any one operational device corresponding to the skip data among operational devices in response to the parallelization control signal;
reassigning, by the operational unit, a portion of each of the training data assigned to the other operational devices to the any one operational device; and
processing, by the operational unit, the second operation with respect to the training data in parallel using the operational devices after a predetermined time has elapsed from a time at which the first operation is performed.
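The claimed flow (first operation producing a confidence matrix, a comparator against a threshold, skipping the second operation for confident samples, and reassigning the freed operational device's work) can be sketched in software. The toy softmax classifier, the threshold value, the cross-entropy gradient, and the round-robin reassignment below are illustrative assumptions for exposition, not the patented hardware design:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train_step(W, x_batch, y_batch, threshold=0.9, lr=0.1):
    """One mini-batch step with confidence-based skipping.

    First operation: forward pass producing a confidence matrix
    (here, softmax probabilities). A sample whose largest element
    is >= threshold is 'skip data' (low signal); the second
    operation (gradient update) runs only on the remaining
    'training data' (high signal).
    """
    # First operation: forward pass -> confidence matrix
    probs = softmax(x_batch @ W)
    # Comparator: largest element of each row vs. the threshold
    confident = probs.max(axis=1) >= threshold   # low signal -> skip
    train_mask = ~confident                      # high signal -> train

    if train_mask.any():
        # Second operation on training data only; skip data excluded
        xs, ps = x_batch[train_mask], probs[train_mask]
        onehot = np.eye(W.shape[1])[y_batch[train_mask]]
        grad = xs.T @ (ps - onehot) / xs.shape[0]  # cross-entropy gradient
        W -= lr * grad
    return W, int(confident.sum())

def rebalance(train_indices, n_devices):
    """Redistribute surviving training samples across all operational
    devices, including any device freed by skipped samples (a simple
    round-robin stand-in for the claimed reassignment)."""
    return [train_indices[i::n_devices] for i in range(n_devices)]
```

With a high threshold no sample is confident and every sample is trained; with a threshold below the model's confidence every sample is skipped and the weights are left untouched, which is the energy saving the claim targets.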