CPC G11C 11/34 (2013.01) [G06F 9/30007 (2013.01); G06F 9/3877 (2013.01); G06F 9/3893 (2013.01); G06F 9/5027 (2013.01); G06F 17/16 (2013.01); G06N 3/063 (2013.01); G06N 3/10 (2013.01)] | 19 Claims |
1. A device, comprising:
an integrated circuit package enclosing components of the device, the components enclosed within the integrated circuit package including:
an accelerator for deep learning, the accelerator having:
at least one processing unit configured to execute instructions, each of the instructions having one or more matrix operands and configured to instruct the at least one processing unit to perform an operation on the one or more matrix operands;
a control unit;
local memory; and
a memory interface;
random access memory configured to have:
a first region configured to store the instructions and store matrices of an artificial neural network, the instructions executable by the at least one processing unit of the accelerator;
a second region configured to store inputs to the artificial neural network; and
a third region configured to store outputs generated by the accelerator autonomously executing the instructions to process, using the matrices stored in the first region, the inputs in the second region, wherein the control unit is configured to load, in response to the inputs being written into the second region, the instructions from the first region of the random access memory for execution by the at least one processing unit; and
at least two interfaces configured to access, via a connection between the memory interface of the accelerator and the random access memory, the random access memory concurrently by at least two devices that are external to the device, wherein the at least two interfaces include:
a first interface coupled to the third region and configured to provide a central processing unit configured outside of the integrated circuit package with access to obtain the outputs from the third region; and
a second interface coupled to the second region and configured to provide a direct memory access controller configured outside of the integrated circuit package with access to write the inputs into the second region.
|
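The data flow recited in claim 1 can be illustrated with a minimal software sketch. It is only a model under stated assumptions, not the claimed hardware: the names `SharedRam`, `Instruction`, `Accelerator`, `DmaInterface`, and `CpuInterface` are hypothetical, the only instruction modeled is a matrix multiply, and the two interfaces are called one after the other rather than concurrently.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product, standing in for one matrix-operand instruction."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


@dataclass
class Instruction:
    # Hypothetical encoding: an operation plus the index of a stored matrix.
    op: Callable[[Matrix, Matrix], Matrix]
    matrix_index: int


@dataclass
class SharedRam:
    # First region: instructions and the matrices of the neural network.
    instructions: List[Instruction] = field(default_factory=list)
    matrices: List[Matrix] = field(default_factory=list)
    # Second region: inputs to the network.
    inputs: List[Matrix] = field(default_factory=list)
    # Third region: outputs generated by the accelerator.
    outputs: List[Matrix] = field(default_factory=list)


class Accelerator:
    """Models the control unit and processing unit acting on the shared RAM."""

    def __init__(self, ram: SharedRam) -> None:
        self.ram = ram

    def on_input_written(self) -> None:
        # Control unit: in response to an input landing in the second region,
        # load the instructions from the first region and execute them.
        x = self.ram.inputs.pop(0)
        for ins in self.ram.instructions:
            x = ins.op(self.ram.matrices[ins.matrix_index], x)
        self.ram.outputs.append(x)


class DmaInterface:
    """Second interface: lets an external DMA controller write inputs."""

    def __init__(self, ram: SharedRam, accel: Accelerator) -> None:
        self.ram, self.accel = ram, accel

    def write_input(self, x: Matrix) -> None:
        self.ram.inputs.append(x)
        self.accel.on_input_written()  # the write itself triggers execution


class CpuInterface:
    """First interface: lets an external CPU read outputs."""

    def __init__(self, ram: SharedRam) -> None:
        self.ram = ram

    def read_output(self) -> Optional[Matrix]:
        return self.ram.outputs.pop(0) if self.ram.outputs else None


# Usage: a two-layer network stored in the first region.
ram = SharedRam(
    instructions=[Instruction(matmul, 0), Instruction(matmul, 1)],
    matrices=[[[1, 2], [3, 4]], [[1, 1]]],
)
accel = Accelerator(ram)
dma, cpu = DmaInterface(ram, accel), CpuInterface(ram)
dma.write_input([[1], [1]])
result = cpu.read_output()  # [[10]]
```

In this sketch the external CPU never issues a command to the accelerator. Writing into the input region is the only trigger, which corresponds to the "autonomously executing" language of the claim.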