| CPC G06N 3/063 (2013.01) [G06F 13/4022 (2013.01)] | 19 Claims |

|
1. A system comprising:
a high bandwidth memory (HBM) device comprising random access memory (RAM) distributed over multiple dies that are arranged in a stacked configuration, wherein a first portion of the RAM is configured as virtual banks of RAM dedicated to storing feature map data of a convolutional neural network (CNN) application program, and wherein a second portion of the RAM is dedicated to supporting data exchanges with a host;
a CNN engine comprising:
a convolutional instruction processor configured to execute convolutional layer instructions of the CNN application program; and
a depthwise convolutional instruction processor configured to execute depthwise layer instructions of the CNN application program; and
point-to-point interface circuitry configured to permit the convolutional instruction processor and the depthwise convolutional instruction processor to access respective first and second sets of the virtual banks of the RAM;
wherein the convolutional instruction processor and the depthwise convolutional instruction processor are further configured to write and read feature map data of the CNN application program to and from the respective first and second sets of the virtual banks of RAM via the point-to-point interface circuitry.
|
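
As an informal illustration of the arrangement recited in claim 1, the following Python sketch models a RAM split into feature-map virtual banks and a host-exchange region, with a point-to-point interface that routes each instruction processor to its own set of banks. It is a minimal sketch and not the claimed implementation. All names (`HbmModel`, `PointToPointInterface`, `"conv"`, `"depthwise"`), the bank counts, and the address scheme are hypothetical. The claim does not state whether the first and second sets of banks overlap; the sketch assumes disjoint sets purely for illustration.

```python
class HbmModel:
    """Toy model of the RAM: virtual banks for feature maps plus a host region."""

    def __init__(self, num_virtual_banks):
        # First portion: virtual banks dedicated to feature map data.
        self.virtual_banks = [dict() for _ in range(num_virtual_banks)]
        # Second portion: region dedicated to data exchanges with a host.
        self.host_region = {}


class PointToPointInterface:
    """Routes each processor to its own set of virtual banks (assumed disjoint)."""

    def __init__(self, hbm, routes):
        self.hbm = hbm
        # routes maps a processor name to the bank indices it may access.
        self.routes = {name: frozenset(banks) for name, banks in routes.items()}

    def _check(self, processor, bank):
        if bank not in self.routes[processor]:
            raise PermissionError(f"{processor} has no link to bank {bank}")

    def write(self, processor, bank, addr, value):
        self._check(processor, bank)
        self.hbm.virtual_banks[bank][addr] = value

    def read(self, processor, bank, addr):
        self._check(processor, bank)
        return self.hbm.virtual_banks[bank][addr]


# Hypothetical configuration: 8 virtual banks, split evenly between processors.
hbm = HbmModel(num_virtual_banks=8)
p2p = PointToPointInterface(
    hbm,
    routes={"conv": range(0, 4), "depthwise": range(4, 8)},
)

# Each processor writes and reads feature map data through the interface.
p2p.write("conv", bank=0, addr=0x10, value=[1.0, 2.0])
p2p.write("depthwise", bank=4, addr=0x10, value=[3.0])
```

In this sketch a read by `"depthwise"` from bank 0 raises `PermissionError`, which stands in for the absence of a physical link between that processor and that bank.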