US 11,687,831 B1
Method, product, and apparatus for a multidimensional processing array for hardware acceleration of convolutional neural network inference
Ngai Ngai William Hung, San Jose, CA (US); Dhiraj Goswami, Wilsonville, OR (US); Michael Patrick Zimmer, Chicago, IL (US); and Yong Liu, Cupertino, CA (US)
Assigned to Cadence Design Systems, Inc., San Jose, CA (US)
Filed by Cadence Design Systems, Inc., San Jose, CA (US)
Filed on Jun. 30, 2020, as Appl. No. 16/946,674.
Int. Cl. G06N 20/00 (2019.01)
CPC G06N 20/00 (2019.01)
20 Claims
 
8. A method, comprising:
receiving a machine learning processing job;
executing the machine learning processing job using parallel processing of multiple output pixels during a cycle at least by walking data across processing elements and broadcasting a single weight in a corresponding shared weight memory to a plurality of processing elements within a corresponding region for use in performing parallel multiplication operations, wherein each region of a plurality of regions includes a plurality of processing elements and a respective shared weight memory of a plurality of shared weight memories that is coupled to the plurality of processing elements in the corresponding region; and
generating an output indicating whether the machine learning processing job was successful or failed.
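
The data flow recited in claim 8 can be illustrated with a small software model. The following is a minimal sketch under stated assumptions, not the patented hardware or any implementation disclosed by the assignee: it reduces the problem to a one-dimensional convolution, models the parallel multiplications of one cycle with an ordinary loop, and gives each region its own filter. The names `Region`, `execute_job`, `num_pes`, and the `"successful"`/`"failed"` status strings are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Region:
    """Hypothetical region: several PEs coupled to one shared weight memory."""
    weights: List[float]  # contents of the region's shared weight memory
    num_pes: int          # number of processing elements in the region

    def run(self, data: List[float]) -> List[float]:
        k = len(self.weights)
        if len(data) < self.num_pes + k - 1:
            raise ValueError("not enough input data for this region")
        # Each PE holds one input value and accumulates one output pixel.
        regs = list(data[: self.num_pes])
        acc = [0.0] * self.num_pes
        next_in = self.num_pes
        for cycle in range(k):
            # A single weight is read once and broadcast to every PE.
            w = self.weights[cycle]
            # Parallel multiply-accumulate, modeled here sequentially.
            for pe in range(self.num_pes):
                acc[pe] += w * regs[pe]
            # Walk the data one PE over; the last PE loads a new input.
            if cycle < k - 1:
                regs = regs[1:] + [data[next_in]]
                next_in += 1
        return acc


def execute_job(data: List[float], filters: List[List[float]],
                num_pes: int) -> Dict[str, object]:
    """Run one region per filter and report whether the job succeeded."""
    try:
        regions = [Region(weights=f, num_pes=num_pes) for f in filters]
        outputs = [region.run(data) for region in regions]
        return {"status": "successful", "outputs": outputs}
    except ValueError as exc:
        return {"status": "failed", "error": str(exc)}


if __name__ == "__main__":
    result = execute_job([1, 2, 3, 4, 5, 6], [[1, 0, -1], [1, 1, 1]], num_pes=4)
    print(result)
```

In this sketch each region produces `num_pes` output pixels after as many cycles as there are weights, with one shared-memory read per cycle regardless of how many processing elements consume the weight. A job whose input is too short for the requested region size returns a `"failed"` status instead of raising.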