CPC G06N 20/00 (2019.01) | 20 Claims
8. A method, comprising:
receiving a machine learning processing job;
executing the machine learning processing job using parallel processing of multiple output pixels during a cycle at least by walking data across processing elements and broadcasting a single weight in a corresponding shared weight memory to a plurality of processing elements within a corresponding region for use in performing parallel multiplication operations, wherein each region of a plurality of regions includes a plurality of processing elements and a respective shared weight memory of a plurality of shared weight memories that is coupled to the plurality of processing elements in the corresponding region; and
generating an output indicating whether the machine learning processing job was successful or failed.
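The dataflow recited in claim 8 can be illustrated with a small software simulation. The sketch below is not the claimed hardware and is not taken from the specification. It assumes a one-dimensional convolution, one output pixel per processing element, and zero padding past the end of the input. The names `region_convolve`, `run_job`, `num_regions`, and `pes_per_region` are hypothetical and chosen only for illustration. In each cycle a single weight from the region's shared weight memory is broadcast to every processing element in that region, all elements multiply in parallel, and the data then walks one position across the elements.

```python
def region_convolve(inputs, shared_weights, num_pes):
    """Simulate one region: num_pes processing elements, one shared weight memory.

    Assumption: each processing element (PE) accumulates one output pixel.
    """
    needed = num_pes + len(shared_weights) - 1
    # Assumption: zero-pad when the input window is shorter than required.
    data = list(inputs) + [0] * max(0, needed - len(inputs))

    held = data[:num_pes]   # data value currently sitting in each PE
    next_in = num_pes       # index of the next value to enter the region
    acc = [0] * num_pes     # one accumulator (output pixel) per PE

    for weight in shared_weights:      # one cycle per weight
        # Broadcast: every PE in the region uses the same single weight.
        for pe in range(num_pes):
            acc[pe] += held[pe] * weight
        # Walk: each PE takes its neighbor's value; a new value enters at the end.
        held = held[1:] + [data[next_in] if next_in < len(data) else 0]
        next_in += 1
    return acc


def run_job(inputs, weights, num_regions, pes_per_region):
    """Run the job over several regions and report success or failure."""
    try:
        outputs = []
        for region in range(num_regions):
            start = region * pes_per_region
            window = inputs[start:start + pes_per_region + len(weights) - 1]
            # Each region has its own shared weight memory (a copy here).
            outputs.extend(region_convolve(window, list(weights), pes_per_region))
        return {"status": "successful", "outputs": outputs}
    except Exception:
        # Assumption: any error during execution is reported as a failed job.
        return {"status": "failed", "outputs": None}


if __name__ == "__main__":
    print(run_job([1, 2, 3, 4, 5], [2, 1], num_regions=2, pes_per_region=2))
```

In this sketch, the number of weight fetches per region equals the number of weights rather than the number of multiplications, which is one plausible reading of why a single broadcast weight is shared among the processing elements of a region.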