CPC G06N 20/00 (2019.01) [G06F 9/3836 (2013.01); G06F 9/4881 (2013.01); G06F 9/5027 (2013.01); G06F 9/5061 (2013.01); G06F 9/5066 (2013.01)] | 20 Claims |
1. A method for implementing a machine learning network by executing a computer program of instructions on a machine learning accelerator (MLA) comprising a plurality of interconnected storage elements (SEs) and processing elements (PEs), the instructions partitioned into blocks, the method comprising:
retrieving a block k of instructions from off-chip memory, the block (a) comprising a set of statically scheduled deterministic instructions executed by the SEs and PEs, and (b) specifying a number Nk of non-deterministic instructions for block k that must execute prior to executing the block k of instructions, wherein the non-deterministic instructions are contained in prior blocks;
keeping a count of the number of non-deterministic instructions for block k executed; and
executing the block k of instructions, only after the count of executed non-deterministic instructions for block k has reached Nk, wherein an execution order of the statically scheduled deterministic instructions in the block k does not change as a result of run-time conditions, branching or dependence on input values.
|
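The gating condition in claim 1 (block k executes only after its count of executed non-deterministic instructions reaches Nk) can be illustrated with a minimal Python sketch. This is not the patented implementation. The names `Block`, `BlockGate`, `on_nondeterministic_done`, and `try_execute`, and the instruction strings, are hypothetical and chosen only for illustration. The sketch models the counter and the fixed execution order in software and does not model the SEs, the PEs, or retrieval from off-chip memory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """Hypothetical model of one block k of instructions."""
    deterministic: List[str]  # statically scheduled instructions, in fixed order
    n_required: int           # Nk: non-deterministic instructions that must finish first


@dataclass
class BlockGate:
    """Counts completed non-deterministic instructions and gates block k on Nk."""
    block: Block
    count: int = 0
    executed: bool = False
    trace: List[str] = field(default_factory=list)

    def on_nondeterministic_done(self) -> None:
        # Called once per completed non-deterministic instruction from a prior block.
        self.count += 1

    def try_execute(self) -> bool:
        # Refuse to run until the count has reached Nk, and never run twice.
        if self.executed or self.count < self.block.n_required:
            return False
        # The order is taken directly from the static schedule; nothing here
        # branches on run-time conditions or input values.
        for instr in self.block.deterministic:
            self.trace.append(instr)
        self.executed = True
        return True


block = Block(deterministic=["load_se", "matmul_pe", "store_se"], n_required=2)
gate = BlockGate(block)

first_attempt = gate.try_execute()    # count is 0, below Nk = 2
gate.on_nondeterministic_done()
second_attempt = gate.try_execute()   # count is 1, still below Nk
gate.on_nondeterministic_done()
third_attempt = gate.try_execute()    # count has reached Nk, block runs
```

In this sketch the first two attempts are refused and the third runs the three instructions in their listed order, which mirrors the "only after the count ... has reached Nk" limitation of the claim.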