CPC G06F 17/16 (2013.01) [G06F 7/5443 (2013.01)] | 20 Claims |
12. An integrated circuit comprising:
a processor comprising a plurality of multiply-accumulate (MAC) units;
a load store memory connected to the processor;
a plurality of memories each comprising a lookup table, the plurality of memories connected in parallel to the processor; and
the processor configured to:
partition an input of a first data format across a plurality of lookup tables each residing in a respective memory;
read weight information from the load store memory and the partitioned input on a per column basis from the plurality of lookup tables;
perform a number of MAC operations per cycle between the weight information from the load store memory and the partitioned input read on a per column basis from the plurality of lookup tables, the number of MAC operations performed per cycle corresponding to a total number of columns of the plurality of lookup tables; and
generate, responsive to the MAC operations on the partitioned input, a plurality of outputs in a second data format.
|
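The data flow recited in claim 12 can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the patented implementation: the claim does not fix the data formats, the table dimensions, or how weights map onto MAC lanes. Here the first data format is assumed to be signed 8-bit integers, the second a saturated signed 32-bit accumulator, one weight is assumed to be broadcast to every lane in each cycle, and the names `partition_input` and `mac_cycles` are hypothetical.

```python
# Illustrative model only; formats, shapes, and names are assumptions.
INT8_MIN, INT8_MAX = -128, 127            # assumed "first data format"
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1  # assumed "second data format"


def partition_input(values, num_tables, num_columns):
    """Spread a flat input across `num_tables` lookup tables.

    Each table is a list of rows, and each row holds `num_columns` entries.
    """
    row_width = num_tables * num_columns
    if len(values) % row_width != 0:
        raise ValueError("input length must be a multiple of the total column count")
    for value in values:
        if not INT8_MIN <= value <= INT8_MAX:
            raise ValueError("input value outside the assumed 8-bit range")

    tables = [[] for _ in range(num_tables)]
    for start in range(0, len(values), row_width):
        chunk = values[start:start + row_width]
        for t in range(num_tables):
            tables[t].append(chunk[t * num_columns:(t + 1) * num_columns])
    return tables


def mac_cycles(tables, weights):
    """Run one MAC per table column per cycle and return the lane outputs.

    The number of MAC operations per cycle equals the total number of
    columns across all tables, as the claim recites.
    """
    num_cycles = len(tables[0])
    if len(weights) != num_cycles:
        raise ValueError("expected one weight per cycle")

    macs_per_cycle = sum(len(table[0]) for table in tables)
    accumulators = [0] * macs_per_cycle

    for cycle in range(num_cycles):
        weight = weights[cycle]      # stands in for a load store memory read
        lane = 0
        for table in tables:         # tables are read in parallel in hardware
            for entry in table[cycle]:
                accumulators[lane] += weight * entry
                lane += 1

    # Convert to the assumed second data format by saturating to 32 bits.
    outputs = [max(INT32_MIN, min(INT32_MAX, acc)) for acc in accumulators]
    return outputs, macs_per_cycle


tables = partition_input([1, 2, 3, 4, 5, 6, 7, 8], num_tables=2, num_columns=2)
outputs, macs = mac_cycles(tables, weights=[10, 1])
print(outputs, macs)  # prints [15, 26, 37, 48] 4
```

With two tables of two columns each, the model performs four MAC operations per cycle, one per column, and yields four outputs. The sequential loops only stand in for what the claimed circuit would do concurrently across its MAC units.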