US 11,899,745 B1
Systems and methods for speech or text processing using matrix operations
Alagappan Valliappan, Redmond, WA (US); Ganesh Venkatesh, San Jose, CA (US); and Pierce I-Jen Chuang, Sunnyvale, CA (US)
Assigned to Meta Platforms Technologies, LLC, Menlo Park, CA (US)
Filed by Meta Platforms Technologies, LLC, Menlo Park, CA (US)
Filed on Aug. 19, 2020, as Appl. No. 16/997,401.
Int. Cl. G06F 17/16 (2006.01); G06F 7/544 (2006.01)
CPC G06F 17/16 (2013.01) [G06F 7/5443 (2013.01)]
20 Claims
 
12. An integrated circuit comprising:
a processor comprising a plurality of multiply-accumulate (MAC) units;
a load store memory connected to the processor;
a plurality of memory each comprising a lookup table, the plurality of memory connected in parallel to the processor; and
the processor configured to:
partition an input of a first data format across a plurality of lookup tables each residing in a respective memory;
read weight information from the load store memory and the partitioned input on a per column basis from the plurality of lookup tables;
perform a number of MAC operations per cycle between the weight information from the load store memory and the partitioned input read on a per column basis from the plurality of lookup tables, the number of MAC operations performed per cycle corresponding to a total number of columns of the plurality of lookup tables; and
generate, responsive to the MAC operations on the partitioned input, a plurality of outputs in a second data format.
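
The data flow recited in claim 12 can be pictured with a small software model. The following is a minimal sketch only, not the patented circuit: the names `partition_input` and `mac_matvec`, the row-major layout of each table, the use of integers as the first data format, and the scaled floating-point value standing in for the second data format are all assumptions made for illustration. Each pass of the inner loop models one cycle, in which one value is read from every column of every lookup table, so the number of MAC operations per cycle equals the total number of columns.

```python
from typing import List, Tuple


def partition_input(x: List[int], num_luts: int, cols: int) -> List[List[List[int]]]:
    """Spread the input across num_luts tables, each with `cols` columns.

    Hypothetical layout: luts[t][r][c] holds the value read from column c
    of table t during cycle r.
    """
    per_cycle = num_luts * cols  # total number of columns across all tables
    if per_cycle <= 0 or len(x) % per_cycle != 0:
        raise ValueError("input length must be a multiple of num_luts * cols")
    rows = len(x) // per_cycle
    luts = [[[0] * cols for _ in range(rows)] for _ in range(num_luts)]
    for i, value in enumerate(x):
        r, rem = divmod(i, per_cycle)
        t, c = divmod(rem, cols)
        luts[t][r][c] = value
    return luts


def mac_matvec(
    weights: List[List[int]],
    x: List[int],
    num_luts: int,
    cols: int,
    scale: float = 1.0,
) -> Tuple[List[float], int]:
    """Compute weights @ x with num_luts * cols MAC operations per modeled cycle.

    `weights` stands in for the load store memory. The integer accumulator is
    multiplied by `scale` to produce a float, which stands in for the second
    data format (an assumption; the claim does not name the formats).
    """
    luts = partition_input(x, num_luts, cols)
    rows = len(luts[0])
    per_cycle = num_luts * cols
    outputs: List[float] = []
    cycles = 0
    for w_row in weights:
        if len(w_row) != len(x):
            raise ValueError("each weight row must match the input length")
        acc = 0
        for r in range(rows):
            # One modeled cycle: read every column of every table in parallel.
            column_values = [luts[t][r][c] for t in range(num_luts) for c in range(cols)]
            w_slice = w_row[r * per_cycle:(r + 1) * per_cycle]
            # per_cycle multiply-accumulate operations.
            acc += sum(w * v for w, v in zip(w_slice, column_values))
            cycles += 1
        outputs.append(acc * scale)
    return outputs, cycles
```

With two tables of two columns each, an eight-element input takes two modeled cycles per output, with four MAC operations in each cycle. Adding tables or columns in this model increases the MAC operations per cycle and lowers the cycle count proportionally, which is the relationship the claim ties to the total number of columns.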