US 12,271,321 B2
AI accelerator apparatus using in-memory compute chiplet devices for transformer workloads
Sudeep Bhoja, Santa Clara, CA (US); and Siddharth Sheth, Santa Clara, CA (US)
Assigned to d-MATRIX CORPORATION, Santa Clara, CA (US)
Filed by d-MATRIX CORPORATION, Santa Clara, CA (US)
Filed on Oct. 24, 2023, as Appl. No. 18/493,616.
Application 18/493,616 is a continuation of application No. 17/538,923, filed on Nov. 30, 2021, granted, now 11,847,072.
Prior Publication US 2024/0241841 A1, Jul. 18, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 13/16 (2006.01); G06F 1/10 (2006.01); G06F 13/42 (2006.01)
CPC G06F 13/1668 (2013.01) [G06F 1/10 (2013.01); G06F 13/4291 (2013.01); G06F 2213/0026 (2013.01)]
20 Claims
 
1. An AI accelerator apparatus, the apparatus comprising:
a global CPU coupled to one or more chiplets and configured to receive a plurality of matrix inputs;
wherein each of the chiplets comprises a plurality of tiles;
wherein each of the tiles comprises a plurality of slices, a CPU coupled to the plurality of slices, and a hardware dispatch device coupled to the CPU; and
wherein each of the plurality of slices includes a digital in memory compute (DIMC) device coupled to a clock;
wherein each of the CPUs of the plurality of tiles is configured to receive a portion of the plurality of matrix inputs from the global CPU via a global reduced instruction set computer (RISC) interface; and
wherein each of the digital in memory compute (DIMC) devices is configured to perform a throughput of one or more matrix computations using one or more of the plurality of matrix inputs such that the throughput is characterized by a plurality of multiply accumulates per a clock cycle.
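
For illustration only, the hierarchy recited in claim 1 (global CPU, chiplets, tiles, slices with one DIMC device each) can be sketched as a toy software model. This is not the claimed hardware: the class and function names, the row-wise partitioning of the matrix inputs, and the `macs_per_cycle` figures are all assumptions made for this sketch, and the hardware dispatch device and the global RISC interface are not modelled. The sketch shows only how a portion of the matrix inputs might reach each tile and how a throughput of several multiply-accumulates per clock cycle translates into a cycle count.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Slice:
    """One slice with a DIMC device; macs_per_cycle is an assumed figure."""
    macs_per_cycle: int = 64

    def matvec(self, rows, x):
        """Return (rows @ x, clock cycles) for this slice's share of rows."""
        result = [sum(w * v for w, v in zip(row, x)) for row in rows]
        macs = len(rows) * len(x)  # one multiply-accumulate per weight
        return result, ceil(macs / self.macs_per_cycle)


def _split_and_run(workers, run, rows, x):
    """Hand contiguous row blocks to workers that operate in parallel.

    The cycle count is that of the slowest worker (hypothetical policy).
    """
    chunk = ceil(len(rows) / len(workers))
    out, cycles = [], 0
    for i, worker in enumerate(workers):
        block = rows[i * chunk:(i + 1) * chunk]
        if not block:
            continue  # more workers than rows: this one stays idle
        partial, c = run(worker, block, x)
        out.extend(partial)
        cycles = max(cycles, c)
    return out, cycles


@dataclass
class Tile:
    """A tile: a tile-level CPU (modelled as run) in front of several slices."""
    slices: list

    def run(self, rows, x):
        return _split_and_run(self.slices, Slice.matvec, rows, x)


def global_dispatch(chiplets, weights, x):
    """Global CPU: give each tile of each chiplet a portion of the inputs."""
    tiles = [tile for chiplet in chiplets for tile in chiplet]
    return _split_and_run(tiles, Tile.run, weights, x)


def build(n_chiplets, tiles_per_chiplet, slices_per_tile, macs_per_cycle):
    """Build a nested list of chiplets -> tiles -> slices (assumed sizes)."""
    return [
        [Tile([Slice(macs_per_cycle) for _ in range(slices_per_tile)])
         for _ in range(tiles_per_chiplet)]
        for _ in range(n_chiplets)
    ]
```

Under these assumptions, `global_dispatch(build(2, 2, 2, 4), weights, x)` returns the matrix-vector product together with a modelled cycle count; comparing it with `build(1, 1, 1, 4)` on the same inputs shows the same result computed in more cycles, which is the effect of spreading the multiply-accumulates across more DIMC devices.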