US 11,836,102 B1
Low latency and high bandwidth artificial intelligence processor
Amrita Mathuriya, Portland, OR (US); Rajeev Kumar Dokania, Beaverton, OR (US); Ananda Samajdar, Hillsboro, OR (US); and Sasikanth Manipatruni, Portland, OR (US)
Assigned to KEPLER COMPUTING INC., San Francisco, CA (US)
Filed by Kepler Computing Inc., San Francisco, CA (US)
Filed on Mar. 18, 2020, as Appl. No. 16/823,209.
Claims priority of provisional application 62/821,328, filed on Mar. 20, 2019.
Int. Cl. G06F 7/52 (2006.01); G06F 13/40 (2006.01); G06F 17/16 (2006.01); G06F 7/50 (2006.01); G06F 13/16 (2006.01); G11C 11/22 (2006.01); G06N 20/00 (2019.01); H01L 25/065 (2023.01); G06N 3/063 (2023.01)
CPC G06F 13/4027 (2013.01) [G06F 7/50 (2013.01); G06F 13/1668 (2013.01); G06F 17/16 (2013.01); G06N 20/00 (2019.01); G11C 11/22 (2013.01); H01L 25/0657 (2013.01); G06N 3/063 (2023.01)]
37 Claims
 
1. An apparatus comprising:
a substrate;
a first die on the substrate, the first die including a plurality of random access memory (RAM) tiles to store input data, weight factors, and outputs;
a second die over the first die, wherein the first die is between the substrate and the second die; and
a heat sink over the second die, wherein the second die is between the heat sink and the first die, wherein the substrate is at a reference level of an x-y plane, wherein the first die is positioned above the reference level along a positive z-axis at a first z-plane, wherein the second die is positioned above the first z-plane along the positive z-axis at a second z-plane, wherein the second z-plane is higher than the first z-plane along an x-axis and relative to the reference level, wherein the second die includes a plurality of compute tiles, wherein an individual compute tile is substantially vertically aligned with an individual RAM tile of the plurality of RAM tiles, wherein the individual RAM tile includes non-linear polar material, and wherein the individual compute tile includes:
a matrix multiplier communicatively coupled to one or more RAM tiles of the first die; and
a buffer communicatively coupled to the one or more RAM tiles of the first die.
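
The arrangement recited in claim 1 can be pictured with a small software model. The following is only an illustrative sketch, not the patented hardware or any implementation disclosed by the applicant: the names STACK_ORDER, RamTile, ComputeTile, and run are hypothetical, the one-to-one pairing of a compute tile with a single RAM tile is a simplifying assumption, and the plain nested-list multiply merely stands in for the matrix multiplier.

```python
from dataclasses import dataclass, field

# Bottom-to-top stacking order along the positive z-axis, as recited in claim 1.
# The labels are illustrative names, not terms defined by the patent.
STACK_ORDER = [
    "substrate",
    "first die (RAM tiles)",
    "second die (compute tiles)",
    "heat sink",
]


@dataclass
class RamTile:
    """Hypothetical model of one RAM tile on the first die.

    Stores input data, weight factors, and outputs as nested lists.
    """
    inputs: list
    weights: list
    outputs: list = field(default_factory=list)


@dataclass
class ComputeTile:
    """Hypothetical model of one compute tile on the second die.

    Assumes the tile is paired with a single, vertically aligned RAM tile.
    """
    ram: RamTile
    buffer: list = field(default_factory=list)

    def run(self):
        # Buffer: stage a copy of the input matrix read from the RAM tile.
        self.buffer = [row[:] for row in self.ram.inputs]
        # Matrix multiplier: outputs = inputs x weights (row-by-column dot products).
        weight_columns = list(zip(*self.ram.weights))
        self.ram.outputs = [
            [sum(a * b for a, b in zip(row, col)) for col in weight_columns]
            for row in self.buffer
        ]
        # Results are written back to the same RAM tile.
        return self.ram.outputs


if __name__ == "__main__":
    ram = RamTile(inputs=[[1, 2], [3, 4]], weights=[[5, 6], [7, 8]])
    tile = ComputeTile(ram=ram)
    print(tile.run())
```

In this sketch the compute tile reads inputs and weights from its paired RAM tile and writes the product back to it, which mirrors the claim's statement that the RAM tiles store input data, weight factors, and outputs.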