US 11,989,259 B2
Low latency matrix multiply unit
Andrew Everett Phelps, Middleton, WI (US); and Norman Paul Jouppi, Palo Alto, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Nov. 10, 2022, as Appl. No. 17/985,069.
Application 17/985,069 is a continuation of application No. 16/830,894, filed on Mar. 26, 2020, granted, now 11,500,961.
Application 16/830,894 is a continuation of application No. 15/983,043, filed on May 17, 2018, granted, now 10,635,740, issued on Apr. 28, 2020.
Claims priority of provisional application 62/507,766, filed on May 17, 2017.
Prior Publication US 2023/0267171 A1, Aug. 24, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 17/16 (2006.01); G06F 5/01 (2006.01); G06F 9/30 (2018.01); G06F 15/80 (2006.01); G06N 3/04 (2023.01); G06N 3/08 (2023.01)
CPC G06F 17/16 (2013.01) [G06F 5/015 (2013.01); G06F 9/30101 (2013.01); G06F 15/8046 (2013.01); G06F 9/30032 (2013.01); G06N 3/04 (2013.01); G06N 3/08 (2013.01)]
20 Claims
1. A matrix multiply unit implemented as a two-dimensional systolic array comprising:
a plurality of cells arranged in columns of the systolic array;
at least one chain of weight shift registers per column of the systolic array; wherein each weight shift register is connected to only one chain and each cell is connected to only one weight shift register of the at least one chain, the at least one chain having two injection points for injecting weight values, one at the top of the column, and the other at a second point in the column;
a weight matrix register per cell configured to store a weight input received from a weight shift register; and
a multiply unit that is coupled to the weight matrix register and configured to multiply the weight input of the weight matrix register with a vector data input to obtain a multiplication result.
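
The column structure recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names `Cell`, `Column`, `shift`, `load`, and `multiply` are hypothetical, the second injection point is assumed to default to the midpoint of the column (the claim only requires "a second point in the column"), and weights are modeled as plain floats. It models one column with a single chain of weight shift registers, shows how two injection points let both segments of the chain fill in parallel, and then copies each shifted weight into the per-cell weight matrix register before multiplying.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Cell:
    """One cell of a column (hypothetical model, not the patented circuit)."""
    shift_reg: float = 0.0   # weight shift register, one link in the column's chain
    weight_reg: float = 0.0  # weight matrix register holding the active weight

    def latch(self) -> None:
        # Copy the shifted-in weight into the weight matrix register.
        self.weight_reg = self.shift_reg

    def multiply(self, x: float) -> float:
        # Multiply the stored weight by one vector data input.
        return self.weight_reg * x


class Column:
    """One column with a single chain and two injection points."""

    def __init__(self, n_cells: int, second_point: Optional[int] = None):
        # Assumption: the second injection point defaults to the midpoint.
        self.second_point = n_cells // 2 if second_point is None else second_point
        if not 0 < self.second_point < n_cells:
            raise ValueError("second_point must lie strictly inside the column")
        self.cells: List[Cell] = [Cell() for _ in range(n_cells)]

    def shift(self, top_value: float, second_value: float) -> None:
        # One clock cycle: each register takes the value of the register above
        # it, except the two injection points, which take the injected values.
        prev = [c.shift_reg for c in self.cells]
        for i, cell in enumerate(self.cells):
            if i == 0:
                cell.shift_reg = top_value
            elif i == self.second_point:
                cell.shift_reg = second_value
            else:
                cell.shift_reg = prev[i - 1]

    def load(self, weights: Sequence[float]) -> int:
        # Shift a full column of weights in through both injection points,
        # latch them, and return the number of shift cycles used.
        w = list(weights)
        if len(w) != len(self.cells):
            raise ValueError("need exactly one weight per cell")
        upper, lower = w[: self.second_point], w[self.second_point:]
        cycles = max(len(upper), len(lower))
        # The weight for the deepest cell of a segment is injected first;
        # the shorter segment is front-padded so both finish together.
        top_stream = [0.0] * (cycles - len(upper)) + upper[::-1]
        second_stream = [0.0] * (cycles - len(lower)) + lower[::-1]
        for top_value, second_value in zip(top_stream, second_stream):
            self.shift(top_value, second_value)
        for cell in self.cells:
            cell.latch()
        return cycles

    def multiply(self, vector: Sequence[float]) -> List[float]:
        # Per-cell multiplication results for one vector data input.
        return [c.multiply(x) for c, x in zip(self.cells, vector)]


if __name__ == "__main__":
    column = Column(4)
    used = column.load([1.0, 2.0, 3.0, 4.0])
    print(used, column.multiply([10.0, 10.0, 10.0, 10.0]))
```

In this model a four-cell column is loaded in two shift cycles rather than the four that a single injection point at the top would need, which is the latency benefit the two injection points are meant to provide.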