US 12,455,737 B2
Neural network compute tile
Olivier Temam, Antony (FR); Ravi Narayanaswami, San Jose, CA (US); Harshit Khaitan, San Jose, CA (US); and Dong Hyuk Woo, San Jose, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Nov. 9, 2023, as Appl. No. 18/505,743.
Application 18/505,743 is a continuation of application No. 17/892,807, filed on Aug. 22, 2022, granted, now 11,816,480.
Application 17/892,807 is a continuation of application No. 16/239,760, filed on Jan. 4, 2019, granted, now 11,422,801, issued on Aug. 23, 2022.
Application 16/239,760 is a continuation of application No. 15/335,769, filed on Oct. 27, 2016, granted, now 10,175,980, issued on Jan. 8, 2019.
Prior Publication US 2024/0231819 A1, Jul. 11, 2024
Int. Cl. G06F 9/30 (2018.01); G06F 9/38 (2018.01); G06F 13/28 (2006.01); G06N 3/04 (2023.01); G06N 3/045 (2023.01); G06N 3/063 (2023.01)
CPC G06F 9/3001 (2013.01) [G06F 9/30036 (2013.01); G06F 9/30065 (2013.01); G06F 9/3824 (2013.01); G06F 13/28 (2013.01); G06N 3/04 (2013.01); G06N 3/045 (2023.01); G06N 3/063 (2013.01)]
16 Claims
 
1. A hardware integrated circuit configured to implement a neural network, the integrated circuit comprising:
a first memory configured to store a first operand;
a second memory configured to store a second operand;
a linear computing unit comprising a plurality of multiplication cells having inputs that are coupled to the first memory and the second memory;
an output activation pipeline coupled to outputs of the plurality of multiplication cells, wherein the output activation pipeline includes at least one pipelined shift register; and
a non-linear unit coupled between the first memory and the at least one pipelined shift register, wherein the non-linear unit is configured to:
apply an activation function to the output of the linear computing unit that is shifted out of the at least one pipelined shift register, wherein the output comprises an accumulated value corresponding to a product of multiplying the first operand with the second operand using a multiplication cell of the linear computing unit; and
write, to the first memory, activations resulting from the applying of the activation function to the output of the linear computing unit.
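
The data flow recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class name ComputeTileSketch, the ReLU activation, the pipeline depth of 2, and the one-weight-row-per-cell layout are all hypothetical choices made for illustration. It shows operands read from two memories, multiply-accumulate results pushed through a shift register, an activation function applied to values as they are shifted out, and the resulting activations written back to the first memory.

```python
from collections import deque


def relu(x):
    """Example activation function (an assumption; the claim names none)."""
    return x if x > 0 else 0


class ComputeTileSketch:
    """Hypothetical software model of the claimed data flow."""

    def __init__(self, activations, weights, pipeline_depth=2):
        # First memory: holds the first operand (input activations).
        self.first_memory = list(activations)
        # Second memory: holds the second operand (one weight row per cell).
        self.second_memory = [list(row) for row in weights]
        # Output activation pipeline: a fixed-depth shift register whose
        # stages start empty (None).
        self.shift_register = deque([None] * pipeline_depth)

    def _multiply_accumulate(self, cell):
        # One multiplication cell: accumulate products of the two operands.
        acc = 0
        for a, w in zip(self.first_memory, self.second_memory[cell]):
            acc += a * w
        return acc

    def _shift(self, value):
        # Push a value into the first stage, return what leaves the last.
        self.shift_register.appendleft(value)
        return self.shift_register.pop()

    def run_layer(self, activation_fn=relu):
        results = []
        # Feed each cell's accumulated value into the shift register.
        for cell in range(len(self.second_memory)):
            shifted_out = self._shift(self._multiply_accumulate(cell))
            if shifted_out is not None:
                results.append(activation_fn(shifted_out))
        # Drain the values still held in the pipeline stages.
        for _ in range(len(self.shift_register)):
            shifted_out = self._shift(None)
            if shifted_out is not None:
                results.append(activation_fn(shifted_out))
        # Non-linear unit writes the activations back to the first memory.
        self.first_memory = results
        return results


tile = ComputeTileSketch(
    activations=[1, 2, 3],
    weights=[[1, 0, -1], [2, 1, 0], [0, 0, 1]],
)
print(tile.run_layer())  # [0, 4, 3]
```

Writing the activations back into first_memory means a second call to run_layer would treat them as the inputs of the next layer, which is one plausible reading of why the non-linear unit is coupled to the first memory.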