US 12,190,230 B2
Computation of neural network node by neural network inference circuit
Kenneth Duong, San Jose, CA (US); Jung Ko, San Jose, CA (US); and Steven L. Teig, Menlo Park, CA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Perceive Corporation, San Jose, CA (US)
Filed on Nov. 7, 2022, as Appl. No. 17/982,474.
Application 17/982,474 is a continuation of application No. 16/212,616, filed on Dec. 6, 2018, granted, now Pat. No. 11,501,138.
Claims priority of provisional application 62/773,164, filed on Nov. 29, 2018.
Claims priority of provisional application 62/773,162, filed on Nov. 29, 2018.
Claims priority of provisional application 62/753,878, filed on Oct. 31, 2018.
Claims priority of provisional application 62/742,802, filed on Oct. 8, 2018.
Claims priority of provisional application 62/724,589, filed on Aug. 29, 2018.
Claims priority of provisional application 62/660,914, filed on Apr. 20, 2018.
Prior Publication US 2023/0063274 A1, Mar. 2, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/063 (2023.01); G06F 1/03 (2006.01); G06F 5/01 (2006.01); G06F 7/544 (2006.01); G06F 9/30 (2018.01); G06F 17/10 (2006.01); G06F 17/16 (2006.01); G06N 3/048 (2023.01); G06N 3/06 (2006.01); G06N 3/08 (2023.01); G06N 3/084 (2023.01); G06N 5/04 (2023.01); G06N 5/046 (2023.01); G06N 20/00 (2019.01)
CPC G06N 3/063 (2013.01) [G06F 1/03 (2013.01); G06F 5/01 (2013.01); G06F 7/5443 (2013.01); G06F 9/30098 (2013.01); G06F 9/30145 (2013.01); G06F 17/10 (2013.01); G06F 17/16 (2013.01); G06N 3/048 (2023.01); G06N 3/06 (2013.01); G06N 3/08 (2013.01); G06N 3/084 (2013.01); G06N 5/04 (2013.01); G06N 5/046 (2013.01); G06N 20/00 (2019.01)] 20 Claims
OG exemplary drawing
 
1. For a neural network inference circuit, a method for executing a neural network that comprises a plurality of computation nodes, each of a set of the computation nodes comprising a dot product of input values and weight values, the method comprising:
to compute a particular computation node:
at each respective dot product core circuit of a plurality of dot product core circuits of the neural network inference circuit, computing a respective partial dot product using a respective set of input values and a respective set of weight values stored in a respective set of memories of the respective dot product core circuit; and
at a bus of the neural network inference circuit that comprises a plurality of aggregation circuits, combining the partial dot products computed by the plurality of dot product core circuits to compute the dot product for the particular computation node.
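The claimed computation flow can be illustrated in software: the node's weight and input values are partitioned across several dot product cores, each core computes a partial dot product over only its local slice, and a bus-level aggregation stage combines the partials into the node's full dot product. The following is a minimal Python sketch of that flow; the names (DotProductCore, aggregation_bus, compute_node) and the slicing scheme are illustrative assumptions, not details taken from the patent or the actual circuit.

from dataclasses import dataclass
from typing import List


@dataclass
class DotProductCore:
    """Models one dot product core circuit with its local memories."""
    weights: List[float]  # slice of the node's weight values stored in this core
    inputs: List[float]   # slice of the node's input values stored in this core

    def partial_dot_product(self) -> float:
        # Each core computes over only its own slice of the operands.
        return sum(w * x for w, x in zip(self.weights, self.inputs))


def aggregation_bus(partials: List[float]) -> float:
    """Models the bus's aggregation circuits: combine per-core partials."""
    # Real hardware would use a tree of adders; a plain sum is equivalent here.
    return sum(partials)


def compute_node(weights: List[float], inputs: List[float], num_cores: int) -> float:
    """Compute one node's dot product by fanning the work out across cores."""
    assert len(weights) == len(inputs)
    chunk = -(-len(weights) // num_cores)  # ceiling division to slice operands
    cores = [
        DotProductCore(weights[i:i + chunk], inputs[i:i + chunk])
        for i in range(0, len(weights), chunk)
    ]
    partials = [core.partial_dot_product() for core in cores]
    return aggregation_bus(partials)


if __name__ == "__main__":
    w = [0.5, -1.0, 2.0, 0.25]
    x = [1.0, 2.0, 3.0, 4.0]
    # The result matches a direct dot product regardless of the core count.
    assert compute_node(w, x, num_cores=2) == sum(wi * xi for wi, xi in zip(w, x))
    print(compute_node(w, x, num_cores=2))  # 5.5

As the final assertion shows, splitting the operands across cores and summing the partials is mathematically equivalent to computing the full dot product in one place, which is what lets the circuit distribute a single node's computation across multiple core circuits.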