US 11,687,759 B2
Neural network accelerator
Ivo Leonardus Coenen, Coffrane (CH); and Dennis Wayne Mitchler, Marin-Epagnier (CH)
Assigned to SEMICONDUCTOR COMPONENTS INDUSTRIES, LLC, Scottsdale, AZ (US)
Filed by SEMICONDUCTOR COMPONENTS INDUSTRIES, LLC, Phoenix, AZ (US)
Filed on Apr. 16, 2019, as Appl. No. 16/385,192.
Claims priority of provisional application 62/665,318, filed on May 1, 2018.
Prior Publication US 2019/0340493 A1, Nov. 7, 2019
Int. Cl. G06N 3/04 (2023.01); G06N 3/08 (2023.01)
CPC G06N 3/04 (2013.01) [G06N 3/08 (2013.01)]
10 Claims
 
1. A method for implementing a neural network, the method comprising:
    receiving input data;
    fetching, from a memory, weights of the neural network;
    performing a first portion of processing for the neural network, the first portion implemented in hardware by an accelerator including a plurality of parallel multiply and accumulate (MAC) units configured to perform a plurality of MAC operations to generate a first neuron value at a first accumulator and a second neuron value at a second accumulator, wherein the first portion includes:
        receiving a first subset of the input data from a circular buffer at inputs of the plurality of parallel MAC units;
        performing the plurality of parallel MAC operations using a first set of weights, while holding the inputs of the plurality of parallel MAC units stable at the first subset of the input data, to generate a first portion of the first neuron value at the first accumulator;
        performing the plurality of parallel MAC operations using a second set of weights, while holding the inputs of the plurality of parallel MAC units stable at the first subset of the input data, to generate a second portion of the second neuron value at the second accumulator;
        repeating receiving subsets of input data from the circular buffer and performing the plurality of parallel MAC operations while holding the inputs of the plurality of parallel MAC units stable to accumulate the first neuron value at the first accumulator and the second neuron value at the second accumulator for all input data;
        selecting the first neuron value before a bias and an activation function is applied, using a multiplexer coupled to the plurality of parallel MAC units; and
        writing the first neuron value to the memory; and
    performing a second portion of processing for the neural network, the second portion implemented in software by a processor, the accelerator and the processor using a bus to communicate and to share access to the memory, wherein the second portion includes:
        reading the first neuron value from the memory; and
        applying the bias and the activation function to the first neuron value.
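
The data flow recited in claim 1 can be illustrated with a minimal software sketch. It is a behavioral model only, not the patented hardware, and it rests on several assumptions that do not appear in the claim: four parallel MAC units, integer data, a ReLU activation, a Python dictionary standing in for the shared memory, and the hypothetical names accelerator_pass, mux_select, software_finish and run_demo. The point it shows is that each subset of inputs is fetched once from the circular buffer and held stable while the weight sets for the different neurons are applied in turn, each neuron accumulating into its own accumulator.

```python
from collections import deque


def accelerator_pass(circular_buffer, weight_sets, num_macs=4):
    """Model of the hardware portion: returns one raw accumulator per neuron.

    weight_sets[n] holds the weights of neuron n, one per input sample.
    num_macs is an assumed count of parallel MAC units.
    """
    inputs = list(circular_buffer)
    accumulators = [0] * len(weight_sets)
    for start in range(0, len(inputs), num_macs):
        # One subset of inputs is read and then held stable at the MAC inputs.
        held_inputs = inputs[start:start + num_macs]
        for neuron, weights in enumerate(weight_sets):
            # Only the weights change between neurons; the inputs do not.
            weight_subset = weights[start:start + num_macs]
            accumulators[neuron] += sum(
                x * w for x, w in zip(held_inputs, weight_subset)
            )
    return accumulators


def mux_select(accumulators, select):
    """Model of the multiplexer: picks a raw value, before bias/activation."""
    return accumulators[select]


def relu(value):
    """Assumed activation function; the claim does not name one."""
    return max(0, value)


def software_finish(memory, key, bias, activation):
    """Model of the software portion: read raw value, add bias, activate."""
    return activation(memory[key] + bias)


def run_demo():
    # A deque with maxlen behaves as a circular buffer: sample 0 is overwritten.
    buffer = deque(range(0, 9), maxlen=8)   # holds 1..8
    weight_sets = [[1] * 8, [1, -1] * 4]    # illustrative weights, two neurons
    memory = {}                             # stand-in for the shared memory

    accumulators = accelerator_pass(buffer, weight_sets, num_macs=4)
    memory["neuron0_raw"] = mux_select(accumulators, 0)
    activated = software_finish(memory, "neuron0_raw", bias=-6, activation=relu)
    return memory["neuron0_raw"], activated
```

In this sketch the inner loop over neurons runs inside the loop over input subsets, which mirrors the claimed ordering: the inputs change only once per subset, while the weights change for every neuron. Keeping the bias and activation in software_finish reflects the claimed split between the accelerator and the processor.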