CPC G06F 17/16 (2013.01) [G06F 7/483 (2013.01); G06F 7/4876 (2013.01); G06F 9/30014 (2013.01); G06N 3/02 (2013.01); G06N 3/048 (2023.01); G06N 3/063 (2013.01)] | 21 Claims
1. A method of performing a matrix multiplication using a hardware circuit, the method comprising:
obtaining, by a matrix computation unit of the hardware circuit, an activation input value and a weight input value, the activation input value and the weight input value each having a first floating point format, wherein the hardware circuit is configured to perform computations for a neural network having a plurality of layers, wherein the activation input value and the weight input value are associated with a layer of the plurality of layers;
wherein the first floating point format is a 16-bit format, comprising: one available bit for a sign, eight available bits for an exponent, and seven available bits for a significand to represent a floating point in the first floating point format;
multiplying, by a multiplication circuitry of the matrix computation unit, the activation input value and the weight input value to generate a product value, the product value having a second floating point format, wherein the second floating point format is different from and has a higher precision than the first floating point format;
obtaining, by the matrix computation unit, a partial sum value in a third floating point format, wherein the third floating point format is different from the first floating point format and the second floating point format and has a higher precision than the first floating point format; and
combining, by a summation circuitry of the hardware circuit, at least the partial sum value and the product value to generate an updated partial sum value that has the third floating point format.
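The arithmetic recited in claim 1 can be illustrated in software. The following is a minimal sketch under stated assumptions, not the claimed hardware: the claim fixes only the first format (1 sign bit, 8 exponent bits, 7 significand bits), so the sketch assumes the product is held in IEEE binary64 (a Python float) as the second format and the partial sum in IEEE binary32 as the third format. All function names are hypothetical, and NaN and overflow handling are omitted.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Round x to the 16-bit first format and return its bit pattern.

    The 16-bit pattern is the upper half of the binary32 pattern, rounded
    to nearest-even on the 16 discarded bits. NaN inputs are not handled.
    """
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    rounding_bias = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit pattern to a Python float (exact, no rounding)."""
    return struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))[0]


def bf16_fields(bits16: int) -> tuple:
    """Split a 16-bit pattern into (sign, exponent, significand) fields."""
    sign = (bits16 >> 15) & 0x1        # 1 bit
    exponent = (bits16 >> 7) & 0xFF    # 8 bits, biased by 127
    significand = bits16 & 0x7F        # 7 explicit bits
    return sign, exponent, significand


def to_f32(x: float) -> float:
    """Round a Python float to binary32 (the assumed third format)."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def multiply_accumulate(activation_bits: int, weight_bits: int,
                        partial_sum: float) -> float:
    """Multiply two first-format inputs and add the product to a partial sum.

    The product is kept in binary64 (assumed second format), where it is
    exact: two 8-bit significands (implicit bit included) multiply to at
    most 16 bits. The updated partial sum is rounded to binary32.
    """
    activation = bf16_bits_to_float(activation_bits)
    weight = bf16_bits_to_float(weight_bits)
    product = activation * weight
    return to_f32(to_f32(partial_sum) + product)


if __name__ == "__main__":
    a = float_to_bf16_bits(1.5)
    w = float_to_bf16_bits(-2.5)
    print(hex(w), bf16_fields(w))
    print(multiply_accumulate(a, w, 10.0))
```

Keeping the product and the partial sum wider than the inputs matters because a product of two 16-bit inputs generally needs more significand bits than the 16-bit format provides. For example, squaring the input 1 + 2^-7 gives 1 + 2^-6 + 2^-14, which the wider accumulator in this sketch retains but which would round to 1 + 2^-6 if forced back into the first format.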