| CPC G06F 7/5443 (2013.01) [G06F 1/3237 (2013.01)] | 20 Claims |

|
1. An integrated circuit device comprising:
an array of processing elements arranged in rows and columns, wherein each processing element includes:
a weight register configured to store a floating-point weight value that supports a plurality of data types;
a first weight clock-gate circuit configured to clock-gate a first portion of the weight register independently from rest of the weight register based on a clock enable signal, wherein the first portion of the weight register is configured to store a first group of weight data bits, wherein the clock enable signal is generated by combining results of comparing each bit stored in the first portion of the weight register with a corresponding input bit of an input to the first portion of the weight register, and wherein the first group of weight data bits is unused for a first data type, and includes both used and unused bits for a second data type;
a feature map (FMAP) register configured to store a floating-point FMAP value that supports the plurality of data types;
a first FMAP clock-gate circuit configured to clock-gate a first portion of the FMAP register independently from rest of the FMAP register, wherein the first portion of the FMAP register is configured to store a first group of FMAP data bits that are unused for a second data type of the plurality of data types;
a multiplier configured to multiply the floating-point FMAP value with the floating-point weight value to generate a multiplication result; and
an adder configured to add the multiplication result to a partial sum input to generate a partial sum output.
|
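The clock enable generation recited for the first weight clock-gate circuit can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the claimed circuit. The claim does not specify how the per-bit comparisons are performed or combined, so the XOR comparison and OR reduction here are assumptions. The 32-bit register width, the choice of the low 16 bits as the "first portion", and the names `clock_enable`, `GatedRegister`, and `gated_clock_pulses` are likewise hypothetical.

```python
def clock_enable(stored_bits: int, input_bits: int, width: int) -> bool:
    """Return True when at least one bit in the portion would change."""
    mask = (1 << width) - 1
    # Assumption: XOR compares each stored bit with its corresponding input bit.
    diff = (stored_bits ^ input_bits) & mask
    # Assumption: the per-bit results are combined by an OR reduction.
    return diff != 0


class GatedRegister:
    """Behavioral model of a register whose low portion is clock-gated
    independently from the rest of the register (hypothetical layout)."""

    def __init__(self, total_width: int, gated_width: int):
        self.total_width = total_width
        self.gated_width = gated_width
        self.value = 0
        # Counts clock pulses that actually reach the gated portion.
        self.gated_clock_pulses = 0

    def load(self, new_value: int) -> None:
        low_mask = (1 << self.gated_width) - 1
        full_mask = (1 << self.total_width) - 1
        new_value &= full_mask

        old_low = self.value & low_mask
        new_low = new_value & low_mask
        # The rest of the register is clocked on every load in this model.
        high = new_value & ~low_mask

        if clock_enable(old_low, new_low, self.gated_width):
            # At least one bit differs, so the gated portion receives a clock.
            self.gated_clock_pulses += 1
            low = new_low
        else:
            # All bits match, so the clock is gated and the portion holds.
            low = old_low

        self.value = high | low


# Usage: a 32-bit register whose low 16 bits form the gated portion.
reg = GatedRegister(total_width=32, gated_width=16)
reg.load(0x3F800000)  # low 16 bits stay zero: portion not clocked
reg.load(0x40000000)  # low 16 bits stay zero: portion not clocked
reg.load(0x40490FDB)  # low 16 bits change: portion clocked once
```

In this model, a stream of values whose low bits never change (for example, a data type that leaves those bits unused) never toggles the clock of the gated portion, while the rest of the register updates normally.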