CPC G06F 1/3234 (2013.01) [G06F 9/3001 (2013.01); G06F 9/30105 (2013.01); G06F 9/5016 (2013.01); G06N 3/08 (2013.01); G06F 2209/5011 (2013.01)] | 20 Claims |
1. A method for reducing power consumption in machine learning hardware accelerators, the method comprising:
retrieving input data from a source memory;
using a compute cache to perform one or more arithmetic operations on the input data to obtain a result; and
transferring the result to a hardware accelerator that performs a convolution operation and generates an output without generating, storing, accessing, or retrieving intermediate data, thereby reducing at least one of read operations or write operations, wherein the hardware accelerator is a separate circuit from the compute cache.
|
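The data flow recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the function names, the scale-and-bias arithmetic, the 1-D convolution (computed as cross-correlation, as is usual in machine learning), and the read/write counters are all hypothetical choices made for illustration. The sliding window stands in for accelerator-local registers and is assumed not to count as memory traffic. The baseline writes the pre-processed data to a buffer and reads it back; the fused path streams each value from the arithmetic stage straight into the convolution.

```python
from collections import deque


def conv1d_unfused(source, kernel, scale, bias):
    """Baseline: materialize the pre-processed data, then convolve it."""
    counters = {"reads": 0, "writes": 0}
    intermediate = []
    for x in source:
        counters["reads"] += 1           # read from source memory
        intermediate.append(x * scale + bias)
        counters["writes"] += 1          # write intermediate buffer
    k = len(kernel)
    out = []
    for i in range(len(intermediate) - k + 1):
        acc = 0
        for j in range(k):
            acc += intermediate[i + j] * kernel[j]
            counters["reads"] += 1       # read intermediate buffer
        out.append(acc)
        counters["writes"] += 1          # write output
    return out, counters


def conv1d_fused(source, kernel, scale, bias):
    """Fused path: arithmetic results stream directly into the convolution."""
    counters = {"reads": 0, "writes": 0}
    k = len(kernel)
    # Assumption: models accelerator-local registers, not a memory buffer.
    window = deque(maxlen=k)
    out = []
    for x in source:
        counters["reads"] += 1           # read from source memory
        window.append(x * scale + bias)  # hypothetical compute-cache arithmetic
        if len(window) == k:
            out.append(sum(w * c for w, c in zip(window, kernel)))
            counters["writes"] += 1      # write output
    return out, counters
```

Under these assumptions both paths produce the same output, while the fused path performs one read per input element and one write per output element, with no traffic to an intermediate buffer.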