US 12,147,890 B2
Neural network computing device and cache management method thereof
Deming Gu, Shanghai (CN); Wei Zhang, Shanghai (CN); Yuanfeng Wang, Shanghai (CN); and Guixiang He, Shanghai (CN)
Assigned to GlenFly Technology Co., Ltd., Shanghai (CN)
Filed by GlenFly Technology Co., Ltd., Shanghai (CN)
Filed on Aug. 11, 2020, as Appl. No. 16/990,953.
Claims priority of application No. 202010564773.8 (CN), filed on Jun. 19, 2020.
Prior Publication US 2021/0397934 A1, Dec. 23, 2021
Int. Cl. G06N 3/063 (2023.01); G06F 12/0891 (2016.01); G06N 3/08 (2023.01)
CPC G06N 3/063 (2013.01) [G06F 12/0891 (2013.01); G06N 3/08 (2013.01); G06F 2212/60 (2013.01)] 7 Claims
OG exemplary drawing
 
1. A neural network computing device, comprising:
a computing circuit, configured to perform a neural network calculation, wherein the neural network calculation comprises a first layer calculation and a second layer calculation;
a main memory; and
a cache circuit, coupled to the computing circuit and the main memory, wherein
after the computing circuit completes the first layer calculation and generates a first layer calculation result required for the second layer calculation, the cache circuit retains the first layer calculation result in the cache circuit until the second layer calculation is completed; and
in response to the second layer calculation being completed, the first layer calculation result used for the second layer calculation is invalidated instantly by the cache circuit to prevent the first layer calculation result from being written into the main memory,
wherein the cache circuit is configured to determine whether a calculation result of neural network data is of a destination lock type, wherein in response to the calculation result of the neural network data being of the destination lock type, the cache circuit ensures that the calculation result of the neural network data is not removed from the cache circuit,
wherein the cache circuit comprises:
a cache memory; and
a cache control circuit, coupled to the computing circuit and the cache memory, wherein
after the computing circuit completes the first layer calculation and generates the first layer calculation result required for the second layer calculation, the cache control circuit retains the first layer calculation result in the cache memory until the second layer calculation is completed;
in response to the computing circuit performing the second layer calculation after the first layer calculation is completed, the cache control circuit transmits the first layer calculation result from the cache memory to the computing circuit for the second layer calculation to use; and
after the second layer calculation is completed, the cache control circuit invalidates the first layer calculation result retained in the cache memory to prevent the first layer calculation result from being written into the main memory.
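The cache policy recited in claim 1 can be sketched in software terms: an intermediate layer result marked as destination-lock is pinned in the cache until the consuming layer finishes, then invalidated in place so it is never written back to main memory. The following is a minimal illustrative model; all class, method, and tag names (`LayerResultCache`, `store`, `invalidate`, `"layer1_out"`) are hypothetical and do not appear in the patent.

```python
class LayerResultCache:
    """Toy model of the claimed cache circuit (names are illustrative)."""

    def __init__(self):
        self.lines = {}          # tag -> (data, locked): the cache memory
        self.main_memory = {}    # writeback target for ordinary eviction

    def store(self, tag, data, destination_lock=False):
        # A destination-lock line is retained and may not be evicted.
        self.lines[tag] = (data, destination_lock)

    def load(self, tag):
        # The consuming layer reads the retained result from the cache.
        data, _ = self.lines[tag]
        return data

    def evict(self, tag):
        # Ordinary eviction writes the line back to main memory;
        # a destination-lock line refuses eviction while pinned.
        data, locked = self.lines[tag]
        if locked:
            return False
        self.main_memory[tag] = data
        del self.lines[tag]
        return True

    def invalidate(self, tag):
        # Invalidation drops the line WITHOUT writeback, so the
        # transient layer result never reaches main memory.
        del self.lines[tag]


def run_two_layers(cache, layer1, layer2, x):
    y1 = layer1(x)
    cache.store("layer1_out", y1, destination_lock=True)  # retain result
    y2 = layer2(cache.load("layer1_out"))                 # second layer uses it
    cache.invalidate("layer1_out")                        # done: no writeback
    return y2
```

For example, with `layer1 = lambda x: x + 1` and `layer2 = lambda y: y * 2`, calling `run_two_layers(cache, layer1, layer2, 3)` returns 8 while `cache.main_memory` stays empty, mirroring the claim's goal of keeping inter-layer traffic off the main memory bus.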