US 12,379,933 B2
Ultra pipelined accelerator for machine learning inference
Titash Rakshit, Austin, TX (US); Malik Aqeel Anwar, Atlanta, GA (US); and Ryan Hatcher, Austin, TX (US)
Assigned to Samsung Electronics Co., Ltd., Yongin-si (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed on Apr. 2, 2020, as Appl. No. 16/838,971.
Claims priority of provisional application 62/934,355, filed on Nov. 12, 2019.
Claims priority of provisional application 62/927,544, filed on Oct. 29, 2019.
Claims priority of provisional application 62/926,292, filed on Oct. 25, 2019.
Prior Publication US 2021/0124588 A1, Apr. 29, 2021
Int. Cl. G06F 9/38 (2018.01); G06F 17/16 (2006.01); G06N 3/02 (2006.01); G06T 1/20 (2006.01)
CPC G06F 9/3867 (2013.01) [G06F 17/16 (2013.01); G06N 3/02 (2013.01); G06T 1/20 (2013.01)]
19 Claims
 
1. A method of pipelining inference of a neural network comprising a plurality of layers comprising an i-th layer (i being an integer greater than zero), an (i+1)-th layer, and an (i+2)-th layer, the method comprising:
processing, by a controller, a first set of i-th values of the i-th layer via an i-th composite filter to generate (i+1)-th values for the (i+1)-th layer;
determining, by the controller, a quantity of the (i+1)-th values as being sufficient for processing by an (i+1)-th filter associated with the (i+1)-th layer, based on generating the (i+1)-th values to correspond to each unit being operated on by the (i+1)-th filter, the i-th composite filter comprising a first i-th filter and a second i-th filter, and generating more than one (i+1)-th value per clock cycle based on the second i-th filter being offset from the first i-th filter on the i-th layer; and
in response to the determining, processing, by the controller, the (i+1)-th values via the (i+1)-th filter to generate an output value for the (i+2)-th layer while concurrently processing a second set of i-th values of the i-th layer via the i-th composite filter.
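
The steps of claim 1 can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the patented implementation: the layers are one-dimensional, every filter uses stride 1, one loop iteration stands in for one clock cycle, and the loop itself plays the role of the controller. All names (`dot`, `layer_by_layer`, `pipelined`, `w1`, `w2`) and the sample data are hypothetical. The i-th composite filter is modeled as two copies of the same kernel offset by one position, so it yields up to two (i+1)-th values per cycle; the (i+1)-th filter fires as soon as enough (i+1)-th values exist to fill its window, in the same cycle in which the composite filter processes the next set of i-th values.

```python
def dot(weights, window):
    """Multiply-accumulate of a filter kernel over one window of values."""
    return sum(w * v for w, v in zip(weights, window))


def layer_by_layer(x, w1, w2):
    """Reference: compute the whole (i+1)-th layer first, then the (i+2)-th."""
    k1, k2 = len(w1), len(w2)
    mid = [dot(w1, x[p:p + k1]) for p in range(len(x) - k1 + 1)]
    return [dot(w2, mid[p:p + k2]) for p in range(len(mid) - k2 + 1)]


def pipelined(x, w1, w2):
    """Cycle-by-cycle sketch of the pipelined scheme (assumed 1-D, stride 1)."""
    k1, k2 = len(w1), len(w2)
    n_mid = len(x) - k1 + 1      # number of (i+1)-th values to produce
    n_out = n_mid - k2 + 1       # number of (i+2)-th values to produce
    mid, out, log = [], [], []
    pos = 0                      # position of the first i-th filter
    while len(out) < n_out:      # one iteration models one clock cycle
        # Determining step: are there enough (i+1)-th values to cover
        # every unit of the (i+1)-th filter's next window?
        ready = len(mid) >= len(out) + k2
        if ready:
            start = len(out)
            out.append(dot(w2, mid[start:start + k2]))
        # Same cycle: the composite filter (two kernels, the second offset
        # by one position) processes the next set of i-th values.
        produced = 0
        for offset in (0, 1):
            p = pos + offset
            if p < n_mid:
                mid.append(dot(w1, x[p:p + k1]))
                produced += 1
        pos += 2
        log.append((produced, ready))
    return out, log


# Hypothetical sample data for illustration only.
x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
w1 = [1, 0, -1]   # kernel shared by both i-th filters (assumption)
w2 = [1, 2]       # (i+1)-th filter kernel (assumption)
out, log = pipelined(x, w1, w2)
```

Each entry of `log` records how many (i+1)-th values the composite filter produced in a cycle and whether the (i+1)-th filter fired in that cycle. Entries with two values produced and the filter firing correspond to the concurrent processing recited in the final step of the claim, and `out` matches the result of `layer_by_layer` on the same inputs.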