US 12,443,408 B2
Processing pipeline with zero loop overhead
Kameran Azadet, San Ramon, CA (US); Jeroen Leijten, Hulsel (NL); and Joseph Williams, Holmdel, NJ (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Apr. 26, 2024, as Appl. No. 18/647,891.
Application 18/647,891 is a continuation of application No. 17/131,970, filed on Dec. 23, 2020, granted, now 11,989,554.
Prior Publication US 2024/0345839 A1, Oct. 17, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 9/30 (2018.01); G06F 9/38 (2018.01); G06F 9/46 (2006.01); G06F 9/54 (2006.01)

CPC G06F 9/30036 (2013.01) [G06F 9/30079 (2013.01); G06F 9/3877 (2013.01); G06F 9/463 (2013.01); G06F 9/546 (2013.01)]

39 Claims

1. A processing pipeline, comprising:

a first processor associated with a first memory; and

a second processor associated with a second memory, the second processor being operably coupled to the first processor via a data interface and being downstream from the first processor in the processing pipeline,

wherein the first processor and the second processor are capable of processing data blocks in accordance with a plurality of data processing loop iterations associated with a commonly-executed function,

wherein the first processor is capable of processing a data block in accordance with a first one of the plurality of data processing loop iterations to provide a processed data block, and storing the processed data block in the first memory,

wherein the second processor receives the processed data block based upon the processed data block being stored in the first memory,

wherein the second processor is capable of processing the processed data block in accordance with a second one of the plurality of data processing loop iterations and storing a result of processing the processed data block in the second memory,

wherein the first processor and the second processor are each capable of processing the data block and the processed data block, respectively, as part of a continuous execution of the first and the second ones of the plurality of data processing loop iterations associated with the commonly-executed function,

wherein the processing pipeline is part of a vector processing pipeline architecture, and

wherein the commonly-executed function is implemented to perform a calculation of filter coefficients.
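
The arrangement recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented implementation: the names `Stage`, `update_step`, `run_pipeline`, `TARGET`, and `MU` are hypothetical, and the coefficient update rule (a fixed-step relaxation toward a target response) is a stand-in for whatever filter-coefficient calculation the commonly-executed function actually performs. Each stage applies the same function as one loop iteration, stores its result in its own memory, and the downstream stage picks up its input from the upstream stage's memory.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Block = List[float]

# Assumed values for illustration only; the claim does not specify them.
TARGET: Block = [1.0, 0.5, 0.25]  # hypothetical desired coefficient values
MU = 0.5                          # hypothetical step size


def update_step(coeffs: Block) -> Block:
    """One loop iteration of a stand-in coefficient calculation."""
    return [c + MU * (t - c) for c, t in zip(coeffs, TARGET)]


@dataclass
class Stage:
    """Models one processor together with its associated memory."""
    func: Callable[[Block], Block]
    memory: List[Block] = field(default_factory=list)

    def process(self, block: Block) -> None:
        # Process the block and store the result in this stage's memory.
        self.memory.append(self.func(block))


def run_pipeline(blocks: List[Block]):
    """Run two stages so that stage 2 works on block t-1 while stage 1
    works on block t, each executing one iteration of the same function."""
    first, second = Stage(update_step), Stage(update_step)
    for tick in range(len(blocks) + 1):
        if tick >= 1:
            # Downstream stage reads what the upstream stage stored earlier.
            second.process(first.memory[tick - 1])
        if tick < len(blocks):
            first.process(blocks[tick])
    return first.memory, second.memory


if __name__ == "__main__":
    mem1, mem2 = run_pipeline([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
    print(mem1)
    print(mem2)
```

In this model the loop over iterations is unrolled across the two stages, so neither stage evaluates a loop counter or branch between iterations. That is one plausible reading of "continuous execution", offered here only as an illustration.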