US 11,941,437 B2
Graph partitioning to exploit batch-level parallelism
Mustafa Cavus, Hillsboro, OR (US); and Yamini Nimmagadda, Portland, OR (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Jun. 25, 2021, as Appl. No. 17/358,751.
Prior Publication US 2021/0318908 A1, Oct. 14, 2021
Int. Cl. G06F 9/50 (2006.01); G06F 9/48 (2006.01); G06F 16/901 (2019.01); G06N 3/04 (2023.01); G06N 3/08 (2023.01)
CPC G06F 9/4881 (2013.01) [G06F 9/5038 (2013.01); G06F 16/9024 (2019.01); G06N 3/04 (2013.01); G06N 3/08 (2013.01)]
25 Claims
 
1. A computing system, comprising:
a processor; and
a memory coupled to the processor to store instructions which, when executed by the processor, cause the processor to:
partition a graph into a plurality of clusters comprising batched clusters that support batched data and non-batched clusters that fail to support batched data;
establish an execution queue for execution of the plurality of clusters based on cluster dependencies; and
schedule inference execution of the plurality of clusters in the execution queue based on batch size.
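
The three steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `Cluster` class, the function names, and the example cluster names are hypothetical, the graph is assumed to be already partitioned, and the execution queue is built with an ordinary topological sort (Kahn's algorithm). The scheduling rule is also an assumption, since the claim only says scheduling is "based on batch size": here a batched cluster is launched once on the whole batch, and a non-batched cluster is launched once per sample.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for one cluster of the partitioned graph."""
    name: str
    batched: bool                             # True if the cluster supports batched data
    deps: list = field(default_factory=list)  # names of clusters that must run first


def build_execution_queue(clusters):
    """Order clusters so that each one comes after its dependencies (Kahn's algorithm)."""
    by_name = {c.name: c for c in clusters}
    pending = {c.name: len(c.deps) for c in clusters}  # unmet dependency counts
    dependents = {c.name: [] for c in clusters}
    for c in clusters:
        for dep in c.deps:
            dependents[dep].append(c.name)

    ready = deque(name for name, count in pending.items() if count == 0)
    queue = []
    while ready:
        name = ready.popleft()
        queue.append(by_name[name])
        for nxt in dependents[name]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    if len(queue) != len(clusters):
        raise ValueError("cluster dependencies contain a cycle")
    return queue


def schedule(queue, batch_size):
    """Return (cluster name, number of launches, samples per launch) for each cluster.

    Assumed policy: batched clusters take the full batch in one launch;
    non-batched clusters are launched once per sample.
    """
    plan = []
    for cluster in queue:
        if cluster.batched:
            plan.append((cluster.name, 1, batch_size))
        else:
            plan.append((cluster.name, batch_size, 1))
    return plan


# Hypothetical three-cluster graph: batched -> non-batched -> batched.
clusters = [
    Cluster("conv_block", batched=True),
    Cluster("nms", batched=False, deps=["conv_block"]),
    Cluster("head", batched=True, deps=["nms"]),
]
plan = schedule(build_execution_queue(clusters), batch_size=4)
print(plan)  # [('conv_block', 1, 4), ('nms', 4, 1), ('head', 1, 4)]
```

In this sketch the per-sample launches of a non-batched cluster are independent of one another, so a real runtime could dispatch them concurrently; the sketch only records how many launches each cluster needs.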