| CPC H04L 41/5054 (2013.01) [G06N 3/04 (2013.01); H04L 41/5019 (2013.01); H04L 67/34 (2013.01); H04L 43/08 (2013.01); H04L 67/10 (2013.01)] | 24 Claims |

1. An apparatus to partition a neural network model, the apparatus comprising:
interface circuitry to communicate with an edge device; and
processor circuitry including one or more of:
at least one of a central processing unit, a graphics processing unit, or a digital signal processor, the at least one of the central processing unit, the graphics processing unit, or the digital signal processor having control circuitry to control data movement within the processor circuitry, arithmetic and logic circuitry to perform one or more first operations corresponding to instructions, and one or more registers to store a first result of the one or more first operations, the instructions in the apparatus;
a Field Programmable Gate Array (FPGA), the FPGA including first logic gate circuitry, a plurality of configurable interconnections, and storage circuitry, the first logic gate circuitry and the configurable interconnections to perform one or more second operations, the storage circuitry to store a second result of the one or more second operations; or
Application Specific Integrated Circuitry (ASIC) including second logic gate circuitry to perform one or more third operations;
the processor circuitry to perform at least one of the first operations, the second operations, or the third operations to instantiate:
power consumption estimation circuitry to estimate a computation energy consumption for executing the neural network model on a first edge node, the neural network model corresponding to a platform service with a service level agreement timeframe;
network bandwidth determination circuitry to determine a transmission time for sending an intermediate result from the first edge node to a second edge node or a third edge node;
the power consumption estimation circuitry to estimate a transmission energy consumption for sending the intermediate result of the neural network model to the second edge node or the third edge node; and
neural network partitioning circuitry to partition the neural network model into a first portion to be executed at the first edge node and a second portion to be executed at the second edge node or the third edge node based on at least one of the service level agreement timeframe for the platform service, the computation energy consumption, the transmission energy consumption, or the transmission time.
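Claim 1 recites the partitioning decision functionally rather than as an algorithm. As a purely illustrative aid, the following Python sketch shows one way the recited quantities (computation energy, transmission energy, transmission time, and the service level agreement timeframe) could drive a layer-level split between a first edge node and a second edge node. Everything here is a hypothetical assumption: the `Layer` record, the linear cost model, and all function and parameter names are illustrative, not taken from the patent.

```python
# Illustrative sketch only; all names and cost models below are hypothetical
# assumptions, not taken from the patent.
from dataclasses import dataclass

@dataclass
class Layer:
    compute_time_s: float    # execution time of this layer on the first edge node
    compute_energy_j: float  # computation energy of this layer on the first edge node
    output_bytes: int        # size of the intermediate result this layer produces

def partition(layers: list[Layer],
              sla_timeframe_s: float,           # service level agreement timeframe
              bandwidth_bps: float,             # measured link bandwidth to the second edge node
              tx_power_w: float,                # radio power while transmitting
              remote_time_per_layer_s: float):  # assumed per-layer time on the second edge node
    """Return an index k so that layers[:k] form the first portion (first edge
    node) and layers[k:] form the second portion (second edge node), or None
    if no split meets the SLA timeframe."""
    best_split, best_energy = None, float("inf")
    for k in range(1, len(layers)):
        # Computation energy and time of the first portion on the first edge node.
        local_time = sum(l.compute_time_s for l in layers[:k])
        local_energy = sum(l.compute_energy_j for l in layers[:k])
        # Transmission time and transmission energy for the intermediate result
        # produced at the split point.
        tx_time = layers[k - 1].output_bytes * 8 / bandwidth_bps
        tx_energy = tx_power_w * tx_time
        # The second portion executes on the second edge node; its energy is the
        # remote node's concern and is not counted against the first node here.
        remote_time = (len(layers) - k) * remote_time_per_layer_s
        total_time = local_time + tx_time + remote_time
        total_energy = local_energy + tx_energy
        # Keep the lowest-energy split whose end-to-end time meets the SLA.
        if total_time <= sla_timeframe_s and total_energy < best_energy:
            best_split, best_energy = k, total_energy
    return best_split
```

In this toy model a caller supplies per-layer profiles for the first edge node, the measured link bandwidth, and an assumed per-layer execution time on the second edge node; the sketch then returns the lowest-energy split index that keeps end-to-end latency within the SLA timeframe, or None when no split qualifies. The claim itself places no such restriction on how the four recited quantities are combined.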