US 12,456,035 B2
Neural networks graph partitioning system and method for the same
Wei Zuo, Cupertino, CA (US); Qiang Zhang, Campbell, CA (US); Chenhao Fang, Sunnyvale, CA (US); and Zheng Qi, Cupertino, CA (US)
Assigned to Black Sesame Technologies Inc., San Jose, CA (US)
Filed by Black Sesame Technologies Inc., San Jose, CA (US)
Filed on Dec. 8, 2021, as Appl. No. 17/545,799.
Prior Publication US 2023/0177311 A1, Jun. 8, 2023
Int. Cl. G06N 3/045 (2023.01); G06F 17/18 (2006.01)
CPC G06N 3/045 (2023.01) [G06F 17/18 (2013.01)]
7 Claims
 
1. A method for partitioning a neural network-based graph into a plurality of sub-graphs using a cost function based parameter search, the method comprising:
listing a white list and a black list with a partition boundary with respect to the neural network-based graph;
applying hard cuts of graph boundaries for generating multiple nodes in the black list and the white list, wherein the nodes in the black list represent a partition boundary and the nodes in the white list cannot be considered as a graph partition boundary;
grouping the multiple nodes in the white list;
partitioning between the plurality of nodes in the neural network-based graph based on a plurality of cost functions;
generating multiple partition paths sorted by the plurality of cost functions between the plurality of nodes, wherein a cost function algorithm generates the cost between the nodes starting from the first node and looping through all the remaining N-1 nodes;
assigning at least one assignment workflow to the sub-graph upon selection of scheduling parameters for partitioning the neural network-based graph into the plurality of sub-graphs, wherein the assigning at least one assignment workflow is applied to accelerators and chips including a DSP chip, a neural networks chip, or an AI chip; and
reducing power consumption of the accelerators and the chips due to the plurality of sub-graphs.
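
The partitioning steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything concrete in it is an assumption made for illustration: the graph is simplified to a topologically ordered list of node names, a hard cut is placed immediately after each black-list node, no cut is allowed immediately after a white-list node, and the cost function (largest sub-graph load plus a fixed penalty per cut) is hypothetical. The helper names `hard_cut_segments`, `candidate_cuts`, `path_cost`, and `ranked_partition_paths` do not appear in the source.

```python
from itertools import combinations


def hard_cut_segments(nodes, black_list):
    """Split an ordered node list right after every black-list node (hard cut)."""
    segments, current = [], []
    for node in nodes:
        current.append(node)
        if node in black_list:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


def candidate_cuts(segment, white_list):
    """Loop from the first node over the remaining nodes and collect each
    index i where a cut after segment[i] is allowed (node not in white list)."""
    return [i for i in range(len(segment) - 1) if segment[i] not in white_list]


def path_cost(segment, cuts, node_cost, cut_penalty):
    """Hypothetical cost: heaviest sub-graph load plus a penalty per cut."""
    bounds = [0] + [c + 1 for c in cuts] + [len(segment)]
    loads = [sum(node_cost[n] for n in segment[a:b])
             for a, b in zip(bounds, bounds[1:])]
    return max(loads) + cut_penalty * len(cuts)


def ranked_partition_paths(segment, white_list, node_cost,
                           cut_penalty=1.0, max_cuts=2):
    """Enumerate partition paths (tuples of cut indices) sorted by cost."""
    allowed = candidate_cuts(segment, white_list)
    paths = []
    for k in range(max_cuts + 1):
        for cuts in combinations(allowed, k):
            paths.append((path_cost(segment, cuts, node_cost, cut_penalty), cuts))
    return sorted(paths)


# Illustrative data only; node names and costs are made up.
nodes = ["conv1", "relu1", "conv2", "relu2", "pool", "fc"]
black_list = {"pool"}            # forced partition boundary
white_list = {"conv1", "conv2"}  # never a partition boundary
node_cost = {"conv1": 4, "relu1": 1, "conv2": 4, "relu2": 1, "pool": 1, "fc": 3}

segments = hard_cut_segments(nodes, black_list)
ranked = ranked_partition_paths(segments[0], white_list, node_cost)
best_cost, best_cuts = ranked[0]
```

In this sketch the lowest-cost path would be the one selected before each resulting sub-graph is assigned to an accelerator or chip; the assignment and scheduling steps of the claim are not modeled here.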