US 12,190,086 B1
Method and apparatus for ML graphs by a compiler
Ulf Hanebutte, Gig Harbor, WA (US); Chien-Chun Chou, Morgan Hill, CA (US); Senad Durakovic, Palo Alto, CA (US); and Pranav Jonnalagadda, San Jose, CA (US)
Assigned to Marvell Asia Pte Ltd, Singapore (SG)
Filed by Marvell Asia Pte Ltd, Singapore (SG)
Filed on May 18, 2022, as Appl. No. 17/747,813.
Application 17/747,813 is a continuation in part of application No. 17/390,143, filed on Jul. 30, 2021, granted, now Pat. No. 11,467,811.
Claims priority of provisional application 63/214,651, filed on Jun. 24, 2021.
Int. Cl. G06F 9/44 (2018.01); G06F 8/41 (2018.01); G06F 16/901 (2019.01); G06N 20/00 (2019.01)
CPC G06F 8/47 (2013.01) [G06F 16/9024 (2019.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A system, comprising:
a processor; and
a compiler executed by the processor, wherein the compiler is configured to
receive a machine learning (ML) model;
generate a graph associated with the ML model, wherein the graph is an internal representation of the ML model;
partition the graph into a first subgraph and a second subgraph, wherein the first subgraph is associated with an ML hardware, and wherein the second subgraph is associated with a processor different from the ML hardware, wherein the partition is based on at least one of:
a) whether an operation within a node of the graph is supported by the ML hardware, or
b) latency associated with a node within the first subgraph as opposed to the second subgraph, or
c) an amount of data movement if a node is included within the first subgraph as opposed to the second subgraph;
generate a set of low-level instructions associated with the first subgraph; and
identify one or more resources in the ML hardware to execute the set of low-level instructions associated with the first subgraph.
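
The partitioning step recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. The `Node` fields, the `SUPPORTED_OPS` set, the cost values, and the decision rule that combines criteria (a), (b), and (c) are all hypothetical assumptions chosen for illustration. The claim requires only at least one of the three criteria, whereas this sketch happens to apply all three.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One graph node; every field here is an illustrative assumption."""
    name: str
    op: str
    ml_latency: float     # assumed estimated latency on the ML hardware
    cpu_latency: float    # assumed estimated latency on the other processor
    transfer_cost: float  # assumed cost of moving the node's data to the ML hardware


# Hypothetical set of operations the ML hardware supports.
SUPPORTED_OPS = {"conv2d", "matmul", "relu"}


def partition(nodes, supported_ops=SUPPORTED_OPS):
    """Split nodes into (first_subgraph, second_subgraph).

    first_subgraph  -> nodes placed on the ML hardware
    second_subgraph -> nodes placed on the other processor
    """
    first, second = [], []
    for node in nodes:
        if node.op not in supported_ops:
            # Criterion (a): the operation is not supported by the ML hardware.
            second.append(node)
        elif node.ml_latency + node.transfer_cost <= node.cpu_latency:
            # Criteria (b) and (c): latency plus data movement favors the ML hardware.
            first.append(node)
        else:
            second.append(node)
    return first, second


example_nodes = [
    Node("a", "conv2d", ml_latency=1.0, cpu_latency=5.0, transfer_cost=0.5),
    Node("b", "topk", ml_latency=1.0, cpu_latency=2.0, transfer_cost=0.1),
    Node("c", "relu", ml_latency=1.0, cpu_latency=1.2, transfer_cost=0.5),
]
first_subgraph, second_subgraph = partition(example_nodes)
```

In this example, node `a` lands in the first subgraph, node `b` falls to the second because its operation is unsupported, and node `c` falls to the second because data movement outweighs its latency advantage. A real compiler would also have to account for the edges between nodes, which this per-node sketch ignores.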