US 12,298,932 B2
Load balancing system for the execution of applications on reconfigurable processors
Milad Sharif, Palo Alto, CA (US); Ravinder Kumar, Fremont, CA (US); Qi Zheng, Fremont, CA (US); Neal Sanghvi, Palo Alto, CA (US); Jiayu Bai, Palo Alto, CA (US); and Arnav Goel, San Jose, CA (US)
Assigned to SambaNova Systems, Inc., Palo Alto, CA (US)
Filed by SambaNova Systems, Inc., Palo Alto, CA (US)
Filed on May 22, 2023, as Appl. No. 18/200,311.
Claims priority of provisional application 63/345,775, filed on May 25, 2022.
Prior Publication US 2023/0388373 A1, Nov. 30, 2023
Int. Cl. G06F 15/78 (2006.01); G06F 9/28 (2006.01); G06F 9/38 (2018.01); G06F 15/173 (2006.01); G06F 15/80 (2006.01)
CPC G06F 15/7871 (2013.01) [G06F 15/17343 (2013.01); G06F 15/7867 (2013.01); G06F 15/7889 (2013.01); G06F 15/80 (2013.01); G06F 9/28 (2013.01); G06F 9/3885 (2013.01)] 18 Claims
18. A non-transitory computer-readable storage medium including instructions that, when executed by a processing unit, cause the processing unit to operate a server in a client-server configuration, the server being part of a data processing system for executing first and second applications that a client in the client-server configuration, coupled to the server, can offload for execution onto the data processing system, the data processing system further comprising a pool of reconfigurable data flow resources coupled to the server that comprises arrays of coarse-grained reconfigurable (CGR) units, that is partitionable into a predetermined number of partitions, wherein each partition of the predetermined number of partitions comprises at least one array of coarse-grained reconfigurable units, and that is configured to execute the first application in a first runtime context and the second application in a second runtime context, the instructions comprising:
establishing a session with the client;
receiving a first execution request for executing the first application from the client;
receiving a second execution request for executing the second application from the client;
in response to receiving the first execution request, starting a first execution of the first application in the first runtime context;
in response to receiving the second execution request, starting a second execution of the second application in the second runtime context; and
balancing a first load from the first execution with a second load from the second execution.
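
The sequence of steps recited in claim 18 can be illustrated with a minimal sketch. It is not the patented implementation. The names Partition, Server, establish_session, and execute are hypothetical, the integer load units are invented for illustration, and the least-loaded-partition policy is an assumed stand-in, since the claim does not say how the two loads are balanced. A runtime context is modeled here simply as the mapping from an application to the partition it runs on.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One partition of the pool (hypothetical model of one or more CGR arrays)."""
    index: int
    load: int = 0  # abstract load units, an assumption for illustration


@dataclass
class Server:
    partitions: list
    sessions: set = field(default_factory=set)
    contexts: dict = field(default_factory=dict)  # application name -> partition index

    def establish_session(self, client_id):
        """Record that a session exists for this client."""
        self.sessions.add(client_id)

    def execute(self, client_id, app, load):
        """Handle an execution request by starting the app in its own runtime context."""
        if client_id not in self.sessions:
            raise PermissionError("no session established for this client")
        # Assumed balancing policy: place the execution on the least-loaded
        # partition, breaking ties by the lowest partition index.
        target = min(self.partitions, key=lambda p: (p.load, p.index))
        target.load += load
        self.contexts[app] = target.index
        return target.index


server = Server(partitions=[Partition(0), Partition(1)])
server.establish_session("client-a")
first = server.execute("client-a", "first_app", load=3)
second = server.execute("client-a", "second_app", load=2)
```

Under this assumed policy the two applications land on different partitions, so neither partition carries both loads. A real system could instead balance by time-sharing a partition or by migrating work, which the sketch does not attempt.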