US 11,922,297 B2
Edge AI accelerator service
Tiejun Chen, Beijing (CN); Hong Yue, Beijing (CN); Yinghua Chen, Beijing (CN); Yuxin Kou, Beijing (CN); and Shreekanta Das, San Jose, CA (US)
Assigned to VMware, Inc., Palo Alto, CA (US)
Filed by VMware LLC, Palo Alto, CA (US)
Filed on Apr. 1, 2020, as Appl. No. 16/837,385.
Prior Publication US 2021/0312271 A1, Oct. 7, 2021
Int. Cl. G06N 3/065 (2023.01); G06F 9/38 (2018.01); G06F 13/38 (2006.01); H04L 43/10 (2022.01); H04L 45/02 (2022.01)
CPC G06N 3/065 (2023.01) [G06F 9/3877 (2013.01); G06F 13/382 (2013.01); H04L 43/10 (2013.01); H04L 45/026 (2013.01); G06F 2213/0026 (2013.01); G06F 2213/0042 (2013.01)]
20 Claims
 
1. A system, comprising:
at least one computing device comprising at least one processor and a data store; and
the data store comprising executable instructions, wherein the instructions, when executed by the at least one processor, cause the at least one computing device to at least:
transmit, from an artificial intelligence (AI) client to an AI broker service, a plurality of periodically-transmitted AI accelerator heartbeat messages that enable tracking availability statuses for at least one AI accelerator connected to an edge device executing the AI client, wherein an AI accelerator heartbeat message comprises: an AI accelerator device identifier of the AI accelerator, a unique device identifier of the edge device to which the AI accelerator is connected, a hardware address of the AI accelerator, an address of the edge device, and an AI technique type used by the AI accelerator, wherein the AI accelerator heartbeat message registers the AI accelerator with the AI broker service, wherein the AI accelerator is locally connected to a bus of the edge device;
receive, by the AI client from an AI agent executed using a networked computing device, an AI processing request comprising the AI accelerator device identifier of the AI accelerator and the AI technique type, wherein the AI agent selects the AI accelerator for an AI workload based at least in part on the AI accelerator being associated with a shortest physical distance among a plurality of available AI accelerators identified in a list of AI accelerators received from the AI broker service;
enable a bus redirect of the AI accelerator that intercepts bus traffic for the AI accelerator and redirects the bus traffic from the edge device to the networked computing device over a network; and
perform, using the AI technique type specified in the AI processing request, the AI workload on the AI accelerator connected to the edge device, the AI accelerator being controlled by the networked computing device using the bus redirect.
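
The heartbeat registration and accelerator selection recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The heartbeat fields follow the claim language, but every name here (Heartbeat, Broker, select_nearest), the 15-second timeout, the example technique string, and the use of 2D coordinates as a stand-in for "physical distance" are assumptions made for illustration only. The claim does not say how distance is measured or when a heartbeat expires, and the bus redirect step is not modeled.

```python
import math
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Heartbeat:
    """One heartbeat message; the fields mirror those listed in claim 1."""
    accelerator_id: str    # AI accelerator device identifier
    edge_device_id: str    # unique device identifier of the edge device
    hardware_address: str  # hardware address of the accelerator
    edge_address: str      # address of the edge device
    technique: str         # AI technique type used by the accelerator


@dataclass
class Broker:
    """Hypothetical broker: the latest heartbeat per accelerator registers it."""
    timeout_s: float = 15.0  # assumed expiry window, not taken from the claim
    _seen: dict = field(default_factory=dict)

    def receive(self, hb, now=None):
        # Record (or refresh) the accelerator with the time it was last seen.
        stamp = time.monotonic() if now is None else now
        self._seen[hb.accelerator_id] = (hb, stamp)

    def available(self, technique, now=None):
        # An accelerator counts as available if its last heartbeat is recent
        # enough and it advertises the requested technique type.
        now = time.monotonic() if now is None else now
        return [
            hb for hb, stamp in self._seen.values()
            if now - stamp <= self.timeout_s and hb.technique == technique
        ]


def select_nearest(candidates, positions, agent_position):
    """Pick the candidate whose edge device is closest to the agent.

    `positions` maps edge device identifiers to (x, y) coordinates; this
    lookup is an assumption standing in for whatever distance source a
    real deployment would use. Returns None if there are no candidates.
    """
    return min(
        candidates,
        key=lambda hb: math.dist(positions[hb.edge_device_id], agent_position),
        default=None,
    )
```

In this sketch, an AI client would call Broker.receive once per heartbeat period, and an AI agent would pass the list returned by Broker.available to select_nearest before sending its AI processing request to the chosen edge device.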