| CPC G06N 3/063 (2013.01) [G06F 9/5083 (2013.01); G06F 9/545 (2013.01); G06N 5/04 (2013.01); G06F 9/5016 (2013.01); G06N 3/04 (2013.01); G06N 3/06 (2013.01)] | 20 Claims |

|
1. A method comprising performing on an electronic device comprising circuit engines configured to evaluate a particular type of machine learning model:
receiving, at a first system routine from a first client application, a provisioning request indicating that the first client application includes first code for evaluating the particular type of machine learning model, wherein the first system routine executes in user space of memory on the electronic device;
provisioning the first code for execution on one or more of the circuit engines;
receiving, at a second system routine, an inference request from the first client application for evaluating the particular type of machine learning model, the inference request including first input data upon which the particular type of machine learning model is evaluated, wherein the second system routine executes in kernel space of memory on the electronic device;
receiving, at the second system routine, information about a current status and a historical performance of the circuit engines;
assigning, by the second system routine, the inference request to one or more of the circuit engines based on the information;
evaluating, using the one or more of the circuit engines, the inference request; and
providing a result of the inference request to the first client application.
|
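The assigning step of claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the claim does not specify how current status and historical performance are combined, so the policy below (choose idle engines, ranked by lowest average past latency) is an assumption. The names `EngineInfo`, `past_latencies_ms`, and `assign_request` are hypothetical and do not come from the claim.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class EngineInfo:
    """Hypothetical per-engine record: current status plus historical performance."""
    engine_id: int
    busy: bool                                              # current status
    past_latencies_ms: list = field(default_factory=list)   # historical performance

    def average_latency_ms(self) -> float:
        # Assumption: an engine with no history is ranked last.
        if not self.past_latencies_ms:
            return float("inf")
        return mean(self.past_latencies_ms)


def assign_request(engines, count=1):
    """Assign a request to up to `count` idle engines with the lowest average latency."""
    idle = [e for e in engines if not e.busy]
    if not idle:
        raise RuntimeError("no idle circuit engine available")
    # Ties on latency are broken by engine id so the choice is deterministic.
    ranked = sorted(idle, key=lambda e: (e.average_latency_ms(), e.engine_id))
    chosen = ranked[:count]
    for e in chosen:
        e.busy = True  # mark the engine as occupied by this request
    return [e.engine_id for e in chosen]


engines = [
    EngineInfo(0, busy=True, past_latencies_ms=[2.0, 2.2]),
    EngineInfo(1, busy=False, past_latencies_ms=[5.0, 4.0]),
    EngineInfo(2, busy=False, past_latencies_ms=[3.0, 3.5]),
]
print(assign_request(engines))  # prints [2]
```

Engine 0 has the best history but is busy, so the request goes to engine 2, the faster of the two idle engines. In the claim this decision is made by the second system routine in kernel space; the sketch models only the selection logic, not the user-space and kernel-space split.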