| CPC G06N 3/08 (2013.01) [G06F 8/43 (2013.01); G06F 8/447 (2013.01); G06F 8/451 (2013.01); G06N 3/044 (2023.01)] | 19 Claims |

|
1. A computer-implemented method comprising:
extracting, by one or more processors, operator granularity from an artificial intelligence framework and an original model;
receiving, by one or more processors, device characteristics from a target device, the target device for deploying an inferencing service;
generating, by one or more processors, a converter granularity level to a converter based on the operator granularity and the device characteristics;
generating, by one or more processors, a compiler granularity level to a compiler based on the operator granularity and the device characteristics; and
deploying, by one or more processors, the inferencing service on the target device based on the converter granularity level and the compiler granularity level, wherein the inferencing service has a different operator granularity than the original model.
|
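The steps recited in claim 1 can be read as a small planning pipeline. The following is a minimal sketch under stated assumptions, not the claimed implementation: the claim does not say how granularity is measured or how the two levels are derived, so the `Granularity` scale, the `DeviceCharacteristics` fields, the thresholds, and every function name below are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


class Granularity(IntEnum):
    """Hypothetical three-step scale; the claim defines no such scale."""
    FINE = 1    # many small primitive operators
    MEDIUM = 2
    COARSE = 3  # few large fused operators


@dataclass(frozen=True)
class DeviceCharacteristics:
    """Assumed device properties; the claim only says 'device characteristics'."""
    memory_mb: int
    supports_fused_kernels: bool


@dataclass(frozen=True)
class DeploymentPlan:
    original: Granularity
    converter_level: Granularity
    compiler_level: Granularity

    @property
    def changes_granularity(self) -> bool:
        # True when either stage is asked to move away from the original level.
        return (self.converter_level != self.original
                or self.compiler_level != self.original)


def extract_operator_granularity(model_ops: list[str],
                                 primitive_ops: set[str]) -> Granularity:
    """Classify a model by the share of its operators that are framework
    primitives (an assumed metric with assumed thresholds)."""
    if not model_ops:
        raise ValueError("model has no operators")
    share = sum(op in primitive_ops for op in model_ops) / len(model_ops)
    if share >= 0.75:
        return Granularity.FINE
    if share >= 0.25:
        return Granularity.MEDIUM
    return Granularity.COARSE


def generate_converter_level(op_gran: Granularity,
                             device: DeviceCharacteristics) -> Granularity:
    # Assumed rule: devices with fused kernels get one step coarser operators.
    if device.supports_fused_kernels:
        return Granularity(min(op_gran + 1, Granularity.COARSE))
    return op_gran


def generate_compiler_level(op_gran: Granularity,
                            device: DeviceCharacteristics) -> Granularity:
    # Assumed rule: low-memory devices get one step finer operators.
    if device.memory_mb < 512:
        return Granularity(max(op_gran - 1, Granularity.FINE))
    return op_gran


def plan_deployment(model_ops: list[str], primitive_ops: set[str],
                    device: DeviceCharacteristics) -> DeploymentPlan:
    """Chain the claimed steps: extract, then generate both levels."""
    op_gran = extract_operator_granularity(model_ops, primitive_ops)
    return DeploymentPlan(
        original=op_gran,
        converter_level=generate_converter_level(op_gran, device),
        compiler_level=generate_compiler_level(op_gran, device),
    )
```

For example, calling `plan_deployment(["conv2d", "relu", "add", "layer_norm"], {"conv2d", "relu", "add", "matmul"}, DeviceCharacteristics(memory_mb=256, supports_fused_kernels=True))` classifies the model as `FINE`, asks the converter for `MEDIUM`, and leaves the compiler at `FINE`. The plan therefore changes granularity relative to the original model, which mirrors the final "wherein" clause of the claim. The actual deployment step is left out because it depends on the converter and compiler in use.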