US 12,361,282 B2
Optimizing operator granularity in compiling and converting artificial intelligence (AI) models
Li Cao, Beijing (CN); Zhan Peng Huo, Beijing (CN); WeiFeng Zhang, Shenzhen (CN); Wei Cui, Beijing (CN); Fei Fei Li, Huang Pu (CN); Ren Jie Feng, Shanghai (CN); Han Su, Shanghai (CN); and Zhong Hao Wang, Dalian (CN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on May 27, 2021, as Appl. No. 17/331,888.
Prior Publication US 2022/0383095 A1, Dec. 1, 2022
Int. Cl. G06N 3/08 (2023.01); G06F 8/41 (2018.01); G06N 3/044 (2023.01)
CPC G06N 3/08 (2013.01) [G06F 8/43 (2013.01); G06F 8/447 (2013.01); G06F 8/451 (2013.01); G06N 3/044 (2023.01)] 19 Claims
OG exemplary drawing
 
1. A computer-implemented method comprising:
extracting, by one or more processors, operator granularity from an artificial intelligence framework and an original model;
receiving, by one or more processors, device characteristics from a target device, the target device for deploying an inferencing service;
generating, by one or more processors, a converter granularity level to a converter based on the operator granularity and the device characteristics;
generating, by one or more processors, a compiler granularity level to a compiler based on the operator granularity and the device characteristics; and
deploying, by one or more processors, the inferencing service on the target device based on the converter granularity level and the compiler granularity level, wherein the inferencing service has a different operator granularity than the original model.
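The claimed flow — extract operator granularity, read device characteristics, derive converter and compiler granularity levels, then deploy — can be sketched in Python. Everything below is a hypothetical illustration: the class and function names, the numeric granularity encoding, and the memory/fusion heuristic are assumptions for exposition only, not the patented method.

```python
from dataclasses import dataclass

@dataclass
class DeviceCharacteristics:
    """Assumed device characteristics received from the target device."""
    memory_mb: int
    supports_fused_ops: bool

# Assumed encoding: operator granularity as the number of primitive
# operations each framework-level operator spans in the original model.
ORIGINAL_MODEL_GRANULARITY = {"LSTM": 8, "Attention": 12, "Dense": 1}

def extract_operator_granularity(model_ops):
    """Step 1: extract per-operator granularity from the AI framework
    and the original model (here, just a copy of a lookup table)."""
    return dict(model_ops)

def choose_granularity_level(granularity, device):
    """Steps 3-4 (illustrative heuristic only): keep the coarsest
    granularity when the device can fuse operators and has memory
    headroom; otherwise decompose to primitive (level-1) operators."""
    coarsest = max(granularity.values())
    if device.supports_fused_ops and device.memory_mb >= 1024:
        return coarsest
    return 1

def deploy_inferencing_service(granularity, device):
    """Step 5: deploy with converter/compiler granularity levels that
    may differ from the original model's operator granularity."""
    return {
        "converter_level": choose_granularity_level(granularity, device),
        "compiler_level": choose_granularity_level(granularity, device),
    }

if __name__ == "__main__":
    gran = extract_operator_granularity(ORIGINAL_MODEL_GRANULARITY)
    # A constrained edge device forces fine-grained (level-1) operators,
    # so the deployed service's granularity differs from the model's.
    edge = DeviceCharacteristics(memory_mb=256, supports_fused_ops=False)
    print(deploy_inferencing_service(gran, edge))
```

Running the sketch against a constrained device yields level-1 (fully decomposed) operators, while a fusion-capable device with ample memory retains the original coarse operators — matching the claim's point that the deployed inferencing service may have a different operator granularity than the original model.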