CPC H04L 41/0896 (2013.01) [G06F 16/2219 (2019.01); G06N 3/08 (2013.01); G06N 20/00 (2019.01)] | 17 Claims |
1. A method comprising:
receiving a target bandwidth increase for a machine learning (ML) model comprising a plurality of objects of a first data type represented by a first number of bits, wherein the target bandwidth increase relates to changing a first portion of the plurality of objects to a second data type represented by a second number of bits different from the first number of bits and changing a second portion of the plurality of objects to a third data type represented by a third number of bits different from both the first and second numbers of bits;
sorting the plurality of objects in the ML model based on bandwidth, comprising:
identifying, for a first object of the plurality of objects in the ML model, a total bandwidth of all instances of the first object in the ML model;
identifying both: (i) the first portion of the plurality of objects in the ML model to change from the first data type to the second data type and (ii) the second portion of the plurality of objects in the ML model to change from the first data type to the third data type, based on the target bandwidth increase and the sorting of the plurality of objects, comprising:
selecting both the first portion and the second portion, from among the plurality of objects, based on maintaining a total bandwidth for the ML model at or below the target bandwidth increase and using the sorted plurality of objects; and
changing, by a processor and based on the selecting both the first portion and the second portion, the first portion of the plurality of objects from the first data type to the second data type and the second portion of the plurality of objects from the first data type to the third data type.
|
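The selection recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and it rests on several assumptions not stated in the claim: the target is read as a ceiling on total bits moved, the three data types are 32, 16 and 8 bits wide, an object's bandwidth is its element count times its instance count times its bit width, and a greedy largest-first pass is used to pick the two portions. The names `ModelObject`, `total_bandwidth` and `plan_data_types` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelObject:
    name: str
    elements: int   # values in one instance of the object (assumed metric)
    instances: int  # how many times the object appears in the model


def total_bandwidth(obj, bits):
    """Bits moved by all instances of obj at the given bit width."""
    return obj.elements * obj.instances * bits


def plan_data_types(objects, target_bits,
                    first_bits=32, second_bits=16, third_bits=8):
    """Greedy sketch: narrow the heaviest objects until the budget is met.

    Returns (first_portion, second_portion, achieved_total), where
    first_portion ends at second_bits and second_portion at third_bits.
    """
    # Sort by total bandwidth over all instances, largest first.
    ranked = sorted(objects,
                    key=lambda o: total_bandwidth(o, first_bits),
                    reverse=True)
    widths = {o.name: first_bits for o in ranked}
    total = sum(total_bandwidth(o, first_bits) for o in ranked)

    # Pass 1 narrows to the second type; pass 2 narrows further to the third.
    for new_bits in (second_bits, third_bits):
        for obj in ranked:
            if total <= target_bits:
                break  # budget met, stop narrowing
            old_bits = widths[obj.name]
            if new_bits < old_bits:
                total -= (total_bandwidth(obj, old_bits)
                          - total_bandwidth(obj, new_bits))
                widths[obj.name] = new_bits

    first_portion = [n for n, b in widths.items() if b == second_bits]
    second_portion = [n for n, b in widths.items() if b == third_bits]
    return first_portion, second_portion, total


objects = [
    ModelObject("a", elements=100, instances=2),
    ModelObject("b", elements=50, instances=1),
    ModelObject("c", elements=10, instances=1),
]
first, second, achieved = plan_data_types(objects, target_bits=4000)
```

In this sketch the returned total lets a caller check whether the target was actually reached, since a very low target may remain unmet even after every object is narrowed to the third data type.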