US 12,015,526 B2
Mixed-precision neural networks
Thomas Pennello, Mountain View, CA (US)
Assigned to Synopsys, Inc., Sunnyvale, CA (US)
Filed by Synopsys, Inc., Mountain View, CA (US)
Filed on Apr. 30, 2021, as Appl. No. 17/246,156.
Claims priority of provisional application 63/030,300, filed on May 26, 2020.
Prior Publication US 2021/0377122 A1, Dec. 2, 2021
Int. Cl. H04L 41/0896 (2022.01); G06F 16/22 (2019.01); G06N 3/08 (2023.01); G06N 20/00 (2019.01)
CPC H04L 41/0896 (2013.01) [G06F 16/2219 (2019.01); G06N 3/08 (2013.01); G06N 20/00 (2019.01)]
17 Claims
 
1. A method comprising:
receiving a target bandwidth increase for a machine learning (ML) model comprising a plurality of objects of a first data type represented by a first number of bits, wherein the target bandwidth increase relates to changing a first portion of the plurality of objects to a second data type represented by a second number of bits different from the first number of bits and changing a second portion of the plurality of objects to a third data type represented by a third number of bits different from both the first and second numbers of bits;
sorting the plurality of objects in the ML model based on bandwidth, comprising:
identifying, for a first object of the plurality of objects in the ML model, a total bandwidth of all instances of the first object in the ML model;
identifying both: (i) the first portion of the plurality of objects in the ML model to change from the first data type to the second data type and (ii) the second portion of the plurality of objects in the ML model to change from the first data type to the third data type, based on the target bandwidth increase and the sorting of the plurality of objects, comprising:
selecting both the first portion and the second portion, from among the plurality of objects, based on maintaining a total bandwidth for the ML model at or below the target bandwidth increase and using the sorted plurality of objects; and
changing, by a processor and based on the selecting both the first portion and the second portion, the first portion of the plurality of objects from the first data type to the second data type and the second portion of the plurality of objects from the first data type to the third data type.
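
The steps recited in claim 1 can be illustrated with a minimal Python sketch. This is one possible reading, not the patented implementation, and several points are assumptions that the claim does not specify: the target bandwidth increase is treated as a fraction of the baseline bandwidth, objects are sorted in ascending order of total bandwidth, selection is greedy, and the bit widths (8, 16, 12) are example values. The names MLObject and assign_precisions are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MLObject:
    """Hypothetical stand-in for an object (e.g. a weight tensor) in the ML model."""
    name: str
    instance_elements: List[int]  # element count of each instance of the object
    bits: int = 8                 # first data type (assumed 8 bits here)

    def total_elements(self) -> int:
        return sum(self.instance_elements)

    def total_bandwidth(self) -> int:
        # Total bandwidth of all instances of this object, in bits.
        return self.total_elements() * self.bits


def assign_precisions(
    objects: List[MLObject],
    target_increase: float,   # assumed: fraction of baseline, e.g. 0.25 = +25%
    second_bits: int = 16,    # second data type (assumed width)
    third_bits: int = 12,     # third data type (assumed width)
) -> Tuple[List[str], List[str], int]:
    baseline = sum(o.total_bandwidth() for o in objects)
    budget = baseline * (1.0 + target_increase)
    total = baseline

    # Sort the objects by total bandwidth (ascending order is an assumption).
    ordered = sorted(objects, key=lambda o: o.total_bandwidth())

    first_portion: List[MLObject] = []   # objects to change to the second data type
    second_portion: List[MLObject] = []  # objects to change to the third data type

    # Greedy selection (an assumption): try the second data type first, then
    # the third, and keep the model's total bandwidth at or below the budget.
    for obj in ordered:
        for bits, portion in ((second_bits, first_portion),
                              (third_bits, second_portion)):
            extra = obj.total_elements() * (bits - obj.bits)
            if total + extra <= budget:
                total += extra
                portion.append(obj)
                break

    # Change the data types only after both portions have been selected.
    for obj in first_portion:
        obj.bits = second_bits
    for obj in second_portion:
        obj.bits = third_bits

    return ([o.name for o in first_portion],
            [o.name for o in second_portion],
            total)


# Usage with made-up objects: "b" has two instances of 200 elements each.
model = [MLObject("a", [100]), MLObject("b", [200, 200]), MLObject("c", [1000])]
first, second, new_total = assign_precisions(model, target_increase=0.25)
```

In this sketch the smallest objects are considered first, so the budget is spread over as many objects as possible; a different sort order or selection rule would fit the claim language equally well.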