US 11,886,984 B2
Variable precision and mix type representation of multiple layers in a network
Uzi Sarel, Zichron-Yaakov (IL); Ehud Cohen, Kiryat Matskin (IL); Tomer Schwartz, Even Yehuda (IL); Amitai Armon, Tel Aviv (IL); Yahav Shadmiy, Ramat Gan (IL); Amit Bleiweiss, Yad Binyamin (IL); Gal Leibovich, Kiryat Yam (IL); Jeremie Dreyfuss, Tel-Aviv (IL); Lev Faivishevsky, Kfar Saba (IL); Tomer Bar-On, Petah Tikva (IL); Yaniv Fais, Tel-Aviv (IL); and Jacob Subag, Kiryat Haim (IL)
Assigned to INTEL CORPORATION, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Aug. 10, 2021, as Appl. No. 17/398,302.
Application 17/398,302 is a continuation of application No. 15/499,896, filed on Apr. 28, 2017, granted, now 11,093,822.
Prior Publication US 2022/0067496 A1, Mar. 3, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/063 (2023.01); G06N 3/084 (2023.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06F 9/30 (2018.01)
CPC G06N 3/063 (2013.01) [G06F 9/30014 (2013.01); G06F 9/30025 (2013.01); G06F 9/30043 (2013.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/084 (2013.01)] 20 Claims
1. An apparatus comprising:
a processor to:
expose embedded cast operations in at least one of a load instruction or a store instruction of a stream of instructions;
determine, for each layer of a multi-layer deep learning neural network (DNN), a target precision level for the cast operations at each layer and data types of a plurality of different data types for the cast operations at each layer, wherein the target precision level for the cast operations at each layer is determined from the plurality of different data types that are used to represent various weights in different layers of the multi-layer DNN, and wherein high precision floating point data is utilized for a first subset of the different layers, low precision floating point data is utilized for a second subset of the different layers, and integer data is utilized for a third subset of the different layers; and
load the cast operations at the target precision level and the data types determined for the cast operations at each layer of the multi-layer DNN.
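For illustration only, the per-layer selection recited in claim 1 can be sketched in software. The claim describes cast operations embedded in processor load and store instructions; the sketch below merely emulates that effect by round-tripping values through the chosen representation at load time. The names DType, load_with_cast, load_model and LAYER_PLAN, the layer names, and the INT8 scale factor are all hypothetical and do not appear in the patent.

```python
import struct
from enum import Enum


class DType(Enum):
    FP32 = "fp32"  # high precision floating point
    FP16 = "fp16"  # low precision floating point
    INT8 = "int8"  # integer


def load_with_cast(values, dtype, scale=1.0):
    """Emulate a load with an embedded cast to the target data type.

    Assumption: 'scale' is a hypothetical per-layer quantization step,
    used only for the INT8 case (symmetric quantization, clamped).
    """
    if dtype is DType.FP32:
        # Round-trip through IEEE 754 single precision.
        return [struct.unpack("<f", struct.pack("<f", v))[0] for v in values]
    if dtype is DType.FP16:
        # Round-trip through IEEE 754 half precision.
        return [struct.unpack("<e", struct.pack("<e", v))[0] for v in values]
    # INT8: scale, round to nearest integer, clamp to the signed 8-bit range.
    return [max(-128, min(127, round(v / scale))) for v in values]


# Hypothetical per-layer plan: one (data type, scale) pair per layer, so that
# different subsets of layers use high precision FP, low precision FP, or integers.
LAYER_PLAN = {
    "conv1": (DType.FP32, 1.0),
    "conv2": (DType.FP16, 1.0),
    "fc": (DType.INT8, 0.01),
}


def load_model(weights_by_layer, plan):
    """Load every layer's weights at the precision the plan assigns to it."""
    return {
        name: load_with_cast(weights, *plan[name])
        for name, weights in weights_by_layer.items()
    }


weights = {
    "conv1": [0.1, -0.25],
    "conv2": [0.1, -0.25],
    "fc": [0.5, 2.0, -3.0],
}
loaded = load_model(weights, LAYER_PLAN)
```

In this sketch the plan is supplied by hand. How the target precision level is actually determined for each layer is governed by the claim language and the specification, not by this example.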