US 12,223,427 B2
Real time context dependent deep learning
Lev Faivishevsky, Kfar Saba (IL); Tomer Bar-On, Petah Tikva (IL); Yaniv Fais, Tel Aviv (IL); Jacob Subag, Kiryat Haim (IL); Jeremie Dreyfuss, Tel Aviv (IL); Amit Bleiweiss, Yad Binyamin (IL); Tomer Schwartz, Even Yehuda (IL); Raanan Yonatan Yehezkel Rohekar, Kiryat Ekron (IL); Michael Behar, Zichron Yaakov (IL); Amitai Armon, Tel-Aviv (IL); and Uzi Sarel, Zichron-Yaakov (IL)
Assigned to INTEL CORPORATION, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on May 30, 2023, as Appl. No. 18/325,744.
Application 18/325,744 is a continuation of application No. 17/404,153, filed on Aug. 17, 2021, granted, now 11,704,564.
Application 17/404,153 is a continuation of application No. 15/494,887, filed on Apr. 24, 2017, granted, now 11,238,338, issued on Feb. 1, 2022.
Prior Publication US 2023/0394305 A1, Dec. 7, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/08 (2023.01); G06N 20/00 (2019.01); G06N 20/10 (2019.01)
CPC G06N 3/08 (2013.01) [G06N 20/00 (2019.01); G06N 20/10 (2019.01)]
20 Claims
1. An apparatus comprising:
a general purpose graphics processing unit (GPGPU) comprising a plurality of streaming multiprocessors (SMs), the GPGPU to:
receive data inputs for training a neural network executed by the plurality of SMs, wherein the data inputs comprise training data and weights inputs;
perform measurements of latency of the plurality of SMs;
determine a ratio of the latency for the plurality of SMs; and
assign, in accordance with the ratio of the latency, the training data in a low precision form and assign the weights inputs in a high precision form among the plurality of SMs, wherein the low precision form is lower than the high precision form.
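
The steps recited in claim 1 (measure per-SM latency, form a ratio, then distribute low-precision training data and high-precision weights according to that ratio) can be illustrated with a small host-side sketch. This is not the patented implementation. It assumes, purely for illustration, that the "ratio" is each SM's inverse-latency share, that "low precision" means IEEE 754 half precision and "high precision" means single precision, and that both inputs are split into contiguous chunks. The function names `to_half`, `to_single`, `latency_shares`, `split_by_shares`, and `assign` are hypothetical and do not come from the patent.

```python
import struct


def to_half(x):
    """Round a float to IEEE 754 half precision (assumed 'low precision form')."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_single(x):
    """Round a float to IEEE 754 single precision (assumed 'high precision form')."""
    return struct.unpack("f", struct.pack("f", x))[0]


def latency_shares(latencies):
    """Turn measured per-SM latencies into work shares that sum to 1.

    Assumption: a lower-latency SM receives a proportionally larger share.
    """
    inverse = [1.0 / t for t in latencies]
    total = sum(inverse)
    return [v / total for v in inverse]


def split_by_shares(items, shares):
    """Split items into contiguous chunks sized by shares; the last chunk takes the remainder."""
    chunks, start, accumulated = [], 0, 0.0
    for i, share in enumerate(shares):
        accumulated += share
        if i == len(shares) - 1:
            end = len(items)
        else:
            end = round(accumulated * len(items))
        chunks.append(items[start:end])
        start = end
    return chunks


def assign(training_data, weights, latencies):
    """Return one (data_chunk, weight_chunk) pair per SM.

    Training data is converted to the low precision form and weights to the
    high precision form before being distributed by the latency-derived shares.
    """
    shares = latency_shares(latencies)
    data_chunks = split_by_shares([to_half(x) for x in training_data], shares)
    weight_chunks = split_by_shares([to_single(w) for w in weights], shares)
    return list(zip(data_chunks, weight_chunks))
```

For example, with hypothetical measured latencies of `[1.0, 2.0, 2.0]` for three SMs, the shares are 0.5, 0.25, and 0.25, so eight training samples are split 4/2/2 and four weights are split 2/1/1. On real hardware the latency measurements would come from device-side timing rather than a Python list.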