US 11,748,106 B2
Data operations and finite state machine for machine learning via bypass of computational tasks based on frequently-used data values
Liwei Ma, Beijing (CN); Nadathur Rajagopalan Satish, Santa Clara, CA (US); Jeremy Bottleson, Rancho Cordova, CA (US); Farshad Akhbari, Chandler, AZ (US); Eriko Nurvitadhi, Hillsboro, OR (US); Abhishek R. Appu, El Dorado Hills, CA (US); Altug Koker, El Dorado Hills, CA (US); Kamal Sinha, Rancho Cordova, CA (US); Joydeep Ray, Folsom, CA (US); Balaji Vembu, Folsom, CA (US); Vasanth Ranganathan, El Dorado Hills, CA (US); and Sanjeev Jahagirdar, Folsom, CA (US)
Assigned to INTEL CORPORATION, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Mar. 1, 2022, as Appl. No. 17/683,564.
Application 17/683,564 is a continuation of application No. 15/482,798, filed on Apr. 9, 2017, granted, now Pat. No. 11,269,643.
Prior Publication US 2022/0253317 A1, Aug. 11, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 9/38 (2018.01)
CPC G06F 9/3832 (2013.01)
20 Claims
 
1. An apparatus comprising:
a graphics processor comprising computation circuitry to:
implement a frequently-used data value (FDV) configuration that is to identify a plurality of FDVs, wherein the FDV configuration is to provide a data list consisting of a set of data values defined as FDVs;
apply the FDV configuration to input data received at the computation circuitry, the input data to be used in computational tasks executed by the computation circuitry;
identify, based on the FDV configuration, occurrence in the input data of defined FDVs from the set of data values and of one or more non-FDVs, wherein the one or more non-FDVs consist of other data values that are not comprised in the set of data values;
for the identified FDVs of the input data, cause the identified FDVs to bypass the computational tasks; and
for the one or more non-FDVs of the input data, cause the one or more non-FDVs to be processed by a finite state machine (FSM) implemented by the computation circuitry, wherein the FSM is to:
provide a common primitive for convolution and full connection computation for the computational tasks;
combine memory read accesses; and
merge two or more mathematical operations of the computational tasks.
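The data flow recited in claim 1 can be illustrated with a minimal software sketch. It is not the claimed circuitry, and it rests on several assumptions that do not come from the patent text: the FDV data list holds only the value 0.0 (a zero input contributes nothing to a dot product, so skipping it leaves the result unchanged), the "common primitive" is modeled as a multiply-accumulate loop shared by a fully connected layer and a 1-D convolution, and the names `FDV_LIST`, `mac_with_bypass`, `fully_connected`, and `conv1d_valid` are hypothetical. Combining memory read accesses is a hardware concern and is not modeled here.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Hypothetical FDV configuration: a data list of values defined as FDVs.
# Assumption: only 0.0 is listed, so bypassing it cannot change a dot product.
FDV_LIST: FrozenSet[float] = frozenset({0.0})


def mac_with_bypass(inputs: Sequence[float],
                    weights: Sequence[float]) -> Tuple[float, int]:
    """Dot product that skips FDV inputs.

    Returns (accumulated result, number of inputs that bypassed the work).
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    acc = 0.0
    bypassed = 0
    for x, w in zip(inputs, weights):
        if x in FDV_LIST:
            bypassed += 1   # identified FDV: no multiply, no add
            continue
        acc += x * w        # non-FDV: multiply and add merged into one step
    return acc, bypassed


def fully_connected(x: Sequence[float],
                    weight_rows: Sequence[Sequence[float]]) -> List[float]:
    """Fully connected layer: one shared primitive call per output neuron."""
    return [mac_with_bypass(x, row)[0] for row in weight_rows]


def conv1d_valid(signal: Sequence[float],
                 kernel: Sequence[float]) -> List[float]:
    """1-D 'valid' convolution (cross-correlation form, kernel not flipped),
    built from the same primitive applied to each sliding window."""
    k = len(kernel)
    return [mac_with_bypass(signal[i:i + k], kernel)[0]
            for i in range(len(signal) - k + 1)]


if __name__ == "__main__":
    result, skipped = mac_with_bypass([0.0, 2.0, 0.0, 3.0],
                                      [5.0, 1.0, 7.0, 2.0])
    print(result, skipped)
```

In this sketch, sparse inputs (many zeros) perform proportionally fewer multiply-accumulate steps. A data list containing nonzero FDVs would need a different bypass rule, for example substituting a precomputed contribution, which this example does not attempt.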