US 12,117,962 B2
Scalar core integration
Joydeep Ray, Folsom, CA (US); Aravindh Anantaraman, Folsom, CA (US); Abhishek R. Appu, El Dorado Hills, CA (US); Altug Koker, El Dorado Hills, CA (US); Elmoustapha Ould-Ahmed-Vall, Chandler, AZ (US); Valentin Andrei, San Jose, CA (US); Subramaniam Maiyuran, Gold River, CA (US); Nicolas Galoppo Von Borries, Portland, OR (US); Varghese George, Folsom, CA (US); Mike Macpherson, Portland, OR (US); Ben Ashbaugh, Folsom, CA (US); Murali Ramadoss, Folsom, CA (US); Vikranth Vemulapalli, Folsom, CA (US); William Sadler, Folsom, CA (US); Jonathan Pearce, Portland, OR (US); and Sungye Kim, Folsom, CA (US)
Assigned to INTEL CORPORATION, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Aug. 16, 2023, as Appl. No. 18/450,685.
Application 18/450,685 is a continuation of application No. 17/868,448, filed on Jul. 19, 2022, granted, now 11,762,804.
Application 17/868,448 is a continuation of application No. 17/321,885, filed on May 17, 2021, granted, now 11,409,693, issued on Aug. 9, 2022.
Application 17/321,885 is a continuation of application No. 16/354,782, filed on Mar. 15, 2019, granted, now 11,016,929, issued on May 25, 2021.
Prior Publication US 2024/0045830 A1, Feb. 8, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06T 1/00 (2006.01); G06F 9/30 (2018.01); G06F 9/38 (2018.01); G06F 15/80 (2006.01); G06T 15/00 (2011.01)

CPC G06F 15/8069 (2013.01) [G06F 9/30163 (2013.01); G06F 9/3877 (2013.01); G06T 15/005 (2013.01); G06F 9/3836 (2013.01)]

18 Claims

1. An apparatus comprising:

a scalar processor complex comprising a plurality of scalar processor cores;

a vector processor complex comprising a plurality of vector processor cores;

a hardware accelerator bank comprising a tensor core to perform matrix processing for deep learning operations using a plurality of operand precisions; and

a pre-processor communicably coupled to the scalar processor complex, the vector processor complex, and the hardware accelerator bank, wherein the pre-processor to:

receive a binary translation of a code segment of a plurality of code segments corresponding to a set of workload instructions for a graphics workload from a host processor;

analyze operations of the binary translation to identify whether the operations are suitable for execution by one of the scalar processor complex, the vector processor complex, or the hardware accelerator bank; and

assign, to the scalar processor complex, the operations of the binary translation that are identified as suitable for execution by the scalar processor complex.
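The receive, analyze, and assign steps recited for the pre-processor can be illustrated with a minimal sketch. The claim does not state how suitability is determined, so the criteria used here (a matrix flag and a lane count) are assumptions for illustration only, as are all names (`Operation`, `Target`, `classify`, `assign`). The sketch also routes operations to all three targets, whereas the claim only recites assignment to the scalar processor complex.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Target(Enum):
    """Execution resources named in the claim."""
    SCALAR = auto()       # scalar processor complex
    VECTOR = auto()       # vector processor complex
    ACCELERATOR = auto()  # hardware accelerator bank (e.g. tensor core)


@dataclass(frozen=True)
class Operation:
    """Hypothetical record for one operation of a translated code segment."""
    name: str
    lanes: int = 1           # assumed: number of data elements processed together
    is_matrix: bool = False  # assumed: marks a matrix (deep learning) operation


def classify(op: Operation) -> Target:
    """Assumed suitability rule; the claim leaves the criteria unspecified."""
    if op.is_matrix:
        return Target.ACCELERATOR
    if op.lanes > 1:
        return Target.VECTOR
    return Target.SCALAR


def assign(ops):
    """Analyze each operation and append it to the queue of its target."""
    queues = {target: [] for target in Target}
    for op in ops:
        queues[classify(op)].append(op.name)
    return queues


# A made-up code segment standing in for the binary translation received
# from the host processor.
segment = [
    Operation("branch"),
    Operation("fma", lanes=16),
    Operation("matmul", is_matrix=True),
    Operation("addr_calc"),
]
queues = assign(segment)
print(queues[Target.SCALAR])
```

In this sketch the control-flow and address-calculation operations land in the scalar queue, which corresponds to the final assigning step of the claim.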