US 12,175,252 B2
Concurrent multi-datatype execution within a processing resource
Elmoustapha Ould-Ahmed-Vall, Chandler, AZ (US); Barath Lakshmanan, Chandler, AZ (US); Tatiana Shpeisman, Menlo Park, CA (US); Joydeep Ray, Folsom, CA (US); Ping T. Tang, Edison, NJ (US); Michael Strickland, Sunnyvale, CA (US); Xiaoming Chen, Shanghai (CN); Anbang Yao, Beijing (CN); Ben J. Ashbaugh, Folsom, CA (US); Linda L. Hurd, Cool, CA (US); and Liwei Ma, Beijing (CN)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Jun. 14, 2022, as Appl. No. 17/839,856.
Application 17/839,856 is a continuation of application No. 15/819,167, filed on Nov. 21, 2017, granted, now 11,409,537.
Application 15/819,167 is a continuation of application No. 15/494,773, filed on Apr. 24, 2017, granted, now 10,409,614, issued on Sep. 10, 2019.
Prior Publication US 2022/0382555 A1, Dec. 1, 2022
Int. Cl. G06F 9/38 (2018.01); G06F 9/30 (2018.01); G06F 9/50 (2006.01); G06F 13/40 (2006.01); G06F 13/42 (2006.01); G06F 15/80 (2006.01); G06N 3/00 (2023.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/063 (2023.01); G06N 3/084 (2023.01); G06N 20/00 (2019.01); G06N 20/10 (2019.01); G06T 1/20 (2006.01)
CPC G06F 9/3887 (2013.01) [G06F 9/3001 (2013.01); G06F 9/30014 (2013.01); G06F 9/30036 (2013.01); G06F 9/30094 (2013.01); G06F 9/30109 (2013.01); G06F 9/30112 (2013.01); G06F 9/3016 (2013.01); G06F 9/3851 (2013.01); G06F 9/3891 (2013.01); G06F 9/50 (2013.01); G06F 13/4068 (2013.01); G06F 13/4282 (2013.01); G06F 15/80 (2013.01); G06N 3/00 (2013.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/063 (2013.01); G06N 3/084 (2013.01); G06N 20/00 (2019.01); G06N 20/10 (2019.01); G06T 1/20 (2013.01); G06F 2213/0026 (2013.01)]
24 Claims
1. A graphics processing unit (GPU) comprising:
a processing cluster comprising a plurality of multiprocessors interconnected via a data crossbar, the plurality of multiprocessors configured to distribute processed data among the plurality of multiprocessors directly via the data crossbar, from a first multiprocessor of the plurality of multiprocessors to a second multiprocessor of the plurality of multiprocessors, wherein a multiprocessor of the plurality of multiprocessors comprises:
an instruction cache to store a first instruction and a second instruction, the first instruction to cause the multiprocessor to perform a floating-point operation and the second instruction to cause the multiprocessor to perform an integer operation; and
a plurality of general-purpose graphics compute units having a single instruction, multiple thread architecture, the plurality of general-purpose graphics compute units including a first general-purpose graphics compute unit to execute the first instruction concurrently with execution of the second instruction by a second general-purpose graphics compute unit.
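The arrangement recited in claim 1 can be illustrated with a small software model. The following is a minimal sketch only, not the claimed hardware: the names Instruction, ComputeUnit, run_multiprocessor, FMUL, and IADD are hypothetical, the instruction cache is modeled as a plain list, and Python threads merely stand in for compute units that run at the same time. Each unit applies its one instruction across several lanes, loosely mirroring a single instruction, multiple thread arrangement.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple, Union

Number = Union[int, float]


@dataclass(frozen=True)
class Instruction:
    """Hypothetical instruction record: a name, a datatype tag, and an operation."""
    name: str
    datatype: str  # "fp" or "int"
    op: Callable[[Number, Number], Number]


class ComputeUnit:
    """Toy stand-in for one general-purpose graphics compute unit."""

    def __init__(self, unit_id: int) -> None:
        self.unit_id = unit_id

    def execute(self, instr: Instruction,
                a_lanes: Sequence[Number],
                b_lanes: Sequence[Number]) -> List[Number]:
        # SIMT-style: the same instruction is applied to every thread lane.
        return [instr.op(a, b) for a, b in zip(a_lanes, b_lanes)]


def run_multiprocessor(
    instruction_cache: Sequence[Instruction],
    operands: Sequence[Tuple[Sequence[Number], Sequence[Number]]],
) -> List[List[Number]]:
    """Issue each cached instruction to its own compute unit, all at once."""
    units = [ComputeUnit(i) for i in range(len(instruction_cache))]
    with ThreadPoolExecutor(max_workers=len(units)) as pool:
        futures = [
            pool.submit(unit.execute, instr, a, b)
            for unit, instr, (a, b) in zip(units, instruction_cache, operands)
        ]
        # Results are collected in issue order, one list of lanes per unit.
        return [f.result() for f in futures]


# Assumed example instructions: one floating-point, one integer.
FMUL = Instruction("fmul", "fp", lambda a, b: float(a) * float(b))
IADD = Instruction("iadd", "int", lambda a, b: int(a) + int(b))

fp_result, int_result = run_multiprocessor(
    [FMUL, IADD],
    [
        ([1.5, 2.0, 0.5, 4.0], [2.0, 2.0, 2.0, 2.0]),  # lanes for the first unit
        ([1, 2, 3, 4], [10, 20, 30, 40]),              # lanes for the second unit
    ],
)
print(fp_result)
print(int_result)
```

In this model the first unit handles the floating-point instruction while the second handles the integer instruction, and neither waits on the other before being issued. The data crossbar between multiprocessors is not modeled.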