US 12,032,496 B2
Efficient data sharing for graphics data processing operations
Joydeep Ray, Folsom, CA (US); Altug Koker, El Dorado Hills, CA (US); Elmoustapha Ould-Ahmed-Vall, Chandler, AZ (US); Michael Macpherson, Portland, OR (US); Aravindh V. Anantaraman, Folsom, CA (US); Vasanth Ranganathan, El Dorado Hills, CA (US); Lakshminarayanan Striramassarma, El Dorado Hills, CA (US); Varghese George, Folsom, CA (US); Abhishek Appu, El Dorado Hills, CA (US); and Prasoonkumar Surti, Folsom, CA (US)
Assigned to INTEL CORPORATION, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Jul. 25, 2023, as Appl. No. 18/358,550.
Application 18/358,550 is a continuation of application No. 17/212,503, filed on Mar. 25, 2021, granted, now 11,755,501.
Claims priority of provisional application 63/000,784, filed on Mar. 27, 2020.
Prior Publication US 2024/0012767 A1, Jan. 11, 2024
Int. Cl. G06F 13/16 (2006.01); G06F 9/30 (2018.01); G06F 9/38 (2018.01); G06F 9/50 (2006.01); G06T 1/20 (2006.01); G06T 1/60 (2006.01)
CPC G06F 13/1605 (2013.01) [G06F 9/3004 (2013.01); G06F 9/3887 (2013.01); G06F 9/5016 (2013.01); G06T 1/20 (2013.01); G06T 1/60 (2013.01)]
20 Claims
 
1. An apparatus comprising:
a graphics processing unit (GPU) having a processing resource comprising:
arithmetic logic units (ALUs) comprising vector ALUs and scalar ALUs; and
a set of register files, the set of register files comprising:
a vector register bank comprising a set of vector registers; and
a scalar register bank comprising a set of scalar registers;
wherein the processing resource is to:
determine that a same constant is loaded by more than one thread executed by the processing resource;
load the constant in a scalar register of the set of scalar registers in the scalar register bank; and
cause threads executed by the processing resource to access the scalar register storing the constant using one or more of the scalar ALUs.
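
The determine, load, and access steps recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the function names (find_shared_constants, place_constants, read_operand), the list-based "banks", and the ("s", index) / ("v", index) operand tags are all hypothetical and do not come from the patent. The model keeps one copy of each constant that more than one thread loads in a scalar bank, and leaves thread-private constants in per-thread storage.

```python
from collections import Counter


def find_shared_constants(loads):
    """Return constants loaded by more than one thread.

    `loads` maps a thread id to the list of constants that thread loads.
    (Hypothetical input format, chosen for illustration only.)
    """
    counts = Counter()
    for constants in loads.values():
        counts.update(set(constants))  # count each thread at most once per constant
    return sorted(c for c, n in counts.items() if n > 1)


def place_constants(loads):
    """Place shared constants once in a scalar bank, the rest per thread.

    Returns (scalar_bank, vector_bank, operand_map), where operand_map tells
    each thread where to find each of its constants.
    """
    scalar_bank = find_shared_constants(loads)      # one register per shared constant
    index_of = {c: i for i, c in enumerate(scalar_bank)}
    vector_bank, operand_map = {}, {}
    for tid, constants in loads.items():
        private, refs = [], []
        for c in constants:
            if c in index_of:
                refs.append(("s", index_of[c]))     # read the single shared copy
            else:
                refs.append(("v", len(private)))    # thread-private copy
                private.append(c)
        vector_bank[tid] = private
        operand_map[tid] = refs
    return scalar_bank, vector_bank, operand_map


def read_operand(ref, tid, scalar_bank, vector_bank):
    """Model an operand read from the scalar or the vector bank."""
    kind, idx = ref
    return scalar_bank[idx] if kind == "s" else vector_bank[tid][idx]


# Usage: three threads all load 3.14; threads 0 and 2 also share 7.
loads = {0: [3.14, 7], 1: [3.14, 9], 2: [3.14, 7]}
scalar_bank, vector_bank, operand_map = place_constants(loads)
```

In this model the shared constants occupy two scalar entries instead of five per-thread entries, which is the kind of storage saving that sharing a scalar register among threads is meant to suggest.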
 
10. A method comprising:
allocating, by a processing resource of a graphics processor, a set of scalar registers in a scalar register bank of the graphics processor, wherein the graphics processor comprises the scalar register bank and a vector register bank;
determining, by the processing resource, that a same constant is loaded by more than one thread executed by the processing resource;
loading, by the processing resource, the constant in a scalar register of the set of scalar registers in the scalar register bank; and
causing, by the processing resource, threads executed by the processing resource to access the scalar register storing the constant using one or more of the scalar arithmetic logic units (ALUs).
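
The ordering of the method steps in claim 10 (allocate, determine, load, access) can be sketched as follows. This is a minimal illustration under stated assumptions: the ScalarRegisterBank class, its free-list allocator, the default bank size of 8, and the run_method helper are all hypothetical and are not taken from the patent, which does not specify how allocation is performed.

```python
class ScalarRegisterBank:
    """Hypothetical scalar register bank with a simple free list."""

    def __init__(self, size=8):  # size is an arbitrary illustrative value
        self.values = [None] * size
        self.free = list(range(size))

    def allocate(self, count):
        """Reserve `count` scalar registers and return their indices."""
        if count > len(self.free):
            raise RuntimeError("not enough free scalar registers")
        reserved, self.free = self.free[:count], self.free[count:]
        return reserved


def run_method(bank, loads, reserve=2):
    """Walk the four steps in order for a {thread_id: constant} map.

    Returns a {constant: register_index} placement for shared constants.
    """
    # Step 1: allocate a set of scalar registers in the scalar register bank.
    regs = bank.allocate(reserve)

    # Step 2: determine which constants are loaded by more than one thread.
    seen, shared = set(), []
    for constant in loads.values():
        if constant in seen and constant not in shared:
            shared.append(constant)
        seen.add(constant)

    # Step 3: load each shared constant into one of the allocated registers.
    # zip() stops at the shorter input, so extra constants stay unplaced.
    placement = {}
    for reg, constant in zip(regs, shared):
        bank.values[reg] = constant
        placement[constant] = reg
    return placement


def thread_read(bank, placement, constant):
    """Step 4: a thread accesses the scalar register storing the constant."""
    return bank.values[placement[constant]]


# Usage: threads 0, 1 and 3 load the same constant 0.5; thread 2 loads 2.0.
bank = ScalarRegisterBank()
placement = run_method(bank, {0: 0.5, 1: 0.5, 2: 2.0, 3: 0.5})
```

Reserving registers before the shared constants are known mirrors the order recited in the claim; a real implementation could size the allocation differently.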
 
16. A non-transitory computer-readable medium having instructions stored thereon, which when executed by one or more processors, cause the one or more processors to:
allocate, by a processing resource of the one or more processors comprising a graphics processor, a set of scalar registers in a scalar register bank of the graphics processor, wherein the graphics processor comprises the scalar register bank and a vector register bank;
determine, by the processing resource, that a same constant is loaded by more than one thread executed by the processing resource;
load, by the processing resource, the constant in a scalar register of the set of scalar registers in the scalar register bank; and
cause, by the processing resource, threads executed by the processing resource to access the scalar register storing the constant using one or more of the scalar arithmetic logic units (ALUs).