US 11,995,737 B2
Thread scheduling over compute blocks for power optimization
Altug Koker, El Dorado Hills, CA (US); Balaji Vembu, Folsom, CA (US); Joydeep Ray, Folsom, CA (US); James A. Valerio, Hillsboro, OR (US); and Abhishek R. Appu, El Dorado Hills, CA (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Nov. 16, 2021, as Appl. No. 17/527,689.
Application 17/527,689 is a continuation of application No. 16/714,862, filed on Dec. 16, 2019, granted, now 11,227,360.
Application 16/714,862 is a continuation of application No. 15/477,038, filed on Apr. 1, 2017, granted, now 10,521,875, issued on Dec. 31, 2019.
Prior Publication US 2022/0156875 A1, May 19, 2022
Int. Cl. G06T 1/20 (2006.01); G06F 9/50 (2006.01)

CPC G06T 1/20 (2013.01) [G06F 9/5011 (2013.01); Y02D 10/00 (2018.01)]

20 Claims

1. A general-purpose graphics processing unit comprising:

a processing array including multiple compute blocks, each compute block including multiple graphics compute units; and

thread dispatch circuitry configured to dispatch threads of a two-dimensional (2D) thread group based on data access locality associated with the threads, the threads of the 2D thread group associated with memory addresses within a region of memory that includes a first 2D tile of memory and a second 2D tile of memory, the thread dispatch circuitry configured to:

dispatch a first 2D sub-group of the 2D thread group to a compute block of the multiple compute blocks, the first 2D sub-group associated with the first 2D tile of memory, wherein the first 2D tile of memory is associated with a first region of a render target; and

dispatch a second 2D sub-group of the 2D thread group to the compute block of the multiple compute blocks, the second 2D sub-group associated with the second 2D tile of memory, wherein the second 2D tile of memory is associated with a second region of the render target.
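
The dispatch behavior recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuitry: the function names (`tile_of`, `split_into_subgroups`, `dispatch`), the 4x4 tile size, the 8x4 thread group, and the one-thread-per-render-target-pixel mapping are all hypothetical choices made for illustration. It partitions a 2D thread group into 2D sub-groups according to which 2D tile of memory each thread touches, then sends every sub-group to the same compute block.

```python
from collections import defaultdict


def tile_of(x, y, tile_w, tile_h):
    """Return the (column, row) index of the 2D tile containing pixel (x, y)."""
    return (x // tile_w, y // tile_h)


def split_into_subgroups(group_origin, group_size, tile_w, tile_h):
    """Partition a 2D thread group into 2D sub-groups, one per memory tile.

    Assumption: each thread maps to exactly one render-target pixel, so a
    thread's data access locality is determined by its pixel coordinates.
    """
    ox, oy = group_origin
    gw, gh = group_size
    subgroups = defaultdict(list)
    for ty in range(gh):
        for tx in range(gw):
            x, y = ox + tx, oy + ty
            subgroups[tile_of(x, y, tile_w, tile_h)].append((x, y))
    return dict(subgroups)


def dispatch(subgroups, compute_block):
    """Assign every sub-group of the thread group to the same compute block."""
    return [(compute_block, tile, threads)
            for tile, threads in sorted(subgroups.items())]


# Hypothetical example: an 8x4 thread group over 4x4 tiles spans two tiles,
# giving a first and a second 2D sub-group dispatched to compute block 0.
subgroups = split_into_subgroups(group_origin=(0, 0), group_size=(8, 4),
                                 tile_w=4, tile_h=4)
plan = dispatch(subgroups, compute_block=0)
```

In this model the two sub-groups correspond to the first and second 2D tiles of memory, each covering a different region of the render target, and both land on one compute block so that threads with neighboring memory accesses execute together.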