US 12,067,641 B2
Page faulting and selective preemption
Altug Koker, El Dorado Hills, CA (US); Ingo Wald, Salt Lake City, UT (US); David Puffer, Tempe, AZ (US); Subramaniam M. Maiyuran, Gold River, CA (US); Prasoonkumar Surti, Folsom, CA (US); Balaji Vembu, Folsom, CA (US); Guei-Yuan Lueh, San Jose, CA (US); Murali Ramadoss, Folsom, CA (US); Abhishek R. Appu, El Dorado Hills, CA (US); and Joydeep Ray, Folsom, CA (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on May 20, 2022, as Appl. No. 17/749,266.
Application 17/749,266 is a continuation of application No. 16/924,895, filed on Jul. 9, 2020, granted, now 11,354,769.
Application 16/924,895 is a division of application No. 16/293,044, filed on Mar. 5, 2019, granted, now 10,726,517, issued on Jul. 28, 2020.
Application 16/293,044 is a division of application No. 15/482,808, filed on Apr. 9, 2017, granted, now 10,282,812, issued on May 7, 2019.
Prior Publication US 2022/0351325 A1, Nov. 3, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06T 1/20 (2006.01); G06F 9/30 (2018.01); G06F 9/38 (2018.01); G06F 9/46 (2006.01); G06F 9/48 (2006.01)
CPC G06T 1/20 (2013.01) [G06F 9/3009 (2013.01); G06F 9/30185 (2013.01); G06F 9/3851 (2013.01); G06F 9/461 (2013.01); G06F 9/4843 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A general-purpose graphics processing unit (GPGPU) comprising:
a host interface;
a memory interface;
a processing array coupled with the host interface and the memory interface, the processing array including multiple processing clusters to perform parallel operations, the processing array configured to address memory accessed through the memory interface via a virtual address mapping and including circuitry to resolve a page fault for the virtual address mapping, wherein each of the multiple processing clusters is separately preemptable and is associated with a dedicated region of context save memory; and
a scheduler to schedule a workload to the multiple processing clusters, the scheduler configured to track an average latency to resolve a page fault, enable page fault preemption for a first context in response to a determination that the average latency to resolve a page fault for the first context is above a high-watermark threshold, and disable page fault preemption for the first context in response to a determination that the average latency to resolve a page fault for the first context is below a low-watermark threshold that is different from the high-watermark threshold, wherein to preempt a processing cluster includes to halt execution at an instruction boundary of a first plurality of threads of a first context during execution of the first plurality of threads, save context state associated with the first plurality of threads to the dedicated region of context save memory, and replace the first plurality of threads of the first context with a second plurality of threads of a second context.
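The scheduler recited in claim 1 toggles page-fault preemption per context using two distinct watermarks on the tracked average fault-resolution latency, which gives hysteresis: preemption turns on only when faults are resolving slowly, and turns off only once they resolve quickly again. The following is a minimal illustrative sketch of that policy; the class, method names, and the exponential-moving-average update are assumptions for illustration (the claim requires only that "an average latency" be tracked and compared against the two thresholds), not the patented implementation.

```python
class ContextStats:
    """Per-context tracking state (illustrative only)."""
    def __init__(self):
        self.avg_latency_us = 0.0  # running average of page-fault resolution latency


class FaultPreemptionScheduler:
    """Sketch of watermark-hysteresis control of page-fault preemption.

    high/low watermarks correspond to the claim's high-watermark and
    low-watermark thresholds; alpha is an assumed smoothing factor.
    """
    def __init__(self, high_watermark_us, low_watermark_us, alpha=0.125):
        assert low_watermark_us < high_watermark_us  # thresholds must differ
        self.high = high_watermark_us
        self.low = low_watermark_us
        self.alpha = alpha
        self.stats = {}               # context id -> ContextStats
        self.preemption_enabled = {}  # context id -> bool

    def record_fault_latency(self, ctx, latency_us):
        """Fold one resolved fault's latency into the average and apply hysteresis."""
        s = self.stats.setdefault(ctx, ContextStats())
        # Exponential moving average of resolution latency (assumed policy).
        s.avg_latency_us += self.alpha * (latency_us - s.avg_latency_us)

        enabled = self.preemption_enabled.get(ctx, False)
        if not enabled and s.avg_latency_us > self.high:
            # Faults are slow: preempt the cluster while waiting on resolution.
            self.preemption_enabled[ctx] = True
        elif enabled and s.avg_latency_us < self.low:
            # Faults resolve quickly: keep the context's threads resident.
            self.preemption_enabled[ctx] = False
        return self.preemption_enabled.get(ctx, False)
```

Because the enable threshold sits above the disable threshold, a context whose fault latency hovers between the two watermarks keeps its current preemption setting rather than thrashing between states, which is the point of using two different thresholds in the claim.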