US 11,768,686 B2
Out of order memory request tracking structure and technique
Michael A Fetterman, Lancaster, MA (US); Mark Gebhart, Round Rock, TX (US); Shirish Gadre, Fremont, CA (US); Mitchell Hayenga, Sunnyvale, CA (US); Steven Heinrich, Madison, AL (US); Ramesh Jandhyala, Austin, TX (US); Raghavan Madhavan, Cary, NC (US); Omkar Paranjape, Austin, TX (US); James Robertson, Austin, TX (US); and Jeff Schottmiller, Raleigh, NC (US)
Assigned to NVIDIA Corporation, Santa Clara, CA (US)
Filed by NVIDIA Corporation, Santa Clara, CA (US)
Filed on Jul. 27, 2020, as Appl. No. 16/940,363.
Prior Publication US 2022/0027160 A1, Jan. 27, 2022
Int. Cl. G06F 9/38 (2018.01); G06F 9/30 (2018.01); G06F 12/084 (2016.01); G06F 12/0873 (2016.01); G06F 9/54 (2006.01); G06F 12/0842 (2016.01); G06F 12/0846 (2016.01); G06F 5/06 (2006.01)
CPC G06F 9/3836 (2013.01) [G06F 5/065 (2013.01); G06F 9/30047 (2013.01); G06F 9/3867 (2013.01); G06F 9/546 (2013.01); G06F 12/084 (2013.01); G06F 12/0842 (2013.01); G06F 12/0846 (2013.01); G06F 12/0873 (2013.01); G06F 2212/1021 (2013.01)]
36 Claims
 
1. A memory request tracking circuit for use with a streaming cache memory configured to receive memory requests for data in a memory system and return memory system data in response to the received memory requests, the memory request tracking circuit comprising:
a tag check configured to detect misses of the streaming cache memory;
plural tracking queues each configured to maintain miss traffic in first-in-first-out order; and
a queue mapper coupled to the tag check and the plural tracking queues, the queue mapper being configured to provide plural memory request tracking information entries for miss traffic to the plural tracking queues to enable in-order and out-of-order memory request returns, the queue mapper being configured to distribute a first subset of the plural memory request tracking information entries to the same tracking queue to enable in-order memory request returns for the first subset and to distribute a second subset of the plural memory request tracking information entries across plural tracking queues to enable out-of-order memory request returns for the second subset,
wherein the queue mapper is configured to distribute the out-of-order memory request returns across plural tracking queues to reduce the chance that any individual long-latency access will block a number of other accesses, thereby enabling a consuming ray tracer to make forward progress when individual long-latency accesses occur.
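
The queue-mapping behavior recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class name `MissTracker`, the `order_group` parameter, the modulo and round-robin mapping policies, and the `ready` set are all hypothetical and chosen for illustration. The claim itself does not specify how the queue mapper selects a queue. Entries that must return in order share one FIFO tracking queue. Entries that may return out of order are spread across the queues, so one long-latency miss stalls only the queue it sits in.

```python
from collections import deque
from itertools import count


class MissTracker:
    """Toy model of plural FIFO tracking queues fed by a queue mapper."""

    def __init__(self, num_queues=4):
        self.queues = [deque() for _ in range(num_queues)]
        self._round_robin = count()  # hypothetical policy for unordered misses
        self.ready = set()           # requests whose data has come back

    def track_miss(self, req_id, order_group=None):
        """Record a cache miss and return the index of the queue chosen."""
        if order_group is not None:
            # Same group -> same queue, so FIFO order gives in-order returns.
            q = order_group % len(self.queues)
        else:
            # Unordered misses are spread across the queues.
            q = next(self._round_robin) % len(self.queues)
        self.queues[q].append(req_id)
        return q

    def data_arrived(self, req_id):
        """Mark that the memory system has returned data for req_id."""
        self.ready.add(req_id)

    def drain(self):
        """Return every request that is ready and at the head of its queue."""
        returned = []
        progress = True
        while progress:
            progress = False
            for q in self.queues:
                if q and q[0] in self.ready:
                    req = q.popleft()
                    self.ready.discard(req)
                    returned.append(req)
                    progress = True
        return returned
```

In this model, if the miss at the head of queue 0 is slow, `drain()` still returns ready entries from queues 1 to 3, which is the forward-progress property the final wherein clause describes. Two misses tracked with the same `order_group` always come back in the order they were tracked, even if the data for the second arrives first.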