US 12,487,746 B2
Speculative remote memory operation tracking for efficient memory barrier
Raymond Hoi Man Wong, Palo Alto, CA (US); Debajit Bhattacharya, San Jose, CA (US); Michael Allen Parker, San Jose, CA (US); and Wishwesh Anil Gandhi, Sunnyvale, CA (US)
Assigned to NVIDIA CORPORATION, Santa Clara, CA (US)
Filed by NVIDIA CORPORATION, Santa Clara, CA (US)
Filed on Nov. 17, 2022, as Appl. No. 17/989,129.
Claims priority of provisional application 63/330,723, filed on Apr. 13, 2022.
Prior Publication US 2023/0333746 A1, Oct. 19, 2023
Int. Cl. G06F 3/06 (2006.01)
CPC G06F 3/0613 (2013.01) [G06F 3/0659 (2013.01); G06F 3/067 (2013.01)] 20 Claims
 
1. A computer-implemented method for performing memory operation tracking in a multiprocessor computing system, the method comprising:
    receiving a first plurality of acknowledgements responsive to a first plurality of memory operations generated subsequent to a first memory synchronization operation and prior to a second memory synchronization operation by a first requesting source included in a plurality of requesting sources, wherein the first memory synchronization operation and each memory operation included in the first plurality of memory operations are identified by a first group identifier;
    coalescing the first plurality of acknowledgements with a second plurality of acknowledgments responsive to a second plurality of memory operations generated subsequent to the first memory synchronization operation and prior to the second memory synchronization operation by the first requesting source into a coalesced acknowledgement, wherein each memory operation included in the second plurality of memory operations is identified by the first group identifier;
    subsequent to coalescing the first plurality of acknowledgements with the second plurality of acknowledgments, receiving the second memory synchronization operation corresponding to at least one of the first plurality of memory operations or the second plurality of memory operations;
    determining that a sum of the memory operations included in the first plurality of memory operations and the memory operations included in the second plurality of memory operations exceeds a threshold number for the first requesting source and the first group identifier; and
    responsive to determining that the sum of the memory operations included in the first plurality of memory operations and the memory operations included in the second plurality of memory operations exceeds the threshold number for the first requesting source and the first group identifier, transmitting the coalesced acknowledgement to the first requesting source.
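
The steps recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names `AckTracker`, `CoalescedAck`, `on_acks`, and `on_sync` are hypothetical, the threshold is taken to be a single fixed integer, and the tracker is assumed to reset its count after transmitting. It shows acknowledgements being coalesced per (requesting source, group identifier) pair and a coalesced acknowledgement being sent only when a later synchronization operation arrives and the coalesced count exceeds the threshold.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class CoalescedAck:
    """One acknowledgement standing in for `count` individual acknowledgements."""
    source: int
    group_id: int
    count: int


class AckTracker:
    """Hypothetical model: coalesces acknowledgements per (source, group_id)."""

    def __init__(self, threshold: int) -> None:
        # Assumption: one fixed threshold shared by every (source, group) pair.
        self.threshold = threshold
        self.pending: Dict[Tuple[int, int], int] = {}
        self.sent: List[CoalescedAck] = []

    def on_acks(self, source: int, group_id: int, n: int) -> None:
        """Fold n newly received acknowledgements into the coalesced count."""
        key = (source, group_id)
        self.pending[key] = self.pending.get(key, 0) + n

    def on_sync(self, source: int, group_id: int) -> Optional[CoalescedAck]:
        """Handle a later synchronization operation for this source and group.

        Transmits a coalesced acknowledgement only if the coalesced count
        exceeds the threshold; otherwise nothing is sent.
        """
        key = (source, group_id)
        count = self.pending.get(key, 0)
        if count > self.threshold:
            ack = CoalescedAck(source, group_id, count)
            self.sent.append(ack)   # stands in for transmitting to the source
            self.pending[key] = 0   # assumption: tracking restarts after sending
            return ack
        return None


# Usage: two batches of acknowledgements for source 0, group 1, then a barrier.
tracker = AckTracker(threshold=5)
tracker.on_acks(source=0, group_id=1, n=3)   # first plurality
tracker.on_acks(source=0, group_id=1, n=4)   # second plurality, coalesced
result = tracker.on_sync(source=0, group_id=1)
other = tracker.on_sync(source=1, group_id=1)  # no acknowledgements tracked
```

In this model, seven coalesced acknowledgements exceed the assumed threshold of five, so the synchronization operation for source 0 triggers a single transmission, while source 1 receives nothing because no acknowledgements were tracked for it.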