US 11,734,179 B2
Efficient work unit processing in a multicore system
Wael Noureddine, Santa Clara, CA (US); Jean-Marc Frailong, Rancho Mirage, CA (US); Felix A. Marti, San Francisco, CA (US); Charles Edward Gray, San Francisco, CA (US); and Paul Kim, Fremont, CA (US)
Assigned to Fungible, Inc., Santa Clara, CA (US)
Filed by Fungible, Inc., Santa Clara, CA (US)
Filed on Jun. 28, 2021, as Appl. No. 17/360,619.
Application 17/360,619 is a continuation of application No. 16/746,344, filed on Jan. 17, 2020, granted, now 11,048,634.
Application 16/746,344 is a continuation of application No. 15/949,692, filed on Apr. 10, 2018, granted, now 10,540,288, issued on Jan. 21, 2020.
Claims priority of provisional application 62/625,518, filed on Feb. 2, 2018.
Prior Publication US 2021/0349824 A1, Nov. 11, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 12/08 (2016.01); G06F 12/0862 (2016.01); G06F 12/0891 (2016.01); G06F 12/0804 (2016.01); G06F 12/0855 (2016.01)
CPC G06F 12/0862 (2013.01) [G06F 12/0804 (2013.01); G06F 12/0855 (2013.01); G06F 12/0891 (2013.01); G06F 2212/154 (2013.01); G06F 2212/6028 (2013.01); G06F 2212/62 (2013.01)]
20 Claims
1. A system comprising:
processing circuitry having a cache, wherein the processing circuitry is configured to process a first stream fragment and generate first stream data in a first cache segment in the cache;
a buffer to store data; and
a load store unit configured to:
determine that a second stream fragment is expected to be processed by the processing circuitry after the first stream fragment,
prefetch data associated with the second stream fragment into a second segment of the cache, wherein at least some of the prefetching occurs before the processing circuitry finishes processing the first stream fragment, and
flush the first cache segment of the cache after the processing circuitry finishes processing the first stream fragment, wherein flushing the first cache segment includes storing the first stream data in the buffer.
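
The ordering recited in claim 1 (prefetch for the next fragment overlapping with processing of the current one, and a flush to the buffer only after processing finishes) can be illustrated with a small software model. This is a minimal sketch, not the patented hardware: the class `LoadStoreUnit`, the function `run`, the use of two alternating segments, and the uppercase "processing" step are all hypothetical choices made for illustration and do not appear in the patent text.

```python
from dataclasses import dataclass, field


@dataclass
class LoadStoreUnit:
    """Toy model of the claimed load store unit (all names are hypothetical)."""
    cache: dict = field(default_factory=dict)   # segment id -> data held in that segment
    buffer: list = field(default_factory=list)  # stands in for the claimed "buffer to store data"
    log: list = field(default_factory=list)     # ordered trace of events

    def prefetch(self, segment, fragment):
        # Bring a fragment's data into its own cache segment.
        self.cache[segment] = fragment
        self.log.append(("prefetch", segment))

    def flush(self, segment):
        # Store the segment's stream data in the buffer and free the segment.
        self.buffer.append(self.cache.pop(segment))
        self.log.append(("flush", segment))


def run(fragments):
    """Process fragments in order, alternating between two cache segments."""
    lsu = LoadStoreUnit()
    if not fragments:
        return lsu
    lsu.prefetch(0, fragments[0])  # initial fill before any processing starts
    for i in range(len(fragments)):
        seg = i % 2
        lsu.log.append(("start", seg))
        if i + 1 < len(fragments):
            # The next fragment is expected, so prefetch it into the other
            # segment before the current fragment has finished processing.
            lsu.prefetch(1 - seg, fragments[i + 1])
        # Stand-in for real processing: generate stream data in the segment.
        lsu.cache[seg] = lsu.cache[seg].upper()
        lsu.log.append(("finish", seg))
        lsu.flush(seg)  # flush only after processing has finished
    return lsu
```

Calling `run(["ab", "cd", "ef"])` leaves the processed fragments in `buffer` in their original order, and the `log` trace shows the prefetch into segment 1 recorded before the finish and flush of segment 0. The model is sequential, so it captures only the ordering of events, not true hardware concurrency.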