CPC G06F 15/167 (2013.01) | 20 Claims |
1. A computer-implemented method comprising:
executing an append operation using a plurality of threads on a plurality of streaming multiprocessors of a graphical processing unit, wherein:
each streaming multiprocessor within the plurality of streaming multiprocessors has an associated shared memory;
each shared memory is partitioned into a plurality of write combine buffers (WCBs);
a global memory is accessible by the plurality of streaming multiprocessors;
the append operation writes results into a result buffer in the global memory;
executing the append operation comprises:
claiming, by a given thread within the plurality of threads having a result to write, a portion of a selected WCB in shared memory;
writing, by the given thread, the result to the portion of the selected WCB; and
in response to a flush condition being met for the selected WCB, copying contents of the selected WCB to the result buffer in global memory.
|
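The claiming, writing, and flushing steps recited in claim 1 can be illustrated with a CPU-side sketch in C++. This is not the patented GPU implementation: `std::thread` workers stand in for GPU threads, a flat array of `Wcb` structs stands in for the shared memory of each streaming multiprocessor, and `ResultBuffer` stands in for the result buffer in global memory. All names, the sizes `kWcbCapacity` and `kWcbsPerSm`, the "buffer full" flush condition, the final drain, and the multiples-of-3 predicate are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical sizes, chosen for illustration only.
constexpr int kWcbCapacity = 8;  // result slots per write combine buffer
constexpr int kWcbsPerSm = 2;    // WCBs carved out of one SM's "shared memory"

// One write combine buffer; stands in for a partition of shared memory.
struct Wcb {
    int slots[kWcbCapacity];
    std::atomic<int> claimed{0};  // slots handed out so far (may overshoot)
    std::atomic<int> written{0};  // slots whose result has been stored
};

// Stands in for the result buffer in global memory.
struct ResultBuffer {
    std::vector<int> data;
    std::atomic<int> count{0};
    explicit ResultBuffer(int capacity) : data(capacity) {}
};

// Copy the first n results of a WCB to the result buffer, then reset the WCB.
void flush(Wcb& wcb, int n, ResultBuffer& out) {
    int base = out.count.fetch_add(n);  // reserve a contiguous output range
    assert(base + n <= static_cast<int>(out.data.size()));
    std::copy(wcb.slots, wcb.slots + n, out.data.begin() + base);
    wcb.written.store(0);
    wcb.claimed.store(0);  // reopens the WCB, so this store must come last
}

// Append one result through the selected WCB.
void append(Wcb& wcb, int result, ResultBuffer& out) {
    for (;;) {
        int slot = wcb.claimed.fetch_add(1);  // claim a portion of the WCB
        if (slot < kWcbCapacity) {
            wcb.slots[slot] = result;  // write the result to the claimed portion
            // Assumed flush condition: every slot of the WCB has been written.
            if (wcb.written.fetch_add(1) + 1 == kWcbCapacity)
                flush(wcb, kWcbCapacity, out);
            return;
        }
        // WCB is full: wait until the flushing thread reopens it, then retry.
        while (wcb.claimed.load() >= kWcbCapacity) std::this_thread::yield();
    }
}

// Simulate num_sms multiprocessors, each running threads_per_sm threads.
// Returns the appended results, sorted for easy inspection.
std::vector<int> run_append(int num_sms, int threads_per_sm, int items_per_thread) {
    ResultBuffer out(num_sms * threads_per_sm * items_per_thread);
    std::vector<Wcb> wcbs(num_sms * kWcbsPerSm);
    std::vector<std::thread> threads;
    for (int sm = 0; sm < num_sms; ++sm) {
        for (int t = 0; t < threads_per_sm; ++t) {
            threads.emplace_back([&, sm, t] {
                // Select a WCB within this SM's partitioned shared memory.
                Wcb& wcb = wcbs[sm * kWcbsPerSm + t % kWcbsPerSm];
                int first = (sm * threads_per_sm + t) * items_per_thread;
                for (int x = first; x < first + items_per_thread; ++x)
                    if (x % 3 == 0) append(wcb, x, out);  // only some items yield a result
            });
        }
    }
    for (auto& th : threads) th.join();
    // Second assumed flush condition: no work remains, so drain partial WCBs.
    for (auto& wcb : wcbs) flush(wcb, wcb.written.load(), out);
    out.data.resize(out.count.load());
    std::sort(out.data.begin(), out.data.end());
    return out.data;
}
```

Calling `run_append(2, 4, 100)` simulates two streaming multiprocessors with four threads each, every thread scanning 100 integers and appending the multiples of 3. In this sketch each flush reserves its range of the result buffer with a single atomic add, so the global counter is touched once per flush rather than once per result. On a real GPU the waiting and flushing would more likely be coordinated with block-level or warp-level synchronization than with a spin loop.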