CPC G06F 8/41 (2013.01) [G06F 8/453 (2013.01); G06F 8/457 (2013.01)] | 16 Claims |
1. A method of orchestrating data movements of a program on a multi-execution unit computing apparatus, the method comprising:
receiving in memory on a first computing apparatus, a computer program comprising a set of operations and at least one loop nest, the first computing apparatus comprising the memory and a processor;
transforming the computer program for execution on a second computing apparatus, the second computing apparatus comprising at least one main memory, at least one local memory, and at least one computation unit, each computation unit comprising at least one private memory region, the transformation comprising:
producing a tiled variant of the computer program;
generating operations to perform data movements for elements produced and consumed by tiles between the at least one main memory and the at least one local memory;
optimizing the operations to perform the data movements to reduce communication cost and memory traffic by eliminating redundant transfers based on placement functions and dependence information of the operations within the tiles; and
producing an optimized computer program for execution on the second computing apparatus,
wherein the redundant transfers elimination includes: a value stored in a local memory location addressable by at least two processing elements in the at least one computation unit is reused to replace one of the redundant transfers of the value from the at least one main memory to the at least one local memory.
|
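The redundant-transfer elimination recited in claim 1 can be illustrated with a toy model. The sketch below is not the claimed method: it assumes a 1-D stencil whose tiles consume overlapping input elements, an unbounded local memory shared by all tiles, and a runtime set lookup standing in for the compile-time analysis based on placement functions and dependence information. The names `tile_footprint`, `count_transfers`, `tile_size`, and `halo` are hypothetical and chosen only for illustration.

```python
def tile_footprint(tile, tile_size, halo, n):
    """Indices of the input elements consumed by one tile of a 1-D stencil.

    Assumption: each tile reads its own elements plus `halo` neighbors
    on each side, clipped to the array bounds [0, n).
    """
    lo = max(0, tile * tile_size - halo)
    hi = min(n, (tile + 1) * tile_size + halo)
    return range(lo, hi)


def count_transfers(n, tile_size, halo, reuse):
    """Count element transfers from main memory to local memory.

    With reuse=False every tile copies its whole footprint.
    With reuse=True a value already resident in the shared local
    memory is reused, so the repeated transfer is skipped.
    """
    local = set()  # models values resident in the shared local memory
    transfers = 0
    num_tiles = (n + tile_size - 1) // tile_size  # ceiling division
    for tile in range(num_tiles):
        for i in tile_footprint(tile, tile_size, halo, n):
            if reuse and i in local:
                continue  # redundant transfer eliminated
            local.add(i)
            transfers += 1
    return transfers


if __name__ == "__main__":
    naive = count_transfers(16, 4, 1, reuse=False)
    optimized = count_transfers(16, 4, 1, reuse=True)
    print(naive, optimized)  # prints "22 16"
```

With 16 elements, tiles of 4, and a halo of 1, adjacent tiles share two elements at each of the three tile boundaries, so the naive schedule performs 22 transfers and the reuse schedule performs 16. A real local memory is bounded, so an actual implementation would also have to decide when resident values are evicted; that is outside this sketch.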