| CPC G06F 8/4452 (2013.01) [G06F 8/40 (2013.01); G06F 8/452 (2013.01); G06F 9/52 (2013.01); G06F 15/17381 (2013.01); G06F 30/30 (2020.01); G06F 30/323 (2020.01); G06F 30/392 (2020.01); G06F 2115/10 (2020.01)] | 2 Claims |

|
1. A hardware system automatically compiled, by a compiler, from a single-threaded software program;
where the single-threaded software program includes a recursive function f, which is defined to be a function that calls itself;
where the recursive function f is implemented within the hardware system by a customized hardware unit comprising a finite state machine;
where a plurality of copies of the customized hardware unit are interconnected by a task network, so that each customized hardware unit on the task network is able to send a task for implementing the recursive function f to any other customized hardware unit on the task network, or receive a task for implementing the recursive function f from any other customized hardware unit on the task network;
where the recursive function f calling itself is implemented within the customized hardware unit as follows:
(i) if the task network is accepting tasks at a point of call, sending a task for implementing the recursive function f over the task network toward another customized hardware unit on the task network; and
(ii) otherwise, if the task network is not accepting tasks at the point of call, performing the recursive function f locally inside the customized hardware unit with an ordinary recursive call, without sending any task to the task network at the point of call;
for purposes of avoiding a deadlock resulting from all customized hardware units on the task network having to wait after the task network gets filled with tasks, ensuring forward progress, and achieving high task parallelism in addition to fine-grain parallelism;
where the hardware system is functionally equivalent to the single-threaded software program that the hardware system is compiled from; and
where design of the hardware system is partitioned and further comprises:
a. a plurality of finite-state machines and a plurality of memory units;
b. networks connected to the plurality of finite-state machines and the plurality of memory units for managing task invocations in hardware and memory data transfers in hardware;
c. a hardware synchronization unit to ensure executing memory instructions in a same order as within the single-threaded software program; and
d. a coherent memory hierarchy which signals completion of memory instructions for supporting the hardware synchronization unit.
|
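
The call rule in steps (i) and (ii) of claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: `TaskNetwork`, `try_send`, `run`, and the binary-tree workload are hypothetical names chosen for illustration, the task network is modeled as a bounded queue, and the other customized hardware units are modeled by draining that queue sequentially. At each point of call, the model sends a task if the network accepts it and otherwise makes an ordinary recursive call, so a full network never forces a unit to wait.

```python
from collections import deque


class TaskNetwork:
    """Hypothetical stand-in for the task network: a bounded FIFO of tasks."""

    def __init__(self, capacity):
        self.capacity = capacity  # finite buffering, so the network can fill up
        self.queue = deque()
        self.sent = 0

    def try_send(self, task):
        """Return True if the network accepted the task, False if it is full."""
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(task)
        self.sent += 1
        return True


def run(depth, capacity):
    """Visit a binary call tree of the given depth and count its leaves.

    Returns (leaves, tasks_sent, local_calls).
    """
    network = TaskNetwork(capacity)
    stats = {"leaves": 0, "local_calls": 0}

    def f(d):
        if d == 0:
            stats["leaves"] += 1
            return
        for _ in range(2):
            # Point of call: (i) send a task if the network is accepting tasks,
            # (ii) otherwise perform the call locally by ordinary recursion.
            if not network.try_send(d - 1):
                stats["local_calls"] += 1
                f(d - 1)

    f(depth)
    # Assumption: draining the queue here stands in for the other units
    # picking up tasks from the network.
    while network.queue:
        f(network.queue.popleft())
    return stats["leaves"], network.sent, stats["local_calls"]


if __name__ == "__main__":
    print(run(5, 0))  # network never accepts: every call is local
    print(run(5, 4))  # mix of sent tasks and local calls
```

In this model the leaf count is the same for any queue capacity, which mirrors the functional-equivalence requirement: each call is either queued and executed later or executed locally, and none is dropped. If the fallback in step (ii) were replaced by a blocking send, this single-threaded model would hang as soon as the queue filled, which is the deadlock scenario the claim describes.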