| CPC G06F 9/4856 (2013.01) | 9 Claims |

|
1. A manycore system comprising:
a plurality of clusters; and
a device memory configured to
receive an offloading instruction from a host device, wherein
the offloading instruction is generated by an offloading library of the host device,
the offloading instruction includes information on a number of clusters which are selected to be used for offloading, among the plurality of clusters, based on estimation of a job load for each cluster through an existing task allocation state and a waiting queue state of each cluster,
the offloading instruction includes information on allocation of a number of memories of the manycore system for offloading, and
the offloading instruction does not specify allocation of cores and threads of the plurality of clusters;
store data associated with a job requested to be offloaded from the host device,
wherein each of the plurality of clusters includes:
a program memory configured to store a program associated with the job requested to be offloaded;
a plurality of cores configured to execute one or more threads associated with the job; and
a management module configured to receive a plurality of tasks included in the job; and
allocate one or more tasks among the received plurality of tasks to the plurality of cores based on job loads of all core and thread areas in the cluster, and control thread execution corresponding to the one or more allocations,
each of the plurality of cores includes a plurality of thread areas configured to independently store and track an execution state of each of the one or more threads executed on the core, and
each of the one or more threads executed on the core is independently executed using a separate thread area,
wherein each of the plurality of thread areas includes:
a program counter configured to store information on an address of an instruction that each thread is executing; and
a register file configured to store an intermediate value of the operation that each thread is executing.
|
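The division of responsibility recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: every name in it (`ThreadArea`, `Core`, `Cluster`, `OffloadingInstruction`, `select_clusters`), the eight-entry register file, the load estimate (allocated tasks plus queued tasks), and the least-loaded-core placement policy are assumptions chosen only for illustration. The sketch shows an offloading instruction that carries cluster and memory counts but no core or thread assignment, and a per-cluster management step that places tasks into free thread areas, each holding its own program counter and register file.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ThreadArea:
    """Per-thread state: a program counter and a register file."""
    program_counter: int = 0
    register_file: List[int] = field(default_factory=lambda: [0] * 8)  # size is an assumption
    task: Optional[str] = None  # None means this thread area is free


@dataclass
class Core:
    thread_areas: List[ThreadArea]

    def load(self) -> int:
        # Number of thread areas currently occupied by a task.
        return sum(1 for area in self.thread_areas if area.task is not None)


@dataclass
class Cluster:
    cores: List[Core]
    waiting_queue: List[str] = field(default_factory=list)

    def estimated_load(self) -> int:
        # Hypothetical estimate: tasks already allocated plus tasks waiting.
        return sum(core.load() for core in self.cores) + len(self.waiting_queue)

    def allocate(self, tasks: List[str]) -> None:
        """Management-module role: decide core and thread placement locally."""
        for task in tasks:
            candidates = [c for c in self.cores if c.load() < len(c.thread_areas)]
            if not candidates:
                self.waiting_queue.append(task)  # no free thread area, so queue it
                continue
            core = min(candidates, key=Core.load)  # assumed policy: least-loaded core
            area = next(a for a in core.thread_areas if a.task is None)
            area.task = task
            area.program_counter = 0  # each thread starts with its own counter


@dataclass(frozen=True)
class OffloadingInstruction:
    """Carries only cluster and memory counts; no core or thread assignment."""
    num_clusters: int
    num_memories: int


def select_clusters(clusters: List[Cluster], instruction: OffloadingInstruction) -> List[int]:
    """Return the indices of the least-loaded clusters, as many as the instruction requests."""
    order = sorted(range(len(clusters)), key=lambda i: clusters[i].estimated_load())
    return order[: instruction.num_clusters]


def make_cluster(num_cores: int = 2, areas_per_core: int = 2) -> Cluster:
    """Build an idle cluster with the given (illustrative) dimensions."""
    return Cluster([Core([ThreadArea() for _ in range(areas_per_core)])
                    for _ in range(num_cores)])
```

In this sketch the host side would call `select_clusters` with an `OffloadingInstruction`, and each selected cluster would then call `allocate` on the tasks it receives. Keeping core and thread placement inside `Cluster.allocate` mirrors the claim's statement that the offloading instruction does not specify allocation of cores and threads.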